Minority language Twitter: part-of-speech tagging and analysis of Irish
Tweets
Lynn, Teresa, Scannell, Kevin and Maguire, Eimear (2015) Minority language Twitter: part-of-speech tagging and analysis of Irish Tweets. In: ACL 2015 Workshop on Noisy User-generated Text 2015 (W-NUT), 31 July 2015, Beijing, China.
Noisy user-generated text poses problems for natural language processing.
In this paper, we show that this statement also holds true for the Irish
language. Irish is regarded as a low-resourced language: the annotated
corpora available to NLP researchers and linguists are too limited to
support a full analysis of linguistic patterns in social media use. We
contribute to recent advances in this area of research by reporting on the
development of a part-of-speech annotation scheme and an annotated corpus for
Irish language tweets. We also report state-of-the-art tagging results from
training and testing three existing POS taggers on our new dataset.
This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
Funders:
Fulbright Commission of Ireland (Fulbright Enterprise-Ireland Award 2014-2015), Science Foundation Ireland through the CNGL Programme (Grant 12/CE/I2267) in the ADAPT Centre (www.adaptcentre.ie) at Dublin City University. The second author was partially supported by US NSF grant 1159174.
ID Code:
23604
Deposited On:
30 Jul 2019 09:31 by Thomas Murtagh. Last Modified 30 Jul 2019 09:31