DORAS | DCU Research Repository

#hardtoparse: POS tagging and parsing the twitterverse

Foster, Jennifer (ORCID: 0000-0002-7789-4853), Cetinoglu, Ozlem, Wagner, Joachim (ORCID: 0000-0002-8290-3849), Le Roux, Joseph, Hogan, Stephen, Nivre, Joakim, Hogan, Deirdre and van Genabith, Josef (ORCID: 0000-0003-1322-7944) (2011) #hardtoparse: POS tagging and parsing the twitterverse. In: The AAAI-11 Workshop on Analyzing Microtext, 8 Aug 2011, San Francisco, CA.

We evaluate the statistical dependency parser, Malt, on a new dataset of sentences taken from tweets. We use a version of Malt which is trained on gold standard phrase structure Wall Street Journal (WSJ) trees converted to Stanford labelled dependencies. We observe a drastic drop in performance moving from our in-domain WSJ test set to the new Twitter dataset, much of which has to do with the propagation of part-of-speech tagging errors. Retraining Malt on dependency trees produced by a state-of-the-art phrase structure parser, which has itself been self-trained on Twitter material, results in a significant improvement. We analyse this improvement by examining in detail the effect of the retraining on individual dependency types.
Item Type: Conference or Workshop Item (Paper)
Event Type: Workshop
Uncontrolled Keywords: Malt; Twitter
Subjects: Computer Science > Computational linguistics; Computer Science > Artificial intelligence
DCU Faculties and Centres: UNSPECIFIED
Copyright Information: © 2011 Association for the Advancement of Artificial Intelligence (www.aaai.org).
Use License: This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
Funders: Enterprise Ireland, Science Foundation Ireland, Centre for Next Generation Localisation, French Agence Nationale pour la Recherche
ID Code: 16484
Deposited On: 09 Aug 2011 11:55 by Joachim Wagner. Last Modified: 19 Jan 2022 12:49

Full text available as: PDF (final draft post-refereeing)

