Foster, Jennifer ORCID: 0000-0002-7789-4853, Cetinoglu, Ozlem, Wagner, Joachim ORCID: 0000-0002-8290-3849, Le Roux, Joseph, Hogan, Stephen, Nivre, Joakim, Hogan, Deirdre and van Genabith, Josef ORCID: 0000-0003-1322-7944 (2011) #hardtoparse: POS tagging and parsing the twitterverse. In: The AAAI-11 Workshop on Analyzing Microtext, 8 Aug 2011, San Francisco, CA.
Abstract
We evaluate the statistical dependency parser, Malt, on a new dataset of sentences taken from tweets. We use a version of Malt which is trained on gold standard phrase structure Wall Street Journal (WSJ) trees converted to Stanford labelled dependencies. We observe a drastic drop in performance moving from our in-domain WSJ test set to the new Twitter dataset, much of which has to do with the propagation of part-of-speech tagging errors. Retraining Malt on dependency trees produced by a state-of-the-art phrase structure parser, which has itself been self-trained on Twitter material, results in a significant improvement. We analyse this improvement by examining in detail the effect of the retraining on individual dependency types.
Metadata
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Event Type: | Workshop |
Refereed: | Yes |
Uncontrolled Keywords: | Malt; Twitter |
Subjects: | Computer Science > Computational linguistics; Computer Science > Artificial intelligence |
DCU Faculties and Centres: | UNSPECIFIED |
Copyright Information: | © 2011 Association for the Advancement of Artificial Intelligence (www.aaai.org). |
Use License: | This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License. |
Funders: | Enterprise Ireland, Science Foundation Ireland, Centre for Next Generation Localisation, French Agence Nationale pour la Recherche |
ID Code: | 16484 |
Deposited On: | 09 Aug 2011 11:55 by Joachim Wagner. Last Modified 19 Jan 2022 12:49 |
Documents
Full text available as:
PDF (final draft post-refereeing), 175kB