Browse DORAS
Browse Theses
Search
Latest Additions
Creative Commons License
Except where otherwise noted, content on this site is licensed for use under a:

Treebank annotation schemes and parser evaluation for German

Rehbein, Ines and van Genabith, Josef (2007) Treebank annotation schemes and parser evaluation for German. In: EMNLP-CoNLL 2007 - Joint Meeting of the Conference on Empirical Methods in Natural Language Processing and the Conference on Computational Natural Language Learning, 28-30 June 2007, Prague, Czech Republic.

Full text available as:

[img]
Preview
PDF - Requires a PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader
396Kb

Abstract

Recent studies focussed on the question whether less-congurational languages like German are harder to parse than English, or whether the lower parsing scores are an artefact of treebank encoding schemes and data structures, as claimed by K¨ubler et al. (2006). This claim is based on the assumption that PARSEVAL metrics fully reflect parse quality across treebank encoding schemes. In this paper we present new experiments to test this claim. We use the PARSEVAL metric, the Leaf-Ancestor metric as well as a dependency-based evaluation, and present novel approaches measuring the effect of controlled error insertion on treebank trees and parser output. We also provide extensive past-parsing crosstreebank conversion. The results of the experiments show that, contrary to K¨ubler et al. (2006), the question whether or not German is harder to parse than English remains undecided.

Item Type:Conference or Workshop Item (Paper)
Event Type:Conference
Refereed:Yes
Uncontrolled Keywords:parser evaluation;
Subjects:Computer Science > Machine translating
Humanities > German language
DCU Faculties and Centres:Research Initiatives and Centres > National Centre for Language Technology (NCLT)
DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing
Publisher:Association for Computational Linguistics
Official URL:http://www.aclweb.org/anthology/D/D07/
Copyright Information:© 2007 Association for Computational Linguistics
Use License:This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License. View License
Funders:Science Foundation Ireland, SFI 04/IN/I527
ID Code:15213
Deposited On:17 Feb 2010 16:49 by DORAS Administrator. Last Modified 27 Apr 2010 14:59

Download statistics

Archive Staff Only: edit this record