Better training for function labeling

Chrupała, Grzegorz; Stroppa, Nicolas; van Genabith, Josef; Dinu, Georgiana

Chrupała, Grzegorz, Stroppa, Nicolas, van Genabith, Josef and Dinu, Georgiana (2007) Better training for function labeling. In: RANLP 2007 - Recent Advances in Natural Language Processing Conference, 27-29 September, 2007, Borovets, Bulgaria.

Abstract

Function labels enrich constituency parse tree nodes with information about their abstract syntactic and semantic roles. A common way to obtain function-labeled trees is a two-stage architecture: first a statistical parser produces the constituent structure, and then a second component, such as a classifier, adds the missing function tags. To achieve optimal results, training examples for machine-learning-based classifiers should be as similar as possible to the instances seen during prediction. However, the method used so far to obtain training examples for the function labeling classifier suffers from a serious drawback: the training examples come from perfect treebank trees, whereas test examples are derived from imperfect, parser-produced trees. We show that extracting training instances from the reparsed training part of the treebank yields better training material, as measured by similarity to test instances, and that this training method achieves statistically significantly higher f-scores on the function labeling task for the English Penn Treebank. Our method currently achieves a 91.47% f-score on section 23 of the WSJ, the highest score reported in the literature so far.
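
The training scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the node representation (a category label plus a token span), the function name `extract_instances`, the `NONE` placeholder, and the rule of matching parser nodes to gold nodes by identical label and span are all assumptions made for illustration. The sketch shows only the core idea, namely that training instances are read off parser output while their function tags are looked up in the gold treebank tree.

```python
from typing import Dict, List, Tuple

# Hypothetical node representation: (category label, start token, end token).
Node = Tuple[str, int, int]


def extract_instances(
    parsed_nodes: List[Node],
    gold_tags: Dict[Node, str],
    no_tag: str = "NONE",
) -> List[Tuple[Node, str]]:
    """Pair each parser-produced node with a function tag from the gold tree.

    Assumption: a parser node receives a gold tag only if the gold tree has a
    node with the same label and span; otherwise it gets `no_tag`.
    """
    return [(node, gold_tags.get(node, no_tag)) for node in parsed_nodes]


# Toy gold annotation for one training sentence (invented for illustration).
gold = {("NP", 0, 2): "SBJ", ("PP", 5, 8): "TMP"}

# Toy result of re-running the parser on the same training sentence.
# The PP span is wrong, as can happen with an imperfect parser.
reparsed = [("NP", 0, 2), ("VP", 2, 8), ("PP", 6, 8)]

for node, tag in extract_instances(reparsed, gold):
    print(node, tag)
```

In this toy case the mis-bracketed PP finds no gold counterpart and is left untagged, so a classifier trained on such instances would meet the same kind of noisy node at training time that it meets at prediction time.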

Metadata

Item Type:	Conference or Workshop Item (Paper)
Event Type:	Conference
Refereed:	Yes
Subjects:	Computer Science > Machine translating
DCU Faculties and Centres:	Research Institutes and Centres > National Centre for Language Technology (NCLT)
Official URL:	http://lml.bas.bg/ranlp2007/
Use License:	This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
Funders:	Science Foundation Ireland, SFI 04/IN/I527
ID Code:	15206
Deposited On:	17 Feb 2010 15:05 by DORAS Administrator. Last Modified 19 Jul 2018 14:50

Documents

Full text available as: PDF (203kB)

