Adapting a WSJ-trained parser to grammatically noisy text
Foster, Jennifer and Wagner, Joachim and van Genabith, Josef (2008) Adapting a WSJ-trained parser to grammatically noisy text. In: ACL-08:HLT - 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, 15-20 June 2008, Columbus, USA.
We present a robust parser which is trained on a treebank of ungrammatical sentences. The treebank is created automatically by modifying Penn treebank sentences so that they contain one or more syntactic errors. We evaluate an existing Penn-treebank-trained parser on the ungrammatical treebank to see how it reacts to noise in the form of grammatical errors. We re-train this parser on the training section of the ungrammatical treebank, leading to significantly improved performance on the ungrammatical test sets. We show how a classifier can be used to prevent performance degradation on the original grammatical data.
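
The idea of automatically introducing syntactic errors into well-formed sentences can be illustrated with a minimal sketch. The abstract does not say which error types the authors insert or how the trees are updated, so everything below is an assumption made for illustration: the function name `inject_error`, the three operations (delete a word, duplicate a word, swap adjacent words), and the choice to work on a flat token list rather than on treebank trees.

```python
import random


def inject_error(tokens, rng):
    """Return a copy of `tokens` with one illustrative error applied.

    Hypothetical helper: the three operations below are assumed error
    types, not the ones used in the paper.
    """
    if len(tokens) < 2:
        # Too short to corrupt meaningfully; return an unchanged copy.
        return list(tokens)
    noisy = list(tokens)  # never mutate the caller's list
    op = rng.choice(["delete", "duplicate", "swap"])
    i = rng.randrange(len(noisy) - 1)  # leaves room for position i + 1
    if op == "delete":
        del noisy[i]
    elif op == "duplicate":
        noisy.insert(i, noisy[i])
    else:
        noisy[i], noisy[i + 1] = noisy[i + 1], noisy[i]
    return noisy


sentence = "The parser was trained on newspaper text".split()
rng = random.Random(0)  # fixed seed so the corruption is reproducible
noisy_sentence = inject_error(sentence, rng)
print(" ".join(noisy_sentence))
```

Applying a function of this kind to every sentence of a grammatical corpus would yield a parallel noisy corpus. A real ungrammatical treebank would also need the gold trees adjusted to match the edited token sequence, which this sketch does not attempt.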