Adapting WSJ-trained parsers to the British National Corpus using in-domain self-training
Foster, Jennifer and Wagner, Joachim and Seddah, Djamé and van Genabith, Josef (2007) Adapting WSJ-trained parsers to the British National Corpus using in-domain self-training. In: IWPT 2007 - 10th International Conference on Parsing Technologies, 23-24 June 2007, Prague, Czech Republic.
We introduce a set of 1,000 gold standard parse trees for the British National Corpus (BNC) and perform a series of self-training experiments with Charniak and Johnson’s reranking parser and BNC sentences. We show that retraining this parser with a combination of one million BNC parse trees (produced by the same parser) and the original WSJ training data yields improvements of 0.4% on WSJ Section 23 and 1.7% on the new BNC gold standard set.
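
The self-training loop described in the abstract (parse in-domain sentences with the existing parser, add the resulting automatic parses to the original training data, retrain) can be illustrated with a minimal Python sketch. This is not the authors' setup: the toy "parser" below is a most-frequent-label lookup standing in for Charniak and Johnson's reranking parser, (word, label) pairs stand in for parse trees, and the names train, parse and self_train are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

Sentence = List[str]
Analysis = List[Tuple[str, str]]  # (word, label) pairs; a stand-in for a parse tree
Model = Dict[str, str]


def train(analyses: List[Analysis]) -> Model:
    """Toy stand-in for parser training: keep the most frequent label per word."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for analysis in analyses:
        for word, label in analysis:
            counts[word][label] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


def parse(model: Model, sentence: Sentence, default: str = "NN") -> Analysis:
    """Toy stand-in for parsing: look each word up, fall back to a default label."""
    return [(word, model.get(word, default)) for word in sentence]


def self_train(gold: List[Analysis], unlabeled: List[Sentence]) -> Model:
    """One round of self-training: parse unlabeled data, then retrain on gold + automatic output."""
    base_model = train(gold)                                  # trained on source-domain gold data
    automatic = [parse(base_model, s) for s in unlabeled]     # in-domain automatic analyses
    return train(gold + automatic)                            # retrain on the combination


# Hypothetical miniature data: one "source-domain" gold analysis, one "in-domain" raw sentence.
gold_data = [[("the", "DT"), ("dog", "NN"), ("barks", "VBZ")]]
in_domain = [["the", "lorry", "barks"]]
retrained = self_train(gold_data, in_domain)
```

In this sketch the retrained model covers the in-domain word "lorry", which the base model never saw in gold data. It says nothing about accuracy; the improvements reported in the abstract come from the actual parser and corpora.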