Treebank-based multilingual unification-grammar development
Cahill, Aoife and Forst, Martin and McCarthy, Mairéad and O'Donovan, Ruth and Rohrer, Christian and van Genabith, Josef and Way, Andy (2003) Treebank-based multilingual unification-grammar development. In: ESSLLI 2003 - 15th European Summer School in Logic Language and Information, 18-19 August 2003, Vienna, Austria.
Broad-coverage, deep unification grammar development is time-consuming and costly, a problem that can be exacerbated
in multilingual grammar development scenarios. Recently, Cahill et al. (2002) presented a treebank-based methodology
for semi-automatically creating broad-coverage, deep unification grammar resources for English. In this paper we
present a project that adapts this model to a multilingual grammar development scenario to obtain robust, wide-coverage, probabilistic Lexical-Functional Grammars
(LFGs) for English and German via automatic f-structure annotation algorithms based on the Penn-II and TIGER
treebanks. We outline the method used to extract a probabilistic LFG from the TIGER treebank and report on the quality of the f-structures produced. We achieve an f-score of 66.23 on the evaluation of 100 random sentences against a manually constructed gold standard.
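
For readers unfamiliar with the metric, the f-score reported above is the harmonic mean of precision and recall. The following is a minimal sketch of how such a score could be computed when f-structures are compared as sets of triples. The triple representation, the function name `prf`, and the toy German example are illustrative assumptions, not the evaluation procedure used in the paper.

```python
from collections import Counter


def prf(predicted, gold):
    """Return (precision, recall, f-score) for two collections of triples.

    Each triple is a hypothetical (relation, head, dependent) tuple;
    this encoding is an assumption made for illustration only.
    """
    pred_counts = Counter(predicted)
    gold_counts = Counter(gold)
    # Multiset intersection counts each triple at most as often as it
    # occurs in both the predicted and the gold collection.
    matched = sum((pred_counts & gold_counts).values())
    precision = matched / sum(pred_counts.values()) if pred_counts else 0.0
    recall = matched / sum(gold_counts.values()) if gold_counts else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Toy example (invented data): one of three predicted triples is wrong.
gold = [("subj", "sehen", "Hans"), ("obj", "sehen", "Maria"), ("tense", "sehen", "past")]
pred = [("subj", "sehen", "Hans"), ("obj", "sehen", "Hans"), ("tense", "sehen", "past")]
precision, recall, f_score = prf(pred, gold)
print(round(100 * f_score, 2))
```

Multiplying the result by 100 puts it on the same scale as the figure quoted in the abstract.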