Automatic annotation of the Penn-treebank with LFG f-structure information
Cahill, Aoife and McCarthy, Mairéad and van Genabith, Josef and Way, Andy (2002) Automatic annotation of the Penn-treebank with LFG f-structure information. In: LREC 2002 Workshop on Linguistic Knowledge Acquisition and Representation: Bootstrapping Annotated Language Data, 1 June 2002, Las Palmas, Canary Islands.
Lexical-Functional Grammar f-structures are abstract syntactic representations approximating basic predicate-argument structure. Treebanks annotated with f-structure information are required as training resources for stochastic versions of unification and constraint-based grammars and for the automatic extraction of such resources. A number of papers (Frank, 2000; Sadler, van Genabith and Way, 2000) have developed methods for automatically annotating treebank resources with f-structure information. However, to date, these methods have only been applied to treebank fragments of the order of a few hundred trees. In this paper we present a new method that scales and has been applied to a complete treebank, in our case the WSJ section of Penn-II (Marcus et al., 1994), with more than 1,000,000 words in about 50,000 sentences.