Treebank-based acquisition of wide-coverage, probabilistic LFG
resources: project overview, results and evaluation
Burke, Michael, Cahill, Aoife (ORCID: 0000-0002-3519-7726), O'Donovan, Ruth, van Genabith, Josef and Way, Andy (ORCID: 0000-0001-5736-5930)
(2004)
In: IJCNLP-04 Workshop - The First International Joint Conference on Natural Language Processing, 21 March 2004, Sanya City, Hainan Island, China.
This paper presents an overview of a project to acquire wide-coverage, probabilistic Lexical-Functional Grammar (LFG) resources from treebanks. Our approach is based on an automatic annotation algorithm that annotates “raw” treebank trees with LFG f-structure information approximating basic predicate-argument/dependency structure. From the f-structure-annotated treebank we extract probabilistic unification grammar resources. We present the annotation algorithm, the extraction of lexical information, and the acquisition of wide-coverage, robust PCFG-based LFG approximations, including long-distance dependency resolution. We show how the methodology can be applied to multilingual, treebank-based unification grammar acquisition. Finally, we show how simple (quasi-)logical forms can be derived automatically from the f-structures generated for the treebank trees.
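
To make the idea of extracting a probabilistic grammar from an annotated treebank more concrete, the following is a minimal sketch, not the project's implementation. It assumes that each tree node label carries its functional annotation as part of the category symbol and that rule probabilities are estimated by plain relative frequency, which is the standard way to read a PCFG off a treebank. The two toy trees, the bracketed annotation notation (for example `NP[up.subj=down]`), and the names `rules` and `estimate_pcfg` are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# A tree is a tuple (label, child, child, ...); a leaf is a plain string (a word).
# The bracketed suffix on each label is a made-up stand-in for an f-structure
# annotation; it is treated as part of an atomic category symbol.
toy_treebank = [
    ("S[up=down]",
        ("NP[up.subj=down]", ("NNP[up=down]", "John")),
        ("VP[up=down]", ("VBD[up=down]", "slept"))),
    ("S[up=down]",
        ("NP[up.subj=down]", ("NNP[up=down]", "Mary")),
        ("VP[up=down]",
            ("VBD[up=down]", "saw"),
            ("NP[up.obj=down]", ("NNP[up=down]", "John")))),
]


def rules(tree):
    """Yield one (lhs, rhs) pair per local tree, including lexical rules."""
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    yield label, rhs
    for child in children:
        if not isinstance(child, str):
            yield from rules(child)


def estimate_pcfg(treebank):
    """Relative-frequency estimate: P(lhs -> rhs) = count(lhs -> rhs) / count(lhs)."""
    rule_counts = Counter(r for tree in treebank for r in rules(tree))
    lhs_totals = Counter()
    for (lhs, _), n in rule_counts.items():
        lhs_totals[lhs] += n
    return {rule: n / lhs_totals[rule[0]] for rule, n in rule_counts.items()}


pcfg = estimate_pcfg(toy_treebank)
for (lhs, rhs), p in sorted(pcfg.items()):
    print(f"{lhs} -> {' '.join(rhs)}  [{p:.3f}]")
```

Because the annotation is folded into the category symbol, `NP[up.subj=down]` and `NP[up.obj=down]` are counted as distinct nonterminals, so the estimated grammar keeps subject and object noun phrases apart. Parsing with such a grammar and then collecting the annotations from the resulting tree would be one way to obtain the equations needed to build an f-structure, under the assumptions stated above.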