Long-distance dependency resolution in automatically acquired
wide-coverage PCFG-based LFG approximations
Cahill, Aoife (ORCID: 0000-0002-3519-7726), Burke, Michael, O'Donovan, Ruth, van Genabith, Josef and Way, Andy (ORCID: 0000-0001-5736-5930)
(2004)
Long-distance dependency resolution in automatically acquired
wide-coverage PCFG-based LFG approximations.
In: ACL 2004 - 42nd Annual Meeting of the Association for Computational Linguistics, 21-26 July 2004, Barcelona, Spain.
This paper shows how finite approximations of long-distance dependency (LDD) resolution can be obtained automatically for wide-coverage, robust, probabilistic Lexical-Functional Grammar (LFG) resources acquired from treebanks. We extract LFG subcategorisation frames and paths linking LDD reentrancies from f-structures generated automatically for the Penn-II treebank trees, and use them in an LDD resolution algorithm to parse new text. Unlike Collins (1999) and Johnson (2002), our approach resolves LDDs at the level of f-structure (attribute-value structure representations of basic predicate-argument or dependency structure), without empty productions, traces or coindexation in CFG parse trees. Currently, our best automatically induced grammars achieve an f-score of 80.97% for f-structures when parsing section 23 of the WSJ part of the Penn-II treebank and evaluating against the DCU 105, and 80.24% against the PARC 700 Dependency Bank (King et al., 2003), performing at the same or a slightly better level than state-of-the-art hand-crafted grammars (Kaplan et al., 2004).
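
To make the idea of resolving an LDD at f-structure level concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes an f-structure can be modelled as nested dictionaries, that each candidate path for an LDD function (here TOPIC) and each subcategorisation frame for a predicate carries a probability, and that candidates are ranked by the product of the two. The function name resolve_ldd, the toy probabilities, and the simplified local completeness check are all hypothetical and chosen purely for illustration.

```python
def resolve_ldd(fstruct, ldd_fn, path_probs, frame_probs):
    """Rank candidate resolution sites for an LDD function (illustrative only).

    fstruct     -- f-structure as nested dicts (assumed representation)
    ldd_fn      -- name of the LDD function to resolve, e.g. "TOPIC"
    path_probs  -- {ldd_fn: {path tuple: probability}} (hypothetical values)
    frame_probs -- {pred: {frame tuple: probability}} (hypothetical values)
    """
    candidates = []
    for path, p_path in path_probs.get(ldd_fn, {}).items():
        *prefix, gf = path
        # Walk the path prefix down to the local f-structure.
        node = fstruct
        for step in prefix:
            node = node.get(step)
            if not isinstance(node, dict):
                break
        # Skip paths that do not exist or whose final function is already filled.
        if not isinstance(node, dict) or gf in node:
            continue
        # Keep frames of the local predicate that license the missing function.
        # Assumption: every other function in the frame must already be present.
        for frame, p_frame in frame_probs.get(node.get("PRED"), {}).items():
            others_present = all(f in node for f in frame if f != gf)
            if gf in frame and others_present:
                candidates.append((p_path * p_frame, path, frame))
    candidates.sort(key=lambda c: c[0], reverse=True)
    return candidates


# Toy f-structure for "U.N. signs treaty, the headline said" (hand-built).
fstruct = {
    "PRED": "say",
    "SUBJ": {"PRED": "headline"},
    "TOPIC": {"PRED": "sign", "SUBJ": {"PRED": "U.N."}, "OBJ": {"PRED": "treaty"}},
}
# Made-up probabilities, not figures from the paper.
path_probs = {"TOPIC": {("COMP",): 0.5, ("COMP", "OBJ"): 0.25, ("OBJ",): 0.125}}
frame_probs = {"say": {("SUBJ", "COMP"): 0.5, ("SUBJ", "OBJ"): 0.25}}

ranked = resolve_ldd(fstruct, "TOPIC", path_probs, frame_probs)
best_score, best_path, best_frame = ranked[0]

# Resolve the dependency as a reentrancy: the endpoint shares the TOPIC value.
target = fstruct
for step in best_path[:-1]:
    target = target[step]
target[best_path[-1]] = fstruct["TOPIC"]
```

In this toy case the top-ranked candidate links TOPIC to the COMP of "say", and the resolution is recorded as a shared value inside the f-structure rather than as a trace or empty node in the parse tree.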