Burke, Michael (2006) Automatic treebank annotation for the acquisition of LFG resources. PhD thesis, Dublin City University.
Abstract
Traditionally, rich, constraint-based grammatical resources have been hand-coded. Scaling wide-coverage, deep, constraint-based grammars such as Lexical-Functional Grammars from fragments to naturally occurring unrestricted text is knowledge-intensive, time-consuming and (often prohibitively) expensive.
Based on earlier work by McCarthy (2003), this thesis presents the development and evaluation of an automatic LFG f-structure annotation algorithm. The algorithm is the core component of a larger project on rapid, wide-coverage, deep, constraint-based, multilingual grammar acquisition, which addresses the knowledge acquisition bottleneck familiar from traditional rule-based approaches to NLP and AI. The algorithm annotates the Penn-II treebank with LFG f-structure information. Grammars and lexical resources are then extracted from the f-structure-annotated treebank. Extensive evaluation of the annotation algorithm against independently constructed gold standards (PARC 700 Dependency Bank and PropBank) shows the quality of the f-structures acquired.
The methodology developed in this thesis has been deployed for multilingual, rapid grammar development: grammars and lexical resources for Mandarin Chinese were acquired from the Penn Chinese Treebank (CTB) using a generic version of the annotation algorithm, seeded with linguistic generalisations for Mandarin Chinese.
Metadata
Item Type: | Thesis (PhD) |
---|---|
Date of Award: | 2006 |
Refereed: | No |
Supervisor(s): | van Genabith, Josef and Way, Andy |
Uncontrolled Keywords: | Lexical-Functional Grammars; LFG; multilingual grammar acquisition |
Subjects: | Computer Science > Machine translating; Humanities > Linguistics |
DCU Faculties and Centres: | DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing |
Use License: | This item is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 3.0 License. |
ID Code: | 17357 |
Deposited On: | 31 Aug 2012 10:25 by Fran Callaghan. Last Modified 19 Jul 2018 14:57 |
Documents
Full text available as:
PDF (3MB)