Automatic treebank annotation for the acquisition of LFG resources

Burke, Michael

Burke, Michael (2006) Automatic treebank annotation for the acquisition of LFG resources. PhD thesis, Dublin City University.

Abstract

Traditionally, rich, constraint-based grammatical resources have been hand-coded. Scaling wide-coverage, deep, constraint-based grammars such as Lexical-Functional Grammars (LFG) from fragments to naturally occurring unrestricted text is knowledge-intensive, time-consuming and (often prohibitively) expensive. Based on earlier work by McCarthy (2003), this thesis presents the development and evaluation of an automatic LFG f-structure annotation algorithm, the core component of a larger project on rapid, wide-coverage, deep, constraint-based, multilingual grammar acquisition, addressing the knowledge acquisition bottleneck familiar from traditional rule-based approaches to NLP and AI. The algorithm annotates the Penn-II Treebank with LFG f-structure information. Grammars and lexical resources are then extracted from the f-structure-annotated treebank. Extensive evaluation of the annotation algorithm against independently constructed gold standards (PARC 700 Dependency Bank and PropBank) shows the quality of the f-structures acquired. The methodology developed in this thesis has been deployed for multilingual, rapid grammar development: grammars and lexical resources for Mandarin Chinese were acquired from the Penn Chinese Treebank (CTB) using a generic version of the annotation algorithm, seeded with linguistic generalisations for Mandarin Chinese.

Metadata

Item Type:	Thesis (PhD)
Date of Award:	2006
Refereed:	No
Supervisor(s):	van Genabith, Josef and Way, Andy
Uncontrolled Keywords:	Lexical-Functional Grammars; LFG; multilingual grammar acquisition
Subjects:	Computer Science > Machine translating; Humanities > Linguistics
DCU Faculties and Centres:	DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing
Use License:	This item is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 3.0 License.
ID Code:	17357
Deposited On:	31 Aug 2012 10:25. Last Modified: 19 Jul 2018 14:57

Documents

Full text available as:

michael_burke_20120704145355.pdf (PDF, 3MB). Requires a PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader.

DORAS | DCU Research Repository
