DORAS | DCU Research Repository

Automatic annotation of the Penn-treebank with LFG f-structure information

Cahill, Aoife (ORCID: 0000-0002-3519-7726), McCarthy, Mairéad, van Genabith, Josef (ORCID: 0000-0003-1322-7944) and Way, Andy (ORCID: 0000-0001-5736-5930) (2002) Automatic annotation of the Penn-treebank with LFG f-structure information. In: LREC 2002 Workshop on Linguistic Knowledge Acquisition and Representation: Bootstrapping Annotated Language Data, 1 June 2002, Las Palmas, Canary Islands.

Abstract
Lexical-Functional Grammar f-structures are abstract syntactic representations approximating basic predicate-argument structure. Treebanks annotated with f-structure information are required as training resources for stochastic versions of unification and constraint-based grammars and for the automatic extraction of such resources. A number of papers (Frank, 2000; Sadler, van Genabith and Way, 2000) have developed methods for automatically annotating treebank resources with f-structure information. However, to date, these methods have only been applied to treebank fragments of the order of a few hundred trees. In this paper we present a new method that scales and has been applied to a complete treebank, in our case the WSJ section of Penn-II (Marcus et al., 1994), with more than 1,000,000 words in about 50,000 sentences.
Metadata
Item Type: Conference or Workshop Item (Paper)
Event Type: Workshop
Refereed: No
Uncontrolled Keywords: lexical-functional grammar
Subjects: Computer Science > Machine translating
DCU Faculties and Centres: Research Institutes and Centres > National Centre for Language Technology (NCLT)
DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing
Official URL: http://www.lrec-conf.org/lrec2002/lrec/wksh/Bootst...
Use License: This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
Funders: Enterprise Ireland, EI SC/2001/186
ID Code: 15321
Deposited On: 18 Mar 2010 16:10 by DORAS Administrator. Last Modified: 21 Jan 2022 16:36
Documents

Full text available as:

cahill_et_al_02a.pdf (PDF, 70kB)