Cahill, Aoife ORCID: 0000-0002-3519-7726, McCarthy, Mairéad, van Genabith, Josef ORCID: 0000-0003-1322-7944 and Way, Andy ORCID: 0000-0001-5736-5930 (2002) Parsing with PCFGs and automatic f-structure annotation. In: LFG02 - 7th International Lexical Functional Grammar Conference, 3-5 July, 2002, Athens, Greece. ISBN 1098-6782
Abstract
Developing large-coverage, rich unification- (constraint-) based grammar resources is very time-consuming and expensive, and requires considerable linguistic expertise. In this paper we report initial results on a new methodology that attempts to partially automate the development of substantial parts of such resources. The method is based on a treebank resource (in our case Penn-II) and an automatic f-structure annotation algorithm that annotates treebank trees with proto-f-structure information. Based on these, we present two parsing architectures. In our pipeline architecture, we first extract a PCFG from the treebank following the method of Charniak (1996), use the PCFG to parse new text, automatically annotate the resulting trees with our f-structure annotation algorithm, and generate proto-f-structures. By contrast, in the integrated architecture we first automatically annotate the treebank trees with f-structure information and then extract an annotated PCFG (A-PCFG) from the treebank. We then use the A-PCFG to parse new text and generate proto-f-structures. Currently our best parsers achieve more than 81% f-score on the 2400 trees in section 23 of the Penn-II treebank and more than 60% f-score on gold-standard proto-f-structures for 105 randomly selected trees from section 23.
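The PCFG extraction step that both architectures rely on can be illustrated with a minimal sketch. It assumes relative-frequency estimation over the local subtrees of each treebank tree, which is the usual reading of treebank grammar extraction. The tuple-based tree encoding, the function names `rules` and `extract_pcfg`, and the two toy trees are illustrative assumptions, not taken from the paper or from Penn-II.

```python
from collections import Counter

# Assumed encoding: a tree is (label, child, ...); a leaf is a plain string (a word).


def rules(tree):
    """Yield one (lhs, rhs) pair for every local subtree of `tree`."""
    label, *children = tree
    # The right-hand side lists child labels, or the word itself for a leaf.
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    yield label, rhs
    for child in children:
        if not isinstance(child, str):
            yield from rules(child)


def extract_pcfg(treebank):
    """Estimate P(lhs -> rhs) as count(lhs -> rhs) / count(lhs)."""
    rule_counts = Counter(r for tree in treebank for r in rules(tree))
    lhs_totals = Counter()
    for (lhs, _rhs), n in rule_counts.items():
        lhs_totals[lhs] += n
    return {rule: n / lhs_totals[rule[0]] for rule, n in rule_counts.items()}


# Hypothetical two-tree "treebank", for illustration only.
TOY_TREEBANK = [
    ("S", ("NP", ("NNP", "John")), ("VP", ("VBD", "slept"))),
    ("S", ("NP", ("DT", "the"), ("NN", "dog")), ("VP", ("VBD", "barked"))),
]

pcfg = extract_pcfg(TOY_TREEBANK)
print(pcfg[("NP", ("DT", "NN"))])  # 0.5
```

In the pipeline architecture, counting of this kind would run over the plain treebank trees. In the integrated architecture, the same counting would presumably run over trees whose node labels already carry f-structure annotations, so that differently annotated occurrences of a category are counted as distinct symbols; how the annotations are encoded in the labels is not specified in this record.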
Metadata
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Event Type: | Conference |
Refereed: | Yes |
Uncontrolled Keywords: | probabilistic context-free grammar |
Subjects: | Computer Science > Machine translating |
DCU Faculties and Centres: | Research Institutes and Centres > National Centre for Language Technology (NCLT); DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing |
Published in: | Proceedings of the LFG 02 Conference. CSLI Publications. ISBN 1098-6782 |
Publisher: | CSLI Publications |
Official URL: | http://csli-publications.stanford.edu/LFG/7/lfg02-... |
Use License: | This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License. |
Funders: | Enterprise Ireland, EI SC/2001/186 |
ID Code: | 15346 |
Deposited On: | 09 Apr 2010 10:32 by DORAS Administrator. Last modified 21 Jan 2022 16:36 |