Two approaches to automatic matching of atomic grammatical features in LFG

Bryl, Anton and van Genabith, Josef (2010) Two approaches to automatic matching of atomic grammatical features in LFG. In: LFG10 Conference, 18-20 July 2010, Ottawa, Canada.

Abstract

The alignment of a bilingual corpus is an important step in data preparation for data-driven machine translation. LFG f-structures provide bilexical labelled dependencies in the form of lemmas and the core grammatical functions linking those lemmas, but also important grammatical features (TENSE, NUMBER, CASE, etc.) representing morphological and semantic information. These grammatical features can often be translated independently of the lemmas or words. For data-driven LFG-based MT, it is therefore of practical interest to develop methods that align grammatical features which can be considered translations of each other (e.g. the number features of corresponding words in the source and target parts of the corpus). In a parallel grammar development scenario, such as ParGram, this is to a large extent captured by manually hard-coding the correspondences in the hand-crafted grammars, using similar or identical feature names for similar phenomena across languages. For a completely automatic learning method, however, it is desirable to establish these correspondences without human assistance. In this paper we present and evaluate two approaches to the automatic identification of correspondences between atomic features of LFG (and similar) grammars for different languages. The methods can be used to evaluate the correspondence between feature names in hand-crafted parallel grammars, or to find correspondences between features in grammars for different languages where feature alignments are not known.

Item Type: Conference or Workshop Item (Paper)
Event Type: Conference
Refereed: Yes
Uncontrolled Keywords: LFG Grammars
Subjects: Computer Science > Machine translating
DCU Faculties and Centres: Research Initiatives and Centres > Centre for Next Generation Localisation (CNGL)
Research Initiatives and Centres > National Centre for Language Technology (NCLT)
DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing
Published in: Proceedings of the LFG10 Conference. CSLI Publications.
Publisher: CSLI Publications
Official URL: http://cslipublications.stanford.edu/LFG/15/papers/lfg10brylvangenabith.pdf
Copyright Information: © 2010 CSLI Publications.
Use License: This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
ID Code: 16016
Deposited On: 19 May 2011 10:36 by Shane Harper. Last Modified 19 May 2011 10:36
