Browse DORAS
Browse Theses
Search
Latest Additions
Creative Commons License
Except where otherwise noted, content on this site is licensed for use under a:

Capturing translational divergences with a statistical tree-to-tree aligner

Hearne, Mary and Tinsley, John and Zhechev, Ventsislav and Way, Andy (2007) Capturing translational divergences with a statistical tree-to-tree aligner. In: TMI-07 - Proceedings of The 11th Conference on Theoretical and Methodological Issues in Machine Translation, 7-9 September 2007, Skövde, Sweden.

Full text available as:

[img]
Preview
PDF - Requires a PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader
164Kb

Abstract

Parallel treebanks, which comprise paired source-target parse trees aligned at sub-sentential level, could be useful for many applications, particularly data-driven machine translation. In this paper, we focus on how translational divergences are captured within a parallel treebank using a fully automatic statistical tree-to-tree aligner. We observe that while the algorithm performs well at the phrase level, performance on lexical-level alignments is compromised by an inappropriate bias towards coverage rather than precision. This preference for high precision rather than broad coverage in terms of expressing translational divergences through tree-alignment stands in direct opposition to the situation for SMT word-alignment models. We suggest that this has implications not only for tree-alignment itself but also for the broader area of induction of syntaxaware models for SMT.

Item Type:Conference or Workshop Item (Paper)
Event Type:Conference
Refereed:Yes
Uncontrolled Keywords:statistical machine translation;
Subjects:Computer Science > Machine translating
DCU Faculties and Centres:Research Initiatives and Centres > National Centre for Language Technology (NCLT)
Official URL:http://www.computing.dcu.ie/~away/TMI-07/
Use License:This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License. View License
Funders:Science Foundation Ireland, SFI 05/RF/CMS064
ID Code:15223
Deposited On:18 Feb 2010 11:55 by DORAS Administrator. Last Modified 27 Apr 2010 14:37

Download statistics

Archive Staff Only: edit this record