Robust language pair-independent sub-tree alignment
Tinsley, John and Zhechev, Ventsislav and Hearne, Mary and Way, Andy (2007) Robust language pair-independent sub-tree alignment. In: Machine Translation Summit XI, 10-14 September, 2007, Copenhagen, Denmark.
Data-driven approaches to machine translation (MT) achieve state-of-the-art results. Many syntax-aware approaches, such as Example-Based MT and Data-Oriented Translation, make use of tree pairs aligned at sub-sentential level. Obtaining sub-sentential alignments manually is time-consuming and error-prone, and requires expert knowledge of both source and target languages. We propose a novel, language pair-independent algorithm which automatically induces alignments between phrase-structure trees. We evaluate the alignments themselves against a manually aligned gold standard, and perform an extrinsic evaluation by using the aligned data to train and test a DOT system. Our results show that translation accuracy is comparable to that of the same translation system trained on manually aligned data, and coverage improves.
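The intrinsic evaluation described above, comparing induced alignments against a manually aligned gold standard, can be illustrated with a minimal sketch. It assumes each alignment is a set of (source node, target node) links and scores it with precision, recall and F1. The abstract does not state which metrics the authors used, so both the metrics and the node labels here are assumptions for illustration only.

```python
from typing import Set, Tuple

# Hypothetical representation: a link pairs a source-tree node id
# with a target-tree node id.
Link = Tuple[str, str]


def score_alignment(predicted: Set[Link], gold: Set[Link]) -> Tuple[float, float, float]:
    """Return (precision, recall, f1) of predicted links against gold links."""
    correct = len(predicted & gold)  # links present in both sets
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy tree pair with invented node labels (not data from the paper).
gold = {("S", "S"), ("NP1", "NP1"), ("VP", "VP"), ("NP2", "NP2")}
predicted = {("S", "S"), ("NP1", "NP1"), ("VP", "NP2")}

precision, recall, f1 = score_alignment(predicted, gold)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Treating links as set members keeps the comparison independent of the language pair, since only node identities are compared.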