Highlighting matched and mismatched segments in translation
memory output through sub-tree alignment

Zhechev, Ventsislav

Zhechev, Ventsislav (2010) Highlighting matched and mismatched segments in translation memory output through sub-tree alignment. In: the Translating and the Computer Conference 2010 (T&C ’10), 18-19 November 2010, London, UK.


Abstract

In recent years, it has become increasingly clear that the localisation industry does not have the manpower needed to satisfy the growing demand for high-quality translation. This has fuelled the search for new and existing technologies that would increase translator throughput. As Translation Memory (TM) systems are the tool most commonly employed by translators, a number of enhancements are available to assist them in their job. One such enhancement would be to show the translator which parts of the sentence that needs to be translated match which parts of the fuzzy match suggested by the TM. For this information to be used, however, the translators have to carry it over to the TM translation themselves. In this paper, we present a novel methodology that can automatically detect and highlight the segments that need to be modified in a TM-suggested translation. We base it on state-of-the-art sub-tree alignment technology (Zhechev, 2010) that can produce aligned phrase-based-tree pairs from unannotated data. Our system operates in a three-step process. First, the fuzzy match selected by the TM and its translation are aligned. This tells us which segments of the source-language sentence correspond to which segments in its translation. In the second step, the fuzzy match is aligned to the input sentence that is currently being translated. This tells us which parts of the input sentence are available in the fuzzy match and which still need to be translated. In the third step, the fuzzy match is used as an intermediary, through which the alignments between the input sentence and the TM translation are established. In this way, we can detect with precision the segments in the suggested translation that the translator needs to edit and highlight them appropriately, setting them apart from the segments that are already good translations for parts of the input sentence.
Additionally, we can show the alignments detected by our system between the input and the translation, which will make it even easier for the translator to post-edit the TM suggestion. This alignment information can also be used to pre-translate the mismatched segments, further reducing the post-editing load.
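
The third step of the process described in the abstract, using the fuzzy match as an intermediary, can be illustrated with a minimal Python sketch. This is not the system presented in the paper: the sub-tree aligner is not reproduced, and the alignments from the first two steps are written out by hand. The names `Span`, `Alignment`, `compose` and `highlight`, the representation of aligned segments as token ranges, and the English-French toy sentences are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

# Assumption: an aligned segment is a half-open token range [start, end),
# and an alignment is a set of (source span, target span) pairs.
Span = Tuple[int, int]
Alignment = Set[Tuple[Span, Span]]


def compose(input_to_fuzzy: Alignment, fuzzy_to_target: Alignment) -> Alignment:
    """Step 3: link input spans to target spans through shared fuzzy-match spans."""
    by_fuzzy: Dict[Span, List[Span]] = {}
    for fuzzy_span, target_span in fuzzy_to_target:
        by_fuzzy.setdefault(fuzzy_span, []).append(target_span)
    return {
        (input_span, target_span)
        for input_span, fuzzy_span in input_to_fuzzy
        for target_span in by_fuzzy.get(fuzzy_span, [])
    }


def highlight(target_tokens: List[str], input_to_target: Alignment) -> str:
    """Bracket every target token that no input segment is aligned to."""
    covered: Set[int] = set()
    for _, (start, end) in input_to_target:
        covered.update(range(start, end))
    return " ".join(
        tok if i in covered else f"[{tok}]"
        for i, tok in enumerate(target_tokens)
    )


# Invented toy data (not from the paper).
# Input sentence:        open the blue file
# Fuzzy match in the TM: open the red file
target = "ouvrez le fichier rouge".split()  # TM translation of the fuzzy match

# Step 1 (hand-written): fuzzy match <-> its translation.
fuzzy_to_target = {
    ((0, 1), (0, 1)),  # open -> ouvrez
    ((1, 2), (1, 2)),  # the  -> le
    ((2, 3), (3, 4)),  # red  -> rouge
    ((3, 4), (2, 3)),  # file -> fichier
}
# Step 2 (hand-written): input <-> fuzzy match; "blue" has no counterpart.
input_to_fuzzy = {
    ((0, 1), (0, 1)),
    ((1, 2), (1, 2)),
    ((3, 4), (3, 4)),
}

input_to_target = compose(input_to_fuzzy, fuzzy_to_target)
print(highlight(target, input_to_target))  # ouvrez le fichier [rouge]
```

The bracketed token marks the part of the TM suggestion that the translator would still need to edit, while the composed `input_to_target` pairs are the kind of alignment information that could be displayed alongside it.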

Metadata

Item Type:	Conference or Workshop Item (Paper)
Event Type:	Conference
Refereed:	Yes
Uncontrolled Keywords:	Translation Memory Systems; TM systems
Subjects:	Computer Science > Machine translating
DCU Faculties and Centres:	Research Institutes and Centres > National Centre for Language Technology (NCLT); DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing
Use License:	This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
ID Code:	16020
Deposited On:	01 Jun 2011 13:45 by Shane Harper. Last Modified 19 Jul 2018 14:52

Documents

Full text available as:

PDF (637kB) - Requires a PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader
