Zhechev, Ventsislav (2010) Highlighting matched and mismatched segments in translation memory output through sub-tree alignment. In: Translating and the Computer Conference 2010 (T&C ’10), 18-19 November 2010, London, UK.
Abstract
In recent years, it has become increasingly clear that the
localisation industry does not have the necessary manpower to satisfy the increasing demand for high-quality translation. This has fuelled the search for new and existing technologies that would increase translator throughput. As Translation Memory (TM) systems are the tool most commonly employed by translators, a number of enhancements are
available to assist them in their job. One such enhancement would be to show the translator which parts of the sentence
that needs to be translated match which parts of the fuzzy
match suggested by the TM. For this information to be used,
however, the translators have to carry it over to the TM
translation themselves. In this paper, we present a novel methodology that can automatically detect and highlight
the segments that need to be modified in a TM-suggested
translation. We base it on state-of-the-art sub-tree alignment
technology (Zhechev, 2010) that can produce aligned
phrase-based-tree pairs from unannotated data. Our system
operates in a three-step process. First, the fuzzy match
selected by the TM and its translation are aligned. This
lets us know which segments of the source-language sentence
correspond to which segments in its translation. In the
second step, the fuzzy match is aligned to the input sentence that is currently being translated. This tells us
which parts of the input sentence are available in the fuzzy
match and which still need to be translated. In the third
step, the fuzzy match is used as an intermediary, through
which the alignments between the input sentence and the TM
translation are established. In this way, we can detect with
precision the segments in the suggested translation that the
translator needs to edit and highlight them appropriately to
set them apart from the segments that are already good translations for parts of the input sentence. Additionally,
we can show the alignments, as detected by our system, between
the input and the translation, which will make it even easier for the translator to post-edit the TM suggestion. This alignment information can also be used to
pre-translate the mismatched segments, further reducing the post-editing load.
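The three-step process described above can be illustrated with a minimal sketch. It makes several assumptions that are not in the abstract: alignments are represented as sets of token-index pairs rather than aligned sub-trees, the two input alignments are written by hand instead of being produced by the sub-tree aligner, and the names `compose`, `mark_target`, and the example sentences are hypothetical. It only shows how the fuzzy match can serve as an intermediary between the input sentence and the TM translation.

```python
from typing import Dict, List, Set, Tuple

# An alignment is a set of (left token index, right token index) pairs.
Alignment = Set[Tuple[int, int]]


def compose(input_to_fuzzy: Alignment, fuzzy_to_target: Alignment) -> Alignment:
    """Step 3: link input tokens to target tokens via shared fuzzy-match tokens."""
    by_fuzzy: Dict[int, List[int]] = {}
    for f, t in fuzzy_to_target:
        by_fuzzy.setdefault(f, []).append(t)
    return {(i, t) for i, f in input_to_fuzzy for t in by_fuzzy.get(f, [])}


def mark_target(target: List[str], input_to_target: Alignment) -> List[Tuple[str, bool]]:
    """Pair each target token with True (matched) or False (needs editing)."""
    matched = {t for _, t in input_to_target}
    return [(tok, idx in matched) for idx, tok in enumerate(target)]


# Hypothetical example data (assumed, not taken from the paper).
input_sent = "Press the red button".split()
fuzzy_match = "Press the green button".split()
tm_target = "Druk op de groene knop".split()

# Step 1 (assumed output): fuzzy match aligned to its TM translation.
fuzzy_to_target: Alignment = {(0, 0), (0, 1), (1, 2), (2, 3), (3, 4)}
# Step 2 (assumed output): input aligned to the fuzzy match; "red" has no match.
input_to_fuzzy: Alignment = {(0, 0), (1, 1), (3, 3)}

input_to_target = compose(input_to_fuzzy, fuzzy_to_target)
marked = mark_target(tm_target, input_to_target)

# Target tokens the translator would need to edit.
to_edit = [tok for tok, ok in marked if not ok]
# Input tokens that still need to be translated.
covered = {i for i, _ in input_to_target}
untranslated = [tok for i, tok in enumerate(input_sent) if i not in covered]
```

In this toy setting, `to_edit` holds the target tokens that would be highlighted for modification and `untranslated` holds the input tokens with no counterpart in the suggestion, which are the candidates for pre-translation mentioned above.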
Metadata
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Event Type: | Conference |
Refereed: | Yes |
Uncontrolled Keywords: | Translation Memory Systems; TM systems |
Subjects: | Computer Science > Machine translating |
DCU Faculties and Centres: | Research Institutes and Centres > National Centre for Language Technology (NCLT); DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing |
Use License: | This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License. |
ID Code: | 16020 |
Deposited On: | 01 Jun 2011 13:45 by Shane Harper. Last Modified 19 Jul 2018 14:52 |