DORAS | DCU Research Repository

Combining translation memories and syntax-based SMT: experiments with real industrial data

Li, Liangyou (ORCID: 0000-0002-0279-003X), Parra Escartín, Carla (ORCID: 0000-0002-8412-1525) and Liu, Qun (ORCID: 0000-0002-7000-1792) (2016) Combining translation memories and syntax-based SMT: experiments with real industrial data. Baltic Journal of Modern Computing, 4 (2). pp. 165-177. ISSN 2255-8942

Abstract
One major drawback of using Translation Memories (TMs) in phrase-based Machine Translation (MT) is that only continuous phrases are considered. In contrast, syntax-based MT allows phrasal discontinuity by learning translation rules containing non-terminals. In this paper, we combine a TM with syntax-based MT via sparse features. These features are extracted during decoding based on translation rules and their corresponding patterns in the TM. We have tested this approach by carrying out experiments on real English–Spanish industrial data. Our results show that these TM features significantly improve syntax-based MT. Our final system yields improvements of up to +3.1 BLEU, +1.6 METEOR, and -2.6 TER when compared with a state-of-the-art phrase-based MT system.
Metadata
Item Type: Article (Published)
Refereed: Yes
Uncontrolled Keywords: translation memory; syntax-based SMT
Subjects: Computer Science > Machine translating
DCU Faculties and Centres: DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; Research Institutes and Centres > ADAPT
Publisher: Latvijas Universitate
Official URL: https://www.bjmc.lu.lv/fileadmin/user_upload/lu_po...
Copyright Information: © 2016 Latvijas Universitate
Use License: This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
Funders: People Programme (Marie Curie Actions) of the European Union’s Framework Programme (FP7/2007-2013) under REA grant agreement no 317471; The ADAPT Centre for Digital Content Technology is funded under the SFI Research Centres Programme (Grant 13/RC/2106) and is co-funded under the European Regional Development Fund
ID Code: 23307
Deposited On: 16 May 2019 12:04 by Thomas Murtagh. Last Modified 16 May 2019 12:59
Documents

Full text available as: Combining_Translation_Memories_and_Syntax-Based_SMT[1].pdf (PDF, 247kB)