Combining translation memories and syntax-based SMT: experiments with real industrial data

Li, Liangyou; Parra Escartín, Carla; Liu, Qun

Li, Liangyou ORCID: 0000-0002-0279-003X, Parra Escartín, Carla ORCID: 0000-0002-8412-1525 and Liu, Qun ORCID: 0000-0002-7000-1792 (2016) Combining translation memories and syntax-based SMT: experiments with real industrial data. Baltic Journal of Modern Computing, 4 (2). pp. 165-177. ISSN 2255-8942

Abstract

One major drawback of using Translation Memories (TMs) in phrase-based Machine Translation (MT) is that only continuous phrases are considered. In contrast, syntax-based MT allows phrasal discontinuity by learning translation rules containing non-terminals. In this paper, we combine a TM with syntax-based MT via sparse features. These features are extracted during decoding based on translation rules and their corresponding patterns in the TM. We have tested this approach by carrying out experiments on real English–Spanish industrial data. Our results show that these TM features significantly improve syntax-based MT. Our final system yields improvements of up to +3.1 BLEU, +1.6 METEOR, and -2.6 TER when compared with a state-of-the-art phrase-based MT system.
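
To make the idea of TM-based sparse features more concrete, the following is a minimal illustrative sketch and not the feature set used in the paper. It assumes a rule is a pair of token lists in which non-terminals are written X1, X2, and so on, and that a single TM entry (source and target sentence) has already been retrieved for the input. The feature names and the helper functions `is_nonterminal`, `terminal_chunks`, `matches_in_order` and `tm_features` are hypothetical. A feature fires depending on whether the terminal chunks of the rule occur, in order and possibly with gaps, in the TM entry.

```python
from typing import Dict, List


def is_nonterminal(token: str) -> bool:
    # Assumed convention for this sketch: non-terminals look like X1, X2, ...
    return len(token) > 1 and token[0] == "X" and token[1:].isdigit()


def terminal_chunks(side: List[str]) -> List[List[str]]:
    """Split one side of a rule into contiguous terminal chunks at non-terminals."""
    chunks: List[List[str]] = []
    current: List[str] = []
    for tok in side:
        if is_nonterminal(tok):
            if current:
                chunks.append(current)
                current = []
        else:
            current.append(tok)
    if current:
        chunks.append(current)
    return chunks


def matches_in_order(chunks: List[List[str]], sentence: List[str]) -> bool:
    """True if every chunk occurs contiguously in the sentence, left to right.

    Gaps of any length (including zero) are allowed between chunks, which is
    how this sketch models the material covered by non-terminals.
    """
    start = 0
    for chunk in chunks:
        n = len(chunk)
        found = -1
        for i in range(start, len(sentence) - n + 1):
            if sentence[i:i + n] == chunk:
                found = i
                break
        if found < 0:
            return False
        start = found + n
    return True


def tm_features(rule_src: List[str], rule_tgt: List[str],
                tm_src: List[str], tm_tgt: List[str]) -> Dict[str, float]:
    """Return a sparse feature map for one rule against one TM entry."""
    src_ok = matches_in_order(terminal_chunks(rule_src), tm_src)
    tgt_ok = matches_in_order(terminal_chunks(rule_tgt), tm_tgt)
    if src_ok and tgt_ok:
        return {"tm_src_tgt_match": 1.0}
    if src_ok:
        return {"tm_src_only_match": 1.0}
    return {"tm_no_match": 1.0}


# Made-up English-Spanish example with a discontinuous rule.
features = tm_features(
    "click X1 to open".split(),
    "haga clic en X1 para abrir".split(),
    "click the icon to open the file".split(),
    "haga clic en el icono para abrir el archivo".split(),
)
print(features)
```

In a real decoder such indicator features would be added to the log-linear model and their weights tuned together with the other features; the sketch only shows how a rule containing non-terminals can be checked against a TM entry without requiring a continuous phrase.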

Metadata

Item Type:	Article (Published)
Refereed:	Yes
Uncontrolled Keywords:	translation memory; syntax-based SMT
Subjects:	Computer Science > Machine translating
DCU Faculties and Centres:	DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; Research Institutes and Centres > ADAPT
Publisher:	Latvijas Universitate
Official URL:	https://www.bjmc.lu.lv/fileadmin/user_upload/lu_po...
Copyright Information:	© 2016 Latvijas Universitate
Use License:	This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
Funders:	People Programme (Marie Curie Actions) of the European Union’s Framework Programme (FP7/2007-2013) under REA grant agreement no 317471, The ADAPT Centre for Digital Content Technology is funded under the SFI Research Centres Programme (Grant 13/RC/2106) and is co-funded under the European Regional Development Fund
ID Code:	23307
Deposited On:	16 May 2019 12:04. Last Modified 16 May 2019 12:59

Documents

Full text available as:

PDF - Requires a PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader
247kB

DORAS | DCU Research Repository
