DORAS
DCU Online Research Access Service
Combining translation memories and syntax-based SMT: experiments with real industrial data

Li, Liangyou ORCID: 0000-0002-0279-003X, Parra Escartín, Carla ORCID: 0000-0002-8412-1525 and Liu, Qun ORCID: 0000-0002-7000-1792 (2016) Combining translation memories and syntax-based SMT: experiments with real industrial data. Baltic Journal of Modern Computing, 4 (2). pp. 165-177. ISSN 2255-8942

Abstract

One major drawback of using Translation Memories (TMs) in phrase-based Machine Translation (MT) is that only continuous phrases are considered. In contrast, syntax-based MT allows phrasal discontinuity by learning translation rules containing non-terminals. In this paper, we combine a TM with syntax-based MT via sparse features. These features are extracted during decoding based on translation rules and their corresponding patterns in the TM. We have tested this approach by carrying out experiments on real English–Spanish industrial data. Our results show that these TM features significantly improve syntax-based MT. Our final system yields improvements of up to +3.1 BLEU, +1.6 METEOR, and -2.6 TER when compared with a state-of-the-art phrase-based MT system.

Item Type: Article (Published)
Refereed: Yes
Uncontrolled Keywords: translation memory; syntax-based SMT
Subjects: Computer Science > Machine translating
DCU Faculties and Centres: DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; Research Initiatives and Centres > ADAPT
Publisher: Latvijas Universitate
Official URL: https://www.bjmc.lu.lv/fileadmin/user_upload/lu_portal/projekti/bjmc/Contents/4_2_9_Li.pdf
Copyright Information: © 2016 Latvijas Universitate
Use License: This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
Funders: People Programme (Marie Curie Actions) of the European Union’s Framework Programme (FP7/2007-2013) under REA grant agreement no 317471, The ADAPT Centre for Digital Content Technology is funded under the SFI Research Centres Programme (Grant 13/RC/2106) and is co-funded under the European Regional Development Fund
ID Code: 23307
Deposited On: 16 May 2019 12:04 by Thomas Murtagh. Last Modified 16 May 2019 12:59
