Domain adaptation for social localisation-based SMT: a case study using the Trommons platform
Du, Jinhua (ORCID: 0000-0002-3267-4881), Way, Andy (ORCID: 0000-0001-5736-5930), Qui, Zhengwei, Wasala, Asanka and Schäler, Reinhard
(2015)
In: MT Summit Workshop on Post-Editing Technology and Practice (WPTP4) as part of Machine Translation Summit XV, 30 Oct-3 Nov, 2015, Miami, FL, USA.
Social localisation is a form of community action that matches communities with the content
they need and supports their localisation efforts. The goal of social localisation-based statistical machine translation (SL-SMT) is to support and bridge global communities exchanging
any type of digital content across different languages and cultures. Trommons is an open
platform maintained by The Rosetta Foundation to connect non-profit translation projects and
organisations with the skills and interests of volunteer translators, where they can translate,
post-edit or proofread different types of documents. Using Trommons as the experimental
platform, this paper focuses on domain adaptation techniques that augment SL-SMT to assist
translators and post-editors. Specifically, the Cross Entropy Difference algorithm is used to adapt
Europarl data to the social localisation data. Experimental results on English–Spanish show
that the domain adaptation techniques can significantly improve translation performance by
6.82 absolute BLEU points and 5.99 absolute TER points compared to the baseline.
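The data selection idea named in the abstract can be illustrated with a minimal sketch. Cross-entropy difference selection scores each general-domain sentence by its per-word cross-entropy under an in-domain language model minus its cross-entropy under a general-domain model, and keeps the lowest-scoring sentences. The code below is an assumption-laden illustration, not the authors' implementation: it uses add-one-smoothed unigram models in place of the n-gram language models a real system would use, trains the general model on the whole general corpus rather than a sample, and the names `train_unigram` and `select_by_ced` are hypothetical.

```python
import math
from collections import Counter


def train_unigram(sentences):
    """Return a function giving per-word cross-entropy (bits) under an
    add-one-smoothed unigram model trained on `sentences`.
    Assumption: unigrams stand in for a proper n-gram language model."""
    counts = Counter(tok for s in sentences for tok in s.split())
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot reserved for unseen words

    def cross_entropy(sentence):
        toks = sentence.split()
        if not toks:
            return 0.0
        log_prob = sum(math.log2((counts[t] + 1) / (total + vocab)) for t in toks)
        return -log_prob / len(toks)

    return cross_entropy


def select_by_ced(in_domain, general, keep):
    """Rank general-domain sentences by H_in(s) - H_general(s), lowest first,
    and return the `keep` sentences that look most in-domain."""
    h_in = train_unigram(in_domain)
    h_gen = train_unigram(general)
    ranked = sorted(general, key=lambda s: h_in(s) - h_gen(s))
    return ranked[:keep]
```

In a setting like the one described, `general` would hold Europarl sentences and `in_domain` the social localisation data; the selected subset would then be added to the SMT training data. The cut-off `keep` is a free parameter here and would normally be tuned on held-out in-domain text.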
O'Brien, Sharon and Simard, Michel (eds.)
Proceedings of 4th Workshop on Post-Editing Technology and Practice (WPTP4).
Association for Machine Translation in the Americas (AMTA).
Publisher:
Association for Machine Translation in the Americas (AMTA)
This item is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License.
Funders:
Science Foundation Ireland through the ADAPT Centre (Grant 13/RC/2106) at Dublin City University, Grant 610879 for the Falcon project funded by the European Commission.
ID Code:
23228
Deposited On:
02 May 2019 08:33 by Thomas Murtagh. Last Modified 28 Aug 2020 13:25