Shterionov, Dimitar ORCID: 0000-0001-6300-797X, do Carmo, Félix, Moorkens, Joss ORCID: 0000-0003-0766-0071, Hossari, Murhaf, Wagner, Joachim ORCID: 0000-0002-8290-3849, Paquin, Eric, Schmidtke, Dag, Groves, Declan and Way, Andy ORCID: 0000-0001-5736-5930 (2020) A roadmap to neural automatic post-editing: an empirical approach. Machine Translation (34). pp. 67-96. ISSN 0922-6567
Abstract
In a translation workflow, machine translation (MT) is almost always followed by a human post-editing step, where the raw MT output is corrected to meet required quality standards. To reduce the number of errors human translators need to correct, automatic post-editing (APE) methods have been developed and deployed in such workflows. With the advances in deep learning, neural APE (NPE) systems have outranked more traditional statistical ones. However, the plethora of options, variables and settings, as well as the relation between NPE performance and train/test data, make it difficult to select the most suitable approach for a given use case. In this article, we systematically analyse these different parameters with respect to NPE performance. We build an NPE “roadmap” to trace the different decision points and train a set of systems selecting different options through the roadmap. We also propose a novel approach for APE with data augmentation. We then analyse the performance of 15 of these systems and identify the best ones. In fact, the best systems are the ones that follow the newly proposed method. The work presented in this article follows from a collaborative project between Microsoft and the ADAPT Centre. The data provided by Microsoft originates from phrase-based statistical MT (PBSMT) systems employed in production. All tested NPE systems significantly increase the translation quality, proving the effectiveness of neural post-editing in the context of a commercial translation workflow that leverages PBSMT.
Metadata
Item Type: | Article (Published) |
---|---|
Refereed: | Yes |
Uncontrolled Keywords: | Automatic post-editing; Neural post-editing; Multi-source; Deep learning; Empirical evaluation; Machine Translation |
Subjects: | Computer Science > Machine translating |
DCU Faculties and Centres: | DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; DCU Faculties and Schools > Faculty of Humanities and Social Science > School of Applied Language and Intercultural Studies; Research Institutes and Centres > ADAPT |
Publisher: | Springer |
Official URL: | http://dx.doi.org/10.1007%2Fs10590-020-09249-7 |
Copyright Information: | © 2020 The Authors CC-BY-4.0 (Open Access) |
Funders: | SFI Research Centres Programme (Grant 13/RC/2106) and is co-funded under the European Regional Development, European Union’s Horizon 2020 research and innovation programme, under the EDGE COFUND Marie Skłodowska-Curie Grant Agreement No. 713567, Science Foundation Ireland (SFI) under Grant Number 13/RC/2077 |
ID Code: | 25315 |
Deposited On: | 06 Jan 2021 14:08 by Joss Moorkens. Last Modified 01 Mar 2023 13:33 |
Documents
Full text available as:
PDF (Shterionov et al 2020)
Creative Commons: Attribution 3.0, 943kB