DORAS | DCU Research Repository

APE through neural and statistical MT with augmented data: ADAPT/DCU submission to the WMT 2019 APE Shared task

Shterionov, Dimitar (ORCID: 0000-0001-6300-797X), Wagner, Joachim (ORCID: 0000-0002-8290-3849) and do Carmo, Félix (ORCID: 0000-0003-4193-3854) (2019) APE through neural and statistical MT with augmented data: ADAPT/DCU submission to the WMT 2019 APE Shared task. In: Fourth Conference on Machine Translation (WMT19), 01-02 Aug 2019, Florence, Italy.

Abstract
Automatic post-editing (APE) can be reduced to a machine translation (MT) task, where the source is the output of a specific MT system and the target is its post-edited variant. However, this approach does not consider context information that can be found in the original source of the MT system. Thus, a better approach is to employ multi-source MT, where two input sequences are considered: the original source and the MT output. Extra context information can be introduced in the form of extra tokens that identify certain global properties of a group of segments, added as a prefix or a suffix to each segment. Successfully applied in domain adaptation of MT as well as in APE, this technique deserves further attention. In this work we investigate multi-source neural APE (or NPE) systems with training data which has been augmented with two types of extra context tokens. We experiment with authentic and synthetic data provided by WMT 2019 and submit our results to the APE shared task. We also experiment with using statistical machine translation (SMT) methods for APE. While our systems score below the baseline, we consider this work a step towards understanding the added value of extra context in the case of APE.
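
The token-based augmentation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `add_context_token` and the token string `<grp1>` are hypothetical, and the sketch assumes plain whitespace-tokenised segments. It only shows what attaching one extra token as a prefix or a suffix to each segment of a group looks like.

```python
from typing import Iterable, List


def add_context_token(
    segments: Iterable[str], token: str, position: str = "prefix"
) -> List[str]:
    """Attach one extra context token to every segment in a group.

    `token` is a hypothetical marker (e.g. "<grp1>") standing for some
    global property shared by all segments in the group.
    """
    if position not in ("prefix", "suffix"):
        raise ValueError("position must be 'prefix' or 'suffix'")

    tagged = []
    for segment in segments:
        segment = segment.strip()
        if position == "prefix":
            tagged.append(f"{token} {segment}")
        else:
            tagged.append(f"{segment} {token}")
    return tagged


# Hypothetical usage: tag both inputs of a multi-source example
# (original source and MT output) with the same group token.
sources = ["Click the Save button .", "Open the file menu ."]
mt_outputs = ["Klicken Sie auf Speichern .", "Öffnen Sie das Dateimenü ."]

tagged_sources = add_context_token(sources, "<grp1>")
tagged_mt = add_context_token(mt_outputs, "<grp1>")
```

In a multi-source setting, both input sequences would typically receive the same token so that the model sees the group property on either side; whether the paper tags one or both inputs is not stated in the abstract.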
Metadata
Item Type:Conference or Workshop Item (Paper)
Event Type:Conference
Refereed:Yes
Subjects:Computer Science > Machine translating
DCU Faculties and Centres:DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing
DCU Faculties and Schools > Faculty of Humanities and Social Science > School of Applied Language and Intercultural Studies
Research Institutes and Centres > ADAPT
Published in: Bojar, Ondřej, Chatterjee, Rajen, Federmann, Christian and Fishel, Mark (eds.) Proceedings of the Fourth Conference on Machine Translation (Volume 3: Shared Task Papers, Day 2). Association for Computational Linguistics (ACL).
Publisher:Association for Computational Linguistics (ACL)
Official URL:http://dx.doi.org/10.18653/v1/W19-5415
Copyright Information:© 2019 ACL. CC-BY-4.0
Funders:Science Foundation Ireland (SFI) Research Centres Programme (Grant 13/RC/2106), European Regional Development Fund, European Union's Horizon 2020 under the EDGE COFUND Marie Skłodowska-Curie Grant Agreement no. 713567, Science Foundation Ireland (SFI) under Grant Number 13/RC/2077.
ID Code:23917
Deposited On:06 Dec 2019 09:59 by Joachim Wagner. Last Modified 08 Nov 2021 15:16
Documents

Full text available as:

PDF (Final with proceedings mark and page numbers) - Requires a PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader
208kB