DORAS | DCU Research Repository

Evaluating conjunction disambiguation on English-to-German and French-to-German WMT 2019 translation hypotheses

Popović, Maja (ORCID: 0000-0001-8234-8745) (2019) Evaluating conjunction disambiguation on English-to-German and French-to-German WMT 2019 translation hypotheses. In: 4th Conference on Machine Translation (WMT 2019), 1 - 2 August 2019, Florence, Italy.

Abstract
We present a test set for evaluating an MT system’s capability to translate ambiguous conjunctions depending on the sentence structure. We concentrate on the English conjunction “but” and its French equivalent “mais”, which can be translated into two different German conjunctions. We evaluate all English-to-German and French-to-German submissions to the WMT 2019 shared translation task. The evaluation is done mainly automatically, with additional fast manual inspection of unclear cases. All systems almost perfectly recognise the target conjunction “aber”, whereas accuracies for the other target conjunction “sondern” range from 78% to 97%, and the errors are mostly caused by replacing it with the alternative conjunction “aber”. The best performing system for both language pairs is a multilingual Transformer TartuNLP system trained on all WMT 2019 language pairs which use the Latin script, indicating that the multilingual approach is beneficial for conjunction disambiguation. As for other system features, such as using synthetic back-translated data or being context-aware or hybrid, no particular (dis)advantages can be observed. Qualitative manual inspection of translation hypotheses showed that highly ranked systems generally produce translations with high adequacy and fluency, meaning that these systems do not merely capture the right conjunction while the rest of the translation hypothesis is poor. On the other hand, the low ranked systems generally exhibit lower fluency and poor adequacy.
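As a rough illustration of the mainly automatic evaluation described in the abstract, the following minimal Python sketch checks whether a German hypothesis contains the expected conjunction and flags everything else for manual inspection. The function names `classify` and `summarize`, the three labels, and the whole-word matching rule are assumptions made for this sketch; they are not taken from the paper, whose actual procedure may differ.

```python
import re
from collections import Counter

CONJUNCTIONS = ("aber", "sondern")


def classify(hypothesis: str, expected: str) -> str:
    """Label one hypothesis as 'correct', 'confused', or 'unclear'.

    Assumption for this sketch: a case is 'unclear' (left for manual
    inspection) when neither or both conjunctions appear.
    """
    if expected not in CONJUNCTIONS:
        raise ValueError(f"unexpected conjunction: {expected}")
    # Lowercase and split into word tokens so that only whole words match.
    tokens = set(re.findall(r"\w+", hypothesis.lower()))
    found = [c for c in CONJUNCTIONS if c in tokens]
    if len(found) != 1:
        return "unclear"
    return "correct" if found[0] == expected else "confused"


def summarize(pairs):
    """Count labels over (hypothesis, expected conjunction) pairs."""
    return Counter(classify(hyp, exp) for hyp, exp in pairs)


if __name__ == "__main__":
    # Hypothetical example sentences, not drawn from the test set.
    examples = [
        ("Er ist nicht müde, sondern krank.", "sondern"),
        ("Er ist nicht müde, aber krank.", "sondern"),
        ("Er ist müde, aber glücklich.", "aber"),
        ("Er ist müde, jedoch glücklich.", "aber"),
    ]
    counts = summarize(examples)
    total = sum(counts.values())
    print(dict(counts))
    print("share labelled correct:", counts["correct"] / total)
```

In this sketch the "confused" label corresponds to the error type the abstract reports as most frequent, namely producing “aber” where “sondern” is expected.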
Metadata
Item Type: Conference or Workshop Item (Paper)
Event Type: Conference
Refereed: Yes
Subjects: Computer Science > Machine translating
DCU Faculties and Centres: DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; Research Institutes and Centres > ADAPT
Published in: Proceedings of the Fourth Conference on Machine Translation: Shared Task Papers. 2. Association for Computational Linguistics (ACL).
Publisher: Association for Computational Linguistics (ACL)
Official URL: http://www.aclweb.org/anthology/W19-5353
Copyright Information: 2019 The Author
Use License: This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
Funders: Science Foundation Ireland (SFI) Research Centres Programme (Grant 13/RC/2106), European Regional Development Fund
ID Code: 24481
Deposited On: 25 May 2020 10:37 by Maja Popovic. Last Modified 25 May 2020 14:34
Documents

Full text available as:

conj.pdf (PDF, 90kB)