The quality of machine translation has improved remarkably in recent years, to the
degree that it has been found indistinguishable from professional human translation in
a number of empirical investigations. We reassess Hassan et al.’s 2018 investigation into
Chinese-to-English news translation, showing that the finding of human–machine parity was
due to weaknesses in the evaluation design, which is currently considered best practice in
the field. We show that the professional human translations contained significantly fewer
errors, and that perceived quality in human evaluation depends on the choice of raters, the
availability of linguistic context, and the creation of reference translations. Our results call
for revisiting current best practices for assessing strong machine translation systems in general
and human–machine parity in particular, for which we offer a set of recommendations based
on our empirical findings.