Bojar, Ondřej (ORCID: 0000-0002-0606-0050), Graham, Yvette and Kamran, Amir
(2017)
Results of the WMT17 metrics shared task.
In: Second Conference on Machine Translation (WMT17), 7-8 Sept 2017, Copenhagen, Denmark.
This paper presents the results of the WMT17 Metrics Shared Task. We asked participants of this task to score the outputs of the MT systems involved in the WMT17 news translation task and Neural MT training task. We collected scores of 14 metrics from 8 research groups. In addition, we computed scores of 7 standard metrics (BLEU, SentBLEU, NIST, WER, PER, TER and CDER) as baselines. The collected scores were evaluated in terms of system-level correlation (how well each metric’s scores correlate with the WMT17 official manual ranking of systems) and in terms of segment-level correlation (how often a metric agrees with humans in judging the quality of a particular sentence). This year, we build upon two types of manual judgements: direct assessment (DA) and HUME manual semantic judgements.
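The two evaluation views named in the abstract can be illustrated with a minimal sketch. It is not the task's official scoring code: the choice of Pearson correlation for the system level and of a simple pairwise agreement rate for the segment level is an assumption made here for illustration, the function names `pearson` and `pairwise_agreement` are hypothetical, and all numbers are made up.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores.

    Assumption: used here as one possible system-level correlation.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def pairwise_agreement(metric_scores, human_scores):
    """Fraction of segment pairs that the metric orders the same way as humans.

    Pairs tied in the human scores are skipped; a metric tie counts as a
    disagreement. Assumes at least one pair is not tied in the human scores.
    """
    agree = 0
    total = 0
    for i, j in combinations(range(len(human_scores)), 2):
        human_diff = human_scores[i] - human_scores[j]
        if human_diff == 0:
            continue
        metric_diff = metric_scores[i] - metric_scores[j]
        total += 1
        if human_diff * metric_diff > 0:
            agree += 1
    return agree / total


# Made-up toy data: one score per MT system (system level) ...
human_system = [0.10, 0.25, -0.05, 0.40]
metric_system = [21.0, 24.5, 18.0, 27.5]
print("system-level correlation:", pearson(metric_system, human_system))

# ... and one score per translated sentence (segment level).
human_segment = [70, 55, 90, 40]
metric_segment = [0.61, 0.64, 0.80, 0.35]
print("segment-level agreement:", pairwise_agreement(metric_segment, human_segment))
```

A higher value in either view means the automatic metric tracks the human judgements more closely.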
This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
Funders:
Grants H2020-ICT-2014-1-645442 (QT21), H2020-ICT-2014-1-644402 (HimL), Dutch organization for scientific research STW grant nr. 12271, ADAPT Centre for Digital Content Technology (www.adaptcentre.ie) at Dublin City University funded under the SFI Research Centres Programme (Grant 13/RC/2106) co-funded under the European Regional Development Fund, Charles University Research Programme “Progres” Q18+Q48.
ID Code:
23367
Deposited On:
28 May 2019 10:14 by Thomas Murtagh. Last Modified 28 May 2019 10:14