Capturing lexical variation in MT evaluation using automatically built sense-cluster inventories

Apidianaki, Marianna, He, Yifan and Way, Andy ORCID: 0000-0001-5736-5930 (2009) Capturing lexical variation in MT evaluation using automatically built sense-cluster inventories. In: PACLIC 23 - the 23rd Pacific Asia Conference on Language, Information and Computation, 3-5 December 2009, Hong Kong.

Abstract

Most existing Machine Translation (MT) evaluation metrics are too strict to capture lexical variation in translation. However, a central requirement in MT evaluation is that metrics correlate highly with human judgments of translation quality. To achieve a higher correlation, it is important to identify sense correspondences between the compared translations. Because most metrics look for exact correspondences, their results are often misleading about translation quality. In addition, existing metrics do not permit a conclusive estimate of the impact of integrating Word Sense Disambiguation techniques into MT systems. In this paper, we show how information acquired by an unsupervised semantic analysis method can be used to render MT evaluation more sensitive to lexical semantics. The sense inventories built by this data-driven method are incorporated into METEOR: they replace WordNet for evaluation in English and render METEOR’s synonymy module operable in French. The evaluation results demonstrate that the use of these inventories increases both the number of matches and the correlation with human judgments of translation quality, compared to precision-based metrics.
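
As a rough illustration of why a sense-cluster inventory can raise the number of matches, the sketch below counts unigram matches in two stages: exact matches first, then matches between words that share a sense cluster. It is a simplified toy, not the authors' implementation and not METEOR's actual alignment procedure; the function names, the dictionary layout of the inventory, and the example clusters are all assumptions made for this example.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical inventory layout: each word maps to the sense clusters it belongs to.
SenseInventory = Dict[str, List[FrozenSet[str]]]


def share_cluster(a: str, b: str, inventory: SenseInventory) -> bool:
    """Return True if either word lists the other in one of its sense clusters."""
    return any(b in cluster for cluster in inventory.get(a, [])) or any(
        a in cluster for cluster in inventory.get(b, [])
    )


def match_unigrams(
    candidate: List[str], reference: List[str], inventory: SenseInventory
) -> Tuple[int, int]:
    """Count (exact, synonym) one-to-one unigram matches between two sentences."""
    unmatched_ref = list(reference)
    leftovers = []
    exact = 0
    # Stage 1: exact surface-form matches.
    for word in candidate:
        if word in unmatched_ref:
            unmatched_ref.remove(word)
            exact += 1
        else:
            leftovers.append(word)
    # Stage 2: remaining words may match through a shared sense cluster.
    synonym = 0
    for word in leftovers:
        for ref_word in unmatched_ref:
            if share_cluster(word, ref_word, inventory):
                unmatched_ref.remove(ref_word)
                synonym += 1
                break
    return exact, synonym


# Toy inventory, invented for this example only.
inventory = {
    "car": [frozenset({"car", "automobile"})],
    "fast": [frozenset({"fast", "quick", "rapid"})],
}
candidate = "the car is fast".split()
reference = "the automobile is quick".split()

print(match_unigrams(candidate, reference, {}))         # exact matching only
print(match_unigrams(candidate, reference, inventory))  # with sense clusters
```

With an empty inventory only the two surface-identical words match; with the toy clusters, the two remaining words are also matched, which is the effect on match counts that the abstract describes.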

Metadata

Item Type:	Conference or Workshop Item (Paper)
Event Type:	Conference
Refereed:	Yes
Uncontrolled Keywords:	evaluation; synonymy; sense clustering
Subjects:	Computer Science > Machine translating
DCU Faculties and Centres:	Research Institutes and Centres > Centre for Next Generation Localisation (CNGL); Research Institutes and Centres > National Centre for Language Technology (NCLT); DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing
Official URL:	http://paclic23.ctl.cityu.edu.hk/PACLIC23_index.ht...
Use License:	This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
ID Code:	15176
Deposited On:	15 Feb 2010 14:59 by DORAS Administrator. Last Modified 14 Nov 2018 16:33

Documents

Full text available as:

ApidianakiEtAl_paclic_09.pdf (PDF, 66kB)
