Kazemi, Arefeh, Toral, Antonio ORCID: 0000-0003-2357-2960, Way, Andy ORCID: 0000-0001-5736-5930 and Monadjemi, Amirhassan (2017) Investigating the relationship between classification quality and SMT performance in discriminative reordering models. Entropy, 19 (9). pp. 1-17. ISSN 1099-4300
Abstract
Reordering is one of the most important factors affecting the quality of the output in
statistical machine translation (SMT). A considerable number of the approaches proposed to address
the reordering problem are discriminative reordering models (DRMs). The core component of a
DRM is a classifier that tries to predict the correct word order of the sentence. Unfortunately,
the relationship between classification quality and ultimate SMT performance has not been
investigated to date. Understanding this relationship will allow researchers to select the classifier that
results in the best possible MT quality. It might be assumed that there is a monotonic relationship
between classification quality and SMT performance, i.e., any improvement in classification
performance will be monotonically reflected in overall SMT quality. In this paper, we experimentally
show that this assumption does not always hold, i.e., an improvement in classification performance
might actually degrade the quality of an SMT system, from the point of view of MT automatic
evaluation metrics. However, we show that if the improvement in the classification performance is
high enough, we can expect the SMT quality to improve as well. In addition to this, we show that
there is a negative relationship between classification accuracy and SMT performance in imbalanced
parallel corpora. For these types of corpora, we provide evidence that, for the evaluation of the
classifier, macro-averaged metrics such as macro-averaged F-measure are better suited than accuracy,
the metric commonly used to date.
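The contrast between accuracy and macro-averaged F-measure on imbalanced data can be illustrated with a small sketch. The data below is a toy example invented for illustration, not taken from the paper: the orientation labels `monotone` and `swap`, the 90/10 class split, and the two hypothetical classifiers are all assumptions.

```python
def accuracy(gold, pred):
    """Fraction of instances whose predicted label matches the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1, so each class counts equally."""
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        scores.append(f1)
    return sum(scores) / len(scores)


# Toy, imbalanced gold labels (assumed): 90 "monotone" and 10 "swap" orientations.
gold = ["monotone"] * 90 + ["swap"] * 10

# Hypothetical classifier A: always predicts the majority class.
pred_a = ["monotone"] * 100

# Hypothetical classifier B: recovers 8 of the 10 "swap" cases,
# at the cost of mislabelling 10 "monotone" cases as "swap".
pred_b = ["monotone"] * 80 + ["swap"] * 10 + ["swap"] * 8 + ["monotone"] * 2

for name, pred in (("A", pred_a), ("B", pred_b)):
    print(f"{name}: accuracy={accuracy(gold, pred):.3f} macro-F1={macro_f1(gold, pred):.3f}")
```

On this toy data, accuracy ranks classifier A higher (0.90 against 0.88) even though A never predicts the minority class, while macro-averaged F1 ranks classifier B higher (roughly 0.75 against 0.47). This only shows how the two metrics can disagree on imbalanced labels; it says nothing about the resulting SMT quality, which the paper measures experimentally.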
Metadata
Item Type: | Article (Published) |
---|---|
Refereed: | Yes |
Uncontrolled Keywords: | statistical machine translation; reordering model; classification; performance; correlation; intrinsic evaluation |
Subjects: | Computer Science > Machine translating |
DCU Faculties and Centres: | DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; Research Institutes and Centres > ADAPT |
Official URL: | http://dx.doi.org/10.3390/e19090340 |
Copyright Information: | © 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
Use License: | This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License. |
Funders: | University of Isfahan and Science Foundation Ireland through the CNGL Programme (Grant 12/CE/I2267) in the ADAPT Centre (www.adaptcentre.ie) at Dublin City University; European Union Seventh Framework Programme FP7/2007-2013 under grant agreement PIAP-GA-2012-324414 (Abu-MaTran). The ADAPT Centre for Digital Content Technology at Dublin City University is funded under the Science Foundation Ireland Research Centres Programme (Grant 13/RC/2106) and is co-funded under the European Regional Development Fund. |
ID Code: | 23311 |
Deposited On: | 17 May 2019 13:58 by Thomas Murtagh. Last Modified 08 Feb 2023 13:22 |