Login (DCU Staff Only)
Login (DCU Staff Only)

DORAS | DCU Research Repository

Explore open access research and scholarly works from DCU

Advanced Search

Investigating the relationship between classification quality and SMT performance in discriminative reordering models

Kazemi, Arefeh, Toral, Antonio orcid logoORCID: 0000-0003-2357-2960, Way, Andy orcid logoORCID: 0000-0001-5736-5930 and Monadjemi, Amirhassan (2017) Investigating the relationship between classification quality and SMT performance in discriminative reordering models. Entropy, 19 (9). pp. 1-17. ISSN 1099-4300

Abstract
Reordering is one of the most important factors affecting the quality of the output in statistical machine translation (SMT). A considerable number of approaches that proposed addressing the reordering problem are discriminative reordering models (DRM). The core component of the DRMs is a classifier which tries to predict the correct word order of the sentence. Unfortunately, the relationship between classification quality and ultimate SMT performance has not been investigated to date. Understanding this relationship will allow researchers to select the classifier that results in the best possible MT quality. It might be assumed that there is a monotonic relationship between classification quality and SMT performance, i.e., any improvement in classification performance will be monotonically reflected in overall SMT quality. In this paper, we experimentally show that this assumption does not always hold, i.e., an improvement in classification performance might actually degrade the quality of an SMT system, from the point of view of MT automatic evaluation metrics. However, we show that if the improvement in the classification performance is high enough, we can expect the SMT quality to improve as well. In addition to this, we show that there is a negative relationship between classification accuracy and SMT performance in imbalanced parallel corpora. For these types of corpora, we provide evidence that, for the evaluation of the classifier, macro-averaged metrics such as macro-averaged F-measure are better suited than accuracy, the metric commonly used to date.
Metadata
Item Type:Article (Published)
Refereed:Yes
Uncontrolled Keywords:statistical machine translation; reordering model; classification; performance; correlation; intrinsic evaluation
Subjects:Computer Science > Machine translating
DCU Faculties and Centres:DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing
Research Institutes and Centres > ADAPT
Official URL:http://dx.doi.org/10.3390/e19090340
Copyright Information:© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Use License:This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License. View License
Funders:y University of Isfahan and by Science Foundation Ireland through the CNGL Programme (Grant 12/CE/I2267) in the ADAPT Centre (www.adaptcentre.ie) at Dublin City University, European Union Seventh Framework Programme FP7/2007-2013 under grant agreement PIAP-GA-2012-324414 (Abu-MaTran)., ADAPT Centre for Digital Content Technology at Dublin City University is funded under the Science Foundation Ireland Research Centres Programme (Grant 13/RC/2106) and is co-funded under the European Regional Development Fund.
ID Code:23311
Deposited On:17 May 2019 13:58 by Thomas Murtagh . Last Modified 08 Feb 2023 13:22
Documents

Full text available as:

[thumbnail of Investigating_the_Relationship_between_Classification_Quality_and_SMT_Performance_in_Discriminative_Reordering_Models[1].pdf]
Preview
PDF - Requires a PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader
1MB
Downloads

Downloads

Downloads per month over past year

Archive Staff Only: edit this record