Reordering is one of the most important factors affecting the quality of the output in
statistical machine translation (SMT). Many of the approaches proposed to address
the reordering problem are discriminative reordering models (DRMs). The core component of a
DRM is a classifier that tries to predict the correct word order of the sentence. Unfortunately,
the relationship between classification quality and ultimate SMT performance has not been
investigated to date. Understanding this relationship will allow researchers to select the classifier that
results in the best possible MT quality. It might be assumed that there is a monotonic relationship
between classification quality and SMT performance, i.e., any improvement in classification
performance will be monotonically reflected in overall SMT quality. In this paper, we experimentally
show that this assumption does not always hold, i.e., an improvement in classification performance
might actually degrade the quality of an SMT system, from the point of view of MT automatic
evaluation metrics. However, we show that if the improvement in classification performance is
large enough, we can expect the SMT quality to improve as well. In addition, we show that
there is a negative relationship between classification accuracy and SMT performance in imbalanced
parallel corpora. For these types of corpora, we provide evidence that, for the evaluation of the
classifier, macro-averaged metrics such as macro-averaged F-measure are better suited than accuracy,
the metric commonly used to date.
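
To illustrate why accuracy and macro-averaged F-measure can disagree on imbalanced data, here is a minimal sketch in Python. The two orientation labels ("M" for monotone, "S" for swap), the 90/10 class split, and both sets of predictions are invented for illustration and are not taken from the paper's experiments. Under those assumptions, a classifier that always predicts the majority class gets the higher accuracy, while a classifier that also recovers the minority class gets the higher macro-averaged F-measure.

```python
from collections import Counter


def accuracy(gold, pred):
    """Fraction of instances whose predicted label equals the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 scores over all labels seen."""
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label gains a false positive
            fn[g] += 1  # gold label gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never hit
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical imbalanced reordering data: 90 monotone, 10 swap.
gold = ["M"] * 90 + ["S"] * 10

# Classifier A (hypothetical): always predicts the majority class.
pred_a = ["M"] * 100

# Classifier B (hypothetical): 80 of 90 monotone correct, 8 of 10 swap correct.
pred_b = ["M"] * 80 + ["S"] * 10 + ["S"] * 8 + ["M"] * 2

for name, pred in (("A", pred_a), ("B", pred_b)):
    print(name, round(accuracy(gold, pred), 4), round(macro_f1(gold, pred), 4))
```

In this toy setting, classifier A scores 0.90 accuracy against 0.88 for classifier B, yet its macro-averaged F-measure is about 0.47 against about 0.75 for B, because A never identifies a single swap instance.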
This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
Funders:
y University of Isfahan and by Science Foundation Ireland through the CNGL Programme (Grant 12/CE/I2267) in the ADAPT Centre (www.adaptcentre.ie) at Dublin City University, European Union Seventh Framework Programme FP7/2007-2013 under grant agreement PIAP-GA-2012-324414 (Abu-MaTran); ADAPT Centre for Digital Content Technology at Dublin City University is funded under the Science Foundation Ireland Research Centres Programme (Grant 13/RC/2106) and is co-funded under the European Regional Development Fund.
ID Code:
23311
Deposited On:
17 May 2019 13:58 by Thomas Murtagh. Last Modified 08 Feb 2023 13:22