Pre-reordering for neural machine translation:
helpful or harmful?
Du, Jinhua (ORCID: 0000-0002-3267-4881) and Way, Andy (ORCID: 0000-0001-5736-5930) (2017) Pre-reordering for neural machine translation: helpful or harmful? Prague Bulletin of Mathematical Linguistics (108). pp. 171-181. ISSN 1804-0462
Pre-reordering, a preprocessing step that brings source-side word order closer to that of the target side, has proven very helpful for improving translation quality in statistical machine translation (SMT). However, is this also the case in neural machine translation (NMT)? In this paper, we first investigate the impact of pre-reordered source-side data on NMT, and then propose incorporating features from the SMT pre-reordering model as input factors in NMT (factored NMT). The features, namely part-of-speech (POS) tags, word classes and reordered indices, are encoded as feature vectors and concatenated to the word embeddings to provide extra knowledge for NMT.
for NMT. Pre-reordering experiments conducted on Japanese↔English and Chinese↔English
show that pre-reordering the source-side data for NMT is redundant and NMT models trained
on pre-reordered data deteriorate translation performance. However, factored NMT using
SMT-based pre-reordering features on Japanese→English and Chinese→English is beneficial
and can further improve by 4.48 and 5.89 relative BLEU points, respectively, compared to the
baseline NMT system.
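The factored-input scheme described above, in which each source token's word embedding is concatenated with embeddings of its POS tag, word class and reordered index, can be sketched as below. All table sizes, embedding dimensions and the `factored_input` helper are illustrative assumptions, not the paper's actual configuration:

```python
import numpy as np

# Hypothetical sizes; the paper's real dimensions are not given here.
rng = np.random.default_rng(0)
vocab_size, word_dim = 100, 8      # word embedding table
pos_size, pos_dim = 12, 3          # part-of-speech factor
class_size, class_dim = 20, 3      # word-class factor
index_size, index_dim = 30, 2      # reordered-index factor

word_emb = rng.normal(size=(vocab_size, word_dim))
pos_emb = rng.normal(size=(pos_size, pos_dim))
class_emb = rng.normal(size=(class_size, class_dim))
index_emb = rng.normal(size=(index_size, index_dim))

def factored_input(words, pos_tags, classes, indices):
    """Concatenate word and factor embeddings along the feature axis,
    producing one encoder input vector per source token."""
    return np.concatenate(
        [word_emb[words], pos_emb[pos_tags],
         class_emb[classes], index_emb[indices]],
        axis=-1,
    )

# A toy 4-token source sentence: each token has a word id plus
# one id per factor (POS, word class, reordered index).
x = factored_input(
    words=np.array([5, 17, 42, 3]),
    pos_tags=np.array([1, 4, 4, 7]),
    classes=np.array([2, 9, 9, 0]),
    indices=np.array([0, 2, 1, 3]),
)
print(x.shape)  # (4, 16): word_dim + pos_dim + class_dim + index_dim
```

Under this scheme the encoder sees a 16-dimensional vector per token instead of the 8-dimensional word embedding alone, so the factor information is available without changing the rest of the architecture.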