Feature decay algorithms for neural machine translation
Poncelas, Alberto (ORCID: 0000-0002-5089-1687), Maillette de Buy Wenniger, Gideon and Way, Andy (ORCID: 0000-0001-5736-5930)
(2018)
Feature decay algorithms for neural machine translation.
In: 21st Annual Conference of The European Association for Machine Translation, 28-30 May 2018, Alicante, Spain.
Neural Machine Translation (NMT) systems require large amounts of data to be competitive. For this reason, data selection techniques are used only for fine-tuning systems that have already been trained on larger amounts of data. In this work we aim to use Feature Decay Algorithms (FDA) data selection techniques not only to fine-tune a system but also to build a complete system with less data. Our findings reveal that it is possible to find a subset of sentence pairs that, when used for training a German-English NMT system, outperforms the full training corpus by 1.11 BLEU points.