Selecting artificially-generated sentences for fine-tuning neural machine translation
Poncelas, Alberto (ORCID: 0000-0002-5089-1687) and Way, Andy (ORCID: 0000-0001-5736-5930)
(2019)
Selecting artificially-generated sentences for fine-tuning neural machine translation.
In: 12th International Conference on Natural Language Generation, 29 Oct - 1 Nov 2019, Tokyo, Japan.
Neural Machine Translation (NMT) models tend to achieve their best performance when larger sets of parallel sentences are provided for training. For this reason, augmenting the training set with artificially-generated sentence pairs can boost performance.
Nonetheless, performance can also be improved with a small number of sentences if they are in the same domain as the test set. Accordingly, we explore the use of artificially-generated sentences along with data-selection algorithms to improve German-to-English NMT models trained solely with authentic data.
In this work, we show how artificially-generated sentences can be more beneficial than authentic pairs, and demonstrate their advantages when used in combination with data-selection algorithms.