DORAS
DCU Online Research Access Service
Selecting artificially-generated sentences for fine-tuning neural machine translation

Poncelas, Alberto ORCID: 0000-0002-5089-1687 and Way, Andy ORCID: 0000-0001-5736-5930 (2019) Selecting artificially-generated sentences for fine-tuning neural machine translation. In: 12th International Conference on Natural Language Generation, 29 Oct - 1 Nov 2019, Tokyo, Japan.

Abstract

Neural Machine Translation (NMT) models tend to achieve best performance when larger sets of parallel sentences are provided for training. For this reason, augmenting the training set with artificially-generated sentence pairs can boost performance. Nonetheless, the performance can also be improved with a small number of sentences if they are in the same domain as the test set. Accordingly, we want to explore the use of artificially-generated sentences along with data-selection algorithms to improve German-to-English NMT models trained solely with authentic data. In this work, we show how artificially-generated sentences can be more beneficial than authentic pairs, and demonstrate their advantages when used in combination with data-selection algorithms.

Item Type: Conference or Workshop Item (Paper)
Event Type: Conference
Refereed: Yes
Uncontrolled Keywords: Backtranslation
Subjects: Computer Science > Computational linguistics; Computer Science > Machine translating
DCU Faculties and Centres: DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; Research Initiatives and Centres > ADAPT
Published in: Proceedings of 12th International Conference on Natural Language Generation.
Official URL: https://www.inlg2019.com/assets/papers/197_Paper.pdf
Copyright Information: © 2019 The authors
Use License: This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
Funders: SFI Research Centres Programme (Grant 13/RC/2106)
ID Code: 23903
Deposited On: 05 Nov 2019 09:47 by Andrew Way. Last Modified: 22 Jan 2021 14:21
