Gupta, Kamal Kumar, Haque, Rejwanul (ORCID: 0000-0003-1680-0099), Ekbal, Asif, Bhattacharyya, Pushpak and Way, Andy (ORCID: 0000-0001-5736-5930)
(2020)
Syntax-informed interactive neural machine translation.
In: The International Joint Conference on Neural Networks (IJCNN), 19-24 July 2020, Glasgow, UK (Online).
In interactive machine translation (MT), human translators correct errors in automatic translations in collaboration with the MT system, which is an effective way to improve translation productivity. Phrase-based statistical MT (PB-SMT) has been the mainstream approach to MT for the past 30 years, both in academia and industry. Neural MT (NMT), an end-to-end learning approach to MT, represents the current state of the art in MT research. Recent studies on interactive MT have indicated that NMT can significantly outperform PB-SMT.
In this work, we first investigate the possibility of integrating lexical syntactic descriptions, in the form of supertags, into the state-of-the-art NMT model, Transformer. We then explore whether integrating supertags into Transformer can reduce human effort in translation on an interactive-predictive platform. We found that our syntax-aware interactive NMT (INMT) framework significantly reduces simulated human effort in the French-to-English and Hindi-to-English translation tasks. In terms of word prediction accuracy (WPA), it achieves a 2.65-point absolute (5.65% relative) improvement and a 6.55-point absolute (19.1% relative) improvement, respectively, over the corresponding baselines.
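The absolute and relative WPA improvements quoted in the abstract are linked by simple arithmetic: the relative gain is the absolute gain divided by the baseline score. The sketch below is illustrative only. The function names are hypothetical, and the baseline values it derives are approximations implied by the rounded figures in the abstract, not numbers reported in this record.

```python
def implied_baseline(absolute_gain: float, relative_gain_pct: float) -> float:
    """Baseline score implied by an absolute gain (points) and a relative gain (%)."""
    return absolute_gain / (relative_gain_pct / 100.0)


def relative_gain_pct(baseline: float, absolute_gain: float) -> float:
    """Relative gain (%) of an absolute gain (points) over a baseline score."""
    return 100.0 * absolute_gain / baseline


# (absolute gain in WPA points, relative gain in %) as quoted in the abstract.
reported = {
    "French-to-English": (2.65, 5.65),
    "Hindi-to-English": (6.55, 19.1),
}

for task, (abs_gain, rel_gain) in reported.items():
    # Approximate only: both inputs are rounded in the abstract.
    base = implied_baseline(abs_gain, rel_gain)
    print(f"{task}: implied baseline WPA ~ {base:.2f}, "
          f"implied syntax-aware WPA ~ {base + abs_gain:.2f}")
```

The larger relative gain for Hindi-to-English reflects both a larger absolute gain and a lower implied baseline.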
This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
Funders:
Science Foundation Ireland (SFI) Research Centres Programme (Grant No. 13/RC/2106), European Regional Development Fund, Young Faculty Research Fellowship (YFRF), supported by Visvesvaraya PhD scheme for Electronics and IT, Ministry of Electronics and Information Technology (MeitY), Government of India, being implemented by Digital India Corporation (formerly MediaLab Asia)
ID Code:
24560
Deposited On:
22 Jul 2020 12:52 by Rejwanul Haque. Last Modified 07 Jan 2022 16:51