Kumari, Divya, Ekbal, Asif ORCID: 0000-0003-3612-8834, Haque, Rejwanul ORCID: 0000-0003-1680-0099, Bhattacharyya, Pushpak ORCID: 0000-0001-5319-5508 and Way, Andy ORCID: 0000-0001-5736-5930 (2021) Reinforced NMT for sentiment and content preservation in low-resource scenario. ACM Transactions on Asian and Low-Resource Language Information Processing, 20 (4). ISSN 2375-4699
Abstract
The preservation of domain knowledge from the source to the target is crucial in any translation workflow. Hence, translation service providers that use machine translation (MT) in production could reasonably expect the translation process to transfer both the underlying pragmatics and the semantics of the source-side sentences into the target language. However, recent studies suggest that MT systems often fail to preserve in the target such crucial information (e.g., sentiment, emotion, gender traits) embedded in the source text. In this context, raw automatic translations are often fed directly to other natural language processing (NLP) applications (e.g., a sentiment classifier) in a cross-lingual platform. Hence, the loss of such crucial information during translation could negatively affect the performance of downstream NLP tasks that rely heavily on the output of the MT systems. In our current research, we carefully balance both sides (i.e., sentiment and semantics) during translation by controlling a global-attention-based neural MT (NMT) system to generate translations that encode the underlying sentiment of a source sentence while preserving its non-opinionated semantic content. Toward this, we use a state-of-the-art reinforcement learning method, namely, actor-critic, that includes a novel reward combination module, to fine-tune the NMT system so that it learns to generate translations that are best suited for a downstream task, viz. sentiment classification, while ensuring that the source-side semantics remain intact in the process. Experimental results for the Hindi–English language pair show that our proposed method significantly improves the performance of the sentiment classifier and also yields an improved NMT system.
Metadata
Item Type: | Article (Published) |
---|---|
Refereed: | Yes |
Additional Information: | Article Number: 70 |
Uncontrolled Keywords: | neural machine translation; sentiment preservation; actor-critic; reinforcement learning; BERT |
Subjects: | Computer Science > Machine translating |
DCU Faculties and Centres: | DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; Research Institutes and Centres > ADAPT |
Publisher: | Association for Computing Machinery (ACM) |
Official URL: | https://dx.doi.org/10.1145/3450970 |
Copyright Information: | © 2021 The Authors. |
ID Code: | 27449 |
Deposited On: | 28 Jul 2022 16:15 by Thomas Murtagh. Last Modified 11 May 2023 13:43 |
Documents
Full text available as:
PDF (695kB), Creative Commons: Attribution 4.0