Improving character-based decoding using target-side morphological
information for neural machine translation
Passban, Peyman, Liu, Qun (ORCID: 0000-0002-7000-1792) and Way, Andy (ORCID: 0000-0001-5736-5930)
(2018)
In: 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2018), 1-6 June 2018, New Orleans, LA, USA.
Recently, neural machine translation (NMT) has emerged as a powerful alternative to conventional statistical approaches. However, its performance drops considerably when translating morphologically rich languages (MRLs). Neural engines usually fail to cope with the large vocabularies and high out-of-vocabulary (OOV) word rates of MRLs, so existing word-based models are not well suited to translating these languages. In this paper, we propose an extension to the state-of-the-art model of Chung et al. (2016), which works at the character level; the extension boosts the decoder with target-side morphological information. In our architecture, an additional morphology table is plugged into the model. Each time the decoder samples from the target vocabulary, the table sends auxiliary signals from the most relevant affixes in order to enrich the decoder’s current state and constrain it to provide better predictions. We evaluated our model on translation from English into three MRLs (German, Russian, and Turkish) and observed significant improvements.
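The abstract does not say how the auxiliary signal is computed, so the following is only a rough sketch of the general idea, not the authors' implementation. It assumes (hypothetically) that each affix has an embedding in a table, that relevance is scored by a dot product with the decoder state, and that the softmax-weighted sum of affix embeddings is added to the state. The names `morphology_signal`, `enrich_state`, and `AFFIX_TABLE`, and the toy three-dimensional vectors, are all illustrative.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def morphology_signal(decoder_state, affix_table):
    """Score every affix against the decoder state (assumed dot-product
    relevance) and return the per-affix weights and the weighted sum of
    affix embeddings, which plays the role of the auxiliary signal."""
    affixes = list(affix_table)
    scores = [
        sum(h * e for h, e in zip(decoder_state, affix_table[a]))
        for a in affixes
    ]
    weights = softmax(scores)
    dim = len(decoder_state)
    signal = [
        sum(w * affix_table[a][i] for w, a in zip(weights, affixes))
        for i in range(dim)
    ]
    return dict(zip(affixes, weights)), signal


def enrich_state(decoder_state, affix_table):
    """Add the auxiliary signal to the current decoder state (assumed
    additive combination; the paper may combine them differently)."""
    weights, signal = morphology_signal(decoder_state, affix_table)
    enriched = [h + s for h, s in zip(decoder_state, signal)]
    return enriched, weights


# Toy table: three Turkish suffixes with made-up embeddings.
AFFIX_TABLE = {
    "-lar": [1.0, 0.0, 0.0],
    "-ler": [0.0, 1.0, 0.0],
    "-de": [0.0, 0.0, 1.0],
}

state = [2.0, 0.1, 0.1]
enriched, weights = enrich_state(state, AFFIX_TABLE)
most_relevant = max(weights, key=weights.get)
```

In this toy setup the decoder state points mostly along the first dimension, so the first suffix receives the largest weight and dominates the signal added to the state. In a trained system the table and the combination function would be learned jointly with the decoder.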
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). Association for Computational Linguistics.
This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
Funders: ADAPT Centre for Digital Content Technology, which is funded under the SFI Research Centres Programme (Grant 13/RC/2106) and is co-funded under the European Regional Development Fund.
ID Code: 23347
Deposited On: 22 May 2019 15:09 by Thomas Murtagh. Last Modified: 22 May 2019 15:09