Wang, Longyue (ORCID: 0000-0002-9062-6183), Tu, Zhaopeng, Zhang, Xiaojun (ORCID: 0000-0003-3514-1981), Li, Hang, Way, Andy (ORCID: 0000-0001-5736-5930) and Liu, Qun (ORCID: 0000-0002-7000-1792)
(2016)
A novel approach to dropped pronoun translation.
In: 15th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT 2016), 12-17 June 2016, San Diego, CA, USA.
Dropped pronouns (DPs), in which pronouns are frequently dropped in the source language but should be retained in the target language, are a challenge in machine translation. In response to this problem, we propose a semi-supervised approach to recall possibly missing pronouns in the translation. Firstly, we build training data for DP generation in which the DPs are automatically labelled according to the alignment information from a parallel corpus. Secondly, we build a deep learning-based DP generator for input sentences in decoding when no corresponding references exist. More specifically, the generation is two-phase: (1) DP position detection, which is modeled as a sequential labelling task with recurrent neural networks; and (2) DP prediction, which employs a multilayer perceptron with rich features. Finally, we integrate the above outputs into our translation system to recall missing pronouns by both extracting rules from the DP-labelled training data and translating the DP-generated input sentences. Experimental results show that our approach achieves a significant improvement of 1.58 BLEU points in translation performance, with a 66% F-score for DP generation.
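The first step, labelling DPs from alignment information, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `label_dropped_pronouns`, the pronoun list, the `<DP:...>` tag format, and the placement heuristic (insert before the source token aligned to the next aligned target token) are all assumptions made for illustration only.

```python
from typing import List, Set, Tuple

# Hypothetical, deliberately small list of target-side pronouns.
PRONOUNS: Set[str] = {"i", "you", "he", "she", "it", "we", "they",
                      "me", "him", "her", "us", "them"}


def label_dropped_pronouns(
    src: List[str],
    tgt: List[str],
    alignment: List[Tuple[int, int]],
) -> List[str]:
    """Insert <DP:pronoun> tags into the source sentence.

    alignment holds (source_index, target_index) links. A target pronoun
    with no link is assumed to correspond to a dropped source pronoun.
    """
    # Map each aligned target index to a source index
    # (assumption: if a target token has several links, the last one wins).
    tgt_to_src = {t: s for s, t in alignment}

    insertions = []  # (source position, target index, pronoun)
    for j, word in enumerate(tgt):
        if word.lower() not in PRONOUNS or j in tgt_to_src:
            continue
        # Heuristic: place the tag before the source token aligned to the
        # next aligned target token; fall back to the end of the sentence.
        pos = len(src)
        for k in range(j + 1, len(tgt)):
            if k in tgt_to_src:
                pos = tgt_to_src[k]
                break
        insertions.append((pos, j, word.lower()))

    labelled = list(src)
    # Insert from right to left so earlier positions stay valid.
    for pos, _, pron in sorted(insertions, reverse=True):
        labelled.insert(pos, f"<DP:{pron}>")
    return labelled


# Toy example: the Chinese source omits the subject "you".
src = ["喜欢", "这", "份", "工作", "吗", "？"]
tgt = ["Do", "you", "like", "this", "job", "?"]
alignment = [(0, 2), (1, 3), (3, 4), (5, 5)]
labelled = label_dropped_pronouns(src, tgt, alignment)
```

In this toy case the unaligned target pronoun "you" is projected to the position before "喜欢", giving a DP-labelled source sentence of the kind that could serve as training data for the position-detection and prediction models described above.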
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.
Association for Computational Linguistics.