A discriminative latent variable-based "DE" classifier
for Chinese–English SMT
Du, Jinhua (ORCID: 0000-0002-3267-4881) and Way, Andy (ORCID: 0000-0001-5736-5930)
(2010)
A discriminative latent variable-based "DE" classifier
for Chinese–English SMT.
In: COLING 2010 - 23rd International Conference on Computational Linguistics, 23-27 August 2010, Beijing, China.
Syntactic reordering on the source-side
is an effective way of handling word order
differences. The 的 (DE) construction
is a flexible and ubiquitous syntactic
structure in Chinese and a major
source of translation errors.
In this paper, we propose a new classifier
model, the discriminative probabilistic latent
variable model (DPLVM), to classify the
DE construction, improving classification
accuracy and hence translation
quality. We also propose a new feature
that can, to a certain extent, automatically
learn the reordering rules. The experimental
results show that the MT systems
using the data reordered by our proposed
model outperform the baseline systems
by 6.42% and 3.08% relative
in terms of BLEU score on phrase-based SMT (PB-SMT)
and hierarchical phrase-based MT, respectively.
In addition, we analyse the impact
of DE annotation on word alignment and
on the SMT phrase table.
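
The gains above are reported as relative improvements, i.e. the BLEU difference divided by the baseline BLEU. The following is a minimal sketch of that arithmetic; the function name and the two scores are hypothetical values chosen purely for illustration and are not taken from the paper.

```python
def relative_improvement(baseline: float, system: float) -> float:
    """Return the gain of `system` over `baseline` as a percentage of `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    return (system - baseline) / baseline * 100.0


# Hypothetical BLEU scores for illustration only (not results from the paper).
baseline_bleu = 25.00
reordered_bleu = 26.00

gain = relative_improvement(baseline_bleu, reordered_bleu)
print(f"{gain:.2f}% relative")
```

Under these assumed scores, an absolute gain of 1.00 BLEU point corresponds to a 4.00% relative improvement, which is why relative and absolute figures should not be compared directly.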