Login (DCU Staff Only)
Login (DCU Staff Only)

DORAS | DCU Research Repository

Explore open access research and scholarly works from DCU

Advanced Search

A discriminative latent variable-based "DE" classifier for Chinese–English SMT

Du, Jinhua orcid logoORCID: 0000-0002-3267-4881 and Way, Andy orcid logoORCID: 0000-0001-5736-5930 (2010) A discriminative latent variable-based "DE" classifier for Chinese–English SMT. In: COLING 2010 - 23rd International Conference on Computational Linguistics, 23-27 August 2010, Beijing, China.

Abstract
Syntactic reordering on the source-side is an effective way of handling word order differences. The (DE) construction is a flexible and ubiquitous syntactic structure in Chinese which is a major source of error in translation quality. In this paper, we propose a new classifier model — discriminative latent variable model (DPLVM) — to classify the DE construction to improve the accuracy of the classification and hence the translation quality. We also propose a new feature which can automatically learn the reordering rules to a certain extent. The experimental results show that the MT systems using the data reordered by our proposed model outperform the baseline systems by 6.42% and 3.08% relative points in terms of the BLEU score on PB-SMT and hierarchical phrase-based MT respectively. In addition, we analyse the impact of DE annotation on word alignment and on the SMT phrase table.
Metadata
Item Type:Conference or Workshop Item (Paper)
Event Type:Conference
Refereed:Yes
Subjects:Computer Science > Machine translating
DCU Faculties and Centres:Research Institutes and Centres > Centre for Next Generation Localisation (CNGL)
DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing
Published in: Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010). . Association for Computational Linguistics.
Publisher:Association for Computational Linguistics
Official URL:http://www.aclweb.org/anthology/C/C10/C10-1033.pdf
Copyright Information:© 2010 Association for Computational Linguistics
Use License:This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License. View License
Funders:Science Foundation Ireland
ID Code:15798
Deposited On:10 Nov 2010 14:33 by Shane Harper . Last Modified 09 Nov 2018 15:34
Documents

Full text available as:

[thumbnail of A_Discriminative_Latent_Variable-Based_“DE”_Classifier.pdf]
Preview
PDF - Requires a PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader
273kB
Downloads

Downloads

Downloads per month over past year

Archive Staff Only: edit this record