DORAS | DCU Research Repository

Incorporating source-language paraphrases into phrase-based SMT with confusion networks

Jiang, Jie, Du, Jinhua (ORCID: 0000-0002-3267-4881) and Way, Andy (ORCID: 0000-0001-5736-5930) (2011) Incorporating source-language paraphrases into phrase-based SMT with confusion networks. In: SSST-2011: The Fifth Workshop on Syntax and Structure in Statistical Translation, 23 June 2011, Portland, Oregon, USA.

Abstract
To increase the model coverage, source-language paraphrases have been utilized to boost SMT system performance. Previous work showed that word lattices constructed from paraphrases are able to reduce out-of-vocabulary words and to express inputs in different ways for better translation quality. However, such a word-lattice-based method suffers from two problems: 1) path duplications in word lattices decrease the capacities for potential paraphrases; 2) lattice decoding in SMT dramatically increases the search space and results in poor time efficiency. Therefore, in this paper, we adopt word confusion networks as the input structure to carry source-language paraphrase information. Similar to previous work, we use word lattices to build word confusion networks for merging of duplicated paths and faster decoding. Experiments are carried out on small-, medium- and large-scale English–Chinese translation tasks, and we show that compared with the word-lattice-based method, the decoding time on three tasks is reduced significantly (up to 79%) while comparable translation quality is obtained on the large-scale task.
Metadata
Item Type: Conference or Workshop Item (Paper)
Event Type: Workshop
Refereed: Yes
Uncontrolled Keywords: chinese translation; statistical machine translation; SMT
Subjects: Computer Science > Machine translating
DCU Faculties and Centres: Research Institutes and Centres > Centre for Next Generation Localisation (CNGL)
DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing
Published in: Proceedings of Fifth Workshop on Syntax, Semantics and Structure in Statistical Translation. Association for Computational Linguistics.
Publisher: Association for Computational Linguistics
Official URL: http://aclweb.org/anthology/W/W11/W11-1004.pdf
Copyright Information: © 2011 ACL
Use License: This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
ID Code: 16434
Deposited On: 22 Jul 2011 10:46 by Shane Harper. Last Modified 18 Jan 2023 12:37
Documents

Full text available as:

PDF - Requires a PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader
1MB