DORAS | DCU Research Repository

Explore open access research and scholarly works from DCU

Exploiting alignment techniques in MATREX: the DCU machine translation system for IWSLT 2008

Ma, Yanjun, Tinsley, John, Hassan, Hany, Du, Jinhua (ORCID: 0000-0002-3267-4881) and Way, Andy (ORCID: 0000-0001-5736-5930) (2008) Exploiting alignment techniques in MATREX: the DCU machine translation system for IWSLT 2008. In: IWSLT 2008 - International Workshop on Spoken Language Translation, 20-21 October 2008, Hawaii, USA.

Abstract
In this paper, we describe the machine translation (MT) system developed at DCU that was used for our third participation in the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT 2008). In this participation, we focus on various techniques for word and phrase alignment to improve system quality. Specifically, we try out our word packing and syntax-enhanced word alignment techniques for the Chinese–English task and for the English–Chinese task for the first time. For all translation tasks except Arabic–English, we exploit linguistically motivated bilingual phrase pairs extracted from parallel treebanks. We smooth our translation tables with out-of-domain word translations for the Arabic–English and Chinese–English tasks to address the high number of out-of-vocabulary items. We also carry out experiments combining in-domain and out-of-domain data to improve system performance and, finally, we deploy a majority voting procedure that combines a language-model-based method and a translation-based method for case and punctuation restoration. We participated in all the translation tasks and translated both the single-best ASR hypotheses and the correct recognition results. The translation results confirm that our new word and phrase alignment techniques are often helpful in improving translation quality, and that the data combination method we propose can significantly improve system performance.
Metadata
Item Type: Conference or Workshop Item (Paper)
Event Type: Workshop
Refereed: Yes
Subjects: Computer Science > Machine translating
DCU Faculties and Centres: Research Institutes and Centres > Centre for Next Generation Localisation (CNGL)
Research Institutes and Centres > National Centre for Language Technology (NCLT)
DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing
Official URL: http://www.mt-archive.info/IWSLT-2008-TOC.htm
Use License: This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
Funders: Science Foundation Ireland, SFI 05/RF/CMS064, SFI 05/IN/1732, SFI 07/CE/I1142
ID Code: 15198
Deposited On: 16 Feb 2010 16:19 by DORAS Administrator. Last Modified: 25 Jan 2019 10:20
Documents

Full text available as:

MaEtAl_iwslt_08.pdf (PDF, 153kB)