Robust large-scale EBMT with marker-based segmentation

Gough, Nano and Way, Andy (2004) Robust large-scale EBMT with marker-based segmentation. In: TMI 2004 - 10th International Conference on Theoretical and Methodological Issues in Machine Translation, 4-6 October 2004, Baltimore, Maryland, USA.


Previous work on marker-based EBMT [Gough & Way, 2003; Way & Gough, 2004] suffered from problems such as data sparseness and disparity between the training and test data. We have developed a large-scale, robust EBMT system. In a comparison with the systems listed in [Somers, 2003], ours is the third-largest EBMT system and certainly the largest English-French EBMT system. Previous work used the online MT system Logomedia to translate source-language material as a means of populating the system’s database where bitexts were unavailable. Here, we instead derive our sententially aligned strings from a Sun Translation Memory (TM) and limit the integration of Logomedia to the derivation of our word-level lexicon. We also use Logomedia to provide a baseline comparison for our system and observe that we outperform Logomedia and previous marker-based EBMT systems in a number of tests.

Item Type: Conference or Workshop Item (Paper)
Event Type: Conference
Uncontrolled Keywords: example-based machine translation
Subjects: Computer Science > Machine translating
DCU Faculties and Centres: Research Initiatives and Centres > National Centre for Language Technology (NCLT)
DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing
Use License: This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
ID Code: 15305
Deposited On: 15 Mar 2010 11:46 by DORAS Administrator. Last Modified 28 Apr 2010 11:37
