Armstrong, Stephen (2007) Using EBMT to produce foreign language subtitles. Master of Science thesis, Dublin City University.
Abstract
Limited budgets and ever-shrinking timeframes for producing foreign language subtitles have put subtitle companies under unprecedented pressure. Although translation technologies are ubiquitous in other areas of translation, especially localisation, and have been helping translators work more efficiently for a number of years (Lagoudaki, 2006), subtitle companies have been surprisingly slow to adopt them. Recent research from both academia and industry (O'Hagan, 2003; Carroll, 2004; Gambier, 2005) suggests that advances in natural language processing and machine translation could go a long way towards alleviating some of this pressure.
In this thesis, we set out to establish how example-based machine translation (EBMT) can be used to speed up the subtitling process, thus improving the throughput of the subtitler, and also as a means of automatically producing foreign language subtitles which subtitle companies may not normally provide, even though they would be extremely helpful for the viewing public.
Through the development of the modular corpus-based MT engine, MaTrEx (Stroppa et al., 2006), and the collection of a large amount of subtitle data extracted from over 50 full-length features (Armstrong et al., 2006a), we were able to apply a number of EBMT techniques to produce subtitles for the language directions German-English and English-German. These machine-produced subtitles were evaluated using both well-established automatic metrics common to machine translation and some novel manual evaluation strategies. Both the automatic metrics and the human evaluation were very useful during development, where they allowed us to isolate and fix errors made by our system. In addition, by obtaining a human perspective on the subtitles produced by our system, we were able to gauge their acceptability for public viewing, and have provided a solid grounding for future research into the acceptability of (semi-)automatically generated subtitles.
Metadata
Item Type: | Thesis (Master of Science) |
---|---|
Date of Award: | 2007 |
Refereed: | No |
Supervisor(s): | Way, Andy |
Uncontrolled Keywords: | example based machine translating; EBMT; natural language processing; subtitling; films; machine produced subtitles |
Subjects: | Computer Science > Machine translating |
DCU Faculties and Centres: | DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing |
Use License: | This item is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 3.0 License. |
ID Code: | 17019 |
Deposited On: | 15 May 2012 14:45 by Fran Callaghan. Last Modified 19 Jul 2018 14:55 |
Documents
Full text available as:
PDF (5MB)