Using EBMT to produce foreign language subtitles

Armstrong, Stephen

Armstrong, Stephen (2007) Using EBMT to produce foreign language subtitles. Master of Science thesis, Dublin City University.

Abstract

Due to limited budgets and an ever-diminishing timeframe for the production of foreign language subtitles, pressure on subtitle companies is at an all-time high. Although translation technologies are ubiquitous in other areas of translation, especially localisation, and have been helping translators work more efficiently for a number of years now (Lagoudaki, 2006), it is strange to note that subtitle companies have been slower to jump on the bandwagon. Recent research from both academia and the industry (O'Hagan, 2003; Carroll, 2004; Gambier, 2005) suggests that the inroads made in natural language processing and machine translation could go a long way to alleviating some of this pressure. In this thesis, we set out to establish how example-based machine translation (EBMT) can be used to speed up the subtitling process, thus improving the throughput of the subtitler, and also as a means of automatically producing foreign language subtitles which subtitle companies may not normally provide, even though they would be extremely helpful for the viewing public. Through the development of the modular corpus-based MT engine, MaTrEx (Stroppa et al., 2006), and the collection of a large amount of subtitle data extracted from over 50 full-length features (Armstrong et al., 2006a), we were able to apply a number of EBMT techniques to produce subtitles for the language directions German-English and English-German. These machine-produced subtitles were evaluated using both a range of well-established automatic metrics common to machine translation and some novel manual evaluation strategies. Both the automatic metrics and the human evaluation were very useful in the developmental process, where we were able to isolate and fix errors made by our system.
In addition, through obtaining a human's perspective on the subtitles produced by our system, we were able to gauge the acceptability of these subtitles for public viewing, and have provided a solid grounding for future research into the acceptability of (semi-)automatically generated subtitles.

Metadata

Item Type:	Thesis (Master of Science)
Date of Award:	2007
Refereed:	No
Supervisor(s):	Way, Andy
Uncontrolled Keywords:	example based machine translating; EBMT; natural language processing; subtitling; films; machine produced subtitles
Subjects:	Computer Science > Machine translating
DCU Faculties and Centres:	DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing
Use License:	This item is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 3.0 License.
ID Code:	17019
Deposited On:	15 May 2012 14:45. Last Modified: 19 Jul 2018 14:55

Documents

Full text available as:

PDF - Requires a PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader
5MB

DORAS | DCU Research Repository