Gap between theory and practice: noise sensitive word alignment in machine translation

Okita, Tsuyoshi; Graham, Yvette; Way, Andy

Okita, Tsuyoshi, Graham, Yvette and Way, Andy ORCID: 0000-0001-5736-5930 (2010) Gap between theory and practice: noise sensitive word alignment in machine translation. In: WAPA 2010 - First Workshop on Applications of Pattern Analysis, 1-3 September 2010, Windsor, UK.


Abstract

Word alignment estimates a lexical translation probability p(e|f), or a correspondence g(e, f), where the function g outputs either 0 or 1, between a source word f and a target word e in given bilingual sentences. In practice, this formulation does not account for ‘noise’ (or outliers), which may cause problems depending on the corpus. N-to-m mapping objects, such as paraphrases, non-literal translations, and multiword expressions, may appear both as noise and as valid training data. From this perspective, this paper tries to answer two questions: 1) how to detect stable patterns where noise seems legitimate, and 2) how to reduce such noise, where applicable, by supplying extra information as prior knowledge to a word aligner.
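To make the two quantities in the abstract concrete, the sketch below estimates p(e|f) with a few EM iterations in the style of IBM Model 1, then derives a 0/1 correspondence g(e, f) by linking each target word to its most probable source word. This is a standard textbook baseline and not necessarily the aligner used in the paper. The function names `train_ibm1` and `hard_alignment`, the toy German-English corpus, and the omission of a NULL source word are all assumptions made for illustration.

```python
from collections import defaultdict


def train_ibm1(pairs, iterations=10):
    """Estimate t[(e, f)] ~ p(e|f) from (source_tokens, target_tokens) pairs.

    Simplified IBM Model 1 style EM: no NULL source word, uniform start.
    """
    tgt_vocab = {e for _, es in pairs for e in es}
    init = 1.0 / len(tgt_vocab)
    t = defaultdict(float)
    # Initialise every co-occurring (e, f) pair uniformly.
    for fs, es in pairs:
        for e in es:
            for f in fs:
                t[(e, f)] = init

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of e aligned to f
        total = defaultdict(float)  # expected count of f aligned to anything
        # E-step: distribute each target word over the source words.
        for fs, es in pairs:
            for e in es:
                z = sum(t[(e, f)] for f in fs)
                for f in fs:
                    c = t[(e, f)] / z
                    count[(e, f)] += c
                    total[f] += c
        # M-step: renormalise so that p(.|f) sums to 1 for each f.
        for (e, f), c in count.items():
            t[(e, f)] = c / total[f]
    return t


def hard_alignment(t, fs, es):
    """Return g as a dict {(e, f): 0 or 1}, one link per target word."""
    g = {}
    for e in es:
        best = max(fs, key=lambda f: t[(e, f)])
        for f in fs:
            g[(e, f)] = int(f == best)
    return g


# Hypothetical toy corpus: (source sentence f, target sentence e).
corpus = [
    (["das", "Haus"], ["the", "house"]),
    (["das", "Buch"], ["the", "book"]),
    (["ein", "Buch"], ["a", "book"]),
]
t = train_ibm1(corpus)
g = hard_alignment(t, ["das", "Haus"], ["the", "house"])
```

Nothing in this formulation distinguishes a literal sentence pair from a paraphrase or a multiword expression: every pair contributes expected counts in the same way, which is the sensitivity to noise that the abstract refers to.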

Metadata

Item Type:	Conference or Workshop Item (Paper)
Event Type:	Workshop
Refereed:	Yes
Subjects:	Computer Science > Machine translating
DCU Faculties and Centres:	Research Institutes and Centres > Centre for Next Generation Localisation (CNGL); Research Institutes and Centres > National Centre for Language Technology (NCLT)
Published in:	Workshop on Applications of Pattern Analysis. JMLR Workshop and Conference Proceedings 11. Journal of Machine Learning Research.
Publisher:	Journal of Machine Learning Research
Official URL:	http://jmlr.csail.mit.edu/proceedings/papers/v11/
Copyright Information:	Copyright 2010 the authors
Use License:	This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
Funders:	Science Foundation Ireland
ID Code:	15800
Deposited On:	10 Nov 2010 15:01 by Shane Harper. Last Modified 12 Aug 2020 17:20

Documents

Full text available as:

Gap_Between_Theory_and_Practice.pdf (PDF, 134kB)
