Apidianaki, Marianna and He, Yifan (2010) An algorithm for cross-lingual sense-clustering tested in a MT evaluation setting. In: The 7th International Workshop on Spoken Language Translation (IWSLT 2010), 2-3 December, Paris, France.
Abstract
Unsupervised sense induction methods offer a solution to the problem of scarcity of semantic resources. These methods automatically extract semantic information from textual data and create resources adapted to specific applications and domains of interest. In this paper, we present a clustering algorithm for cross-lingual sense induction which generates bilingual semantic inventories from parallel corpora. We describe the clustering procedure and the obtained resources. We then proceed to a large-scale evaluation by integrating the resources into a Machine Translation (MT) metric (METEOR). We show that the use of the data-driven sense-cluster inventories leads to better correlation with human judgments of translation quality, compared to precision-based metrics, and to improvements similar to those obtained when a handcrafted semantic resource is used.
Metadata
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Event Type: | Workshop |
Refereed: | Yes |
Uncontrolled Keywords: | machine translation metric; METEOR; parallel corpora |
Subjects: | Computer Science > Machine translating |
DCU Faculties and Centres: | Research Institutes and Centres > Centre for Next Generation Localisation (CNGL); DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing |
Use License: | This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License. |
ID Code: | 16414 |
Deposited On: | 01 Jul 2011 13:12 by Shane Harper. Last Modified 19 Jul 2018 14:53 |
Documents
Full text available as:
PDF (257kB)