The ADAPT bilingual document alignment system at WMT16

Lohar, Pintu ORCID: 0000-0002-5328-1585, Afli, Haithem ORCID: 0000-0002-7449-4707, Liu, Chao-Hong ORCID: 0000-0002-1235-6026 and Way, Andy ORCID: 0000-0001-5736-5930 (2016) The ADAPT bilingual document alignment system at WMT16. In: First Conference on Machine Translation (WMT16), 11-12 Aug 2016, Berlin, Germany.

Abstract

Comparable corpora have been shown to be useful in several multilingual natural language processing (NLP) tasks. Many previous papers have focused on how to improve the extraction of parallel data from this kind of corpus at different levels. In this paper, we are interested in improving the quality of bilingual comparable corpora, as measured by an increased document alignment score. We describe our participation in the bilingual document alignment shared task of the First Conference on Machine Translation (WMT16). We propose a technique based on source-to-target sentence- and word-based scores and the fraction of matched source named entities. We performed our experiments on English-to-French document alignments for this bilingual task.

Metadata

Item Type:	Conference or Workshop Item (Paper)
Event Type:	Conference
Refereed:	Yes
Subjects:	Computer Science > Machine translating
DCU Faculties and Centres:	DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; Research Initiatives and Centres > ADAPT
Published in:	Proceedings of the First Conference on Machine Translation: Shared Task Papers. 2. Association for Computational Linguistics (ACL).
Publisher:	Association for Computational Linguistics (ACL)
Official URL:	http://dx.doi.org/10.18653/v1/W16-2372
Copyright Information:	© 2016 Association for Computational Linguistics (ACL)
Use License:	This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
Funders:	Science Foundation Ireland in the ADAPT Centre (Grant 13/RC/2106) (www.adaptcentre.ie) at Dublin City University
ID Code:	23374
Deposited On:	29 May 2019 09:24 by Thomas Murtagh. Last Modified 05 May 2023 16:27

Documents

Full text available as:

The_ADAPT_bilingual_document_alignment_system_at_wmt16[1].pdf (PDF, 172kB)
