Poncelas, Alberto ORCID: 0000-0002-5089-1687, Toral, Antonio ORCID: 0000-0003-2357-2960 and Way, Andy ORCID: 0000-0001-5736-5930 (2017) Extending feature decay algorithms using alignment entropy. In: FETLT 2016: Future and Emerging Trends in Language Technologies, Machine Learning and Big Data. 2nd International Workshop, 30 Nov - 2 Dec 2016, Seville, Spain.
Abstract
In machine-learning applications, data selection is of crucial importance if good runtime performance is to be achieved. Feature Decay Algorithms (FDA) have demonstrated excellent performance in a number of tasks. While the decay function is at the heart of the success of FDA, its parameters are initialised with the same weights. In this paper, we investigate the effect on Machine Translation of assigning more appropriate weights to words using word-alignment entropy. In experiments on German to English, we show the effect of calculating these weights using two popular alignment methods, GIZA++ and FastAlign, using both automatic and human evaluations. We demonstrate that our novel FDA model is a promising research direction.
Metadata
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Event Type: | Conference |
Refereed: | Yes |
Uncontrolled Keywords: | Data selection; Machine translation; Mathematical foundations |
Subjects: | Computer Science > Machine translating |
DCU Faculties and Centres: | DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; Research Institutes and Centres > ADAPT |
Published in: | Proceedings of FETLT 2016: Future and Emerging Trends in Language Technologies, Machine Learning and Big Data. Lecture Notes in Computer Science 10341. Springer. |
Publisher: | Springer |
Official URL: | http://dx.doi.org/10.1007/978-3-319-69365-1_14 |
Copyright Information: | © 2016 Springer |
Use License: | This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License. |
Funders: | ADAPT Centre for Digital Content Technology, funded under the SFI Research Centres Programme (Grant 13/RC/2106), European Regional Development Fund, and the European Union Seventh Framework Programme FP7/2007-2013 under grant agreement PIAP-GA-2012-324414 (AbuMaTran) |
ID Code: | 23232 |
Deposited On: | 02 May 2019 12:00 by Thomas Murtagh. Last Modified 22 Jan 2021 14:17 |
Documents
Full text available as: PDF (298kB)