Dandapat, Sandipan and Way, Andy (ORCID: 0000-0001-5736-5930) (2016) Improved named entity recognition using machine translation-based cross-lingual information. Computación y Sistemas, 20 (3). pp. 495-504. ISSN 1405-5546
Abstract
In this paper, we describe a technique to improve named entity recognition in a resource-poor language (Hindi) by using cross-lingual information. We use an online machine translation system and a separate word alignment phase to find the projection of each Hindi word in the translated English sentence. We estimate the cross-lingual features using an English named entity recognizer and the alignment information, and we use these features in a support vector machine-based classifier. The cross-lingual features improve the F1 score by 2.1 points absolute (2.9% relative) over a well-performing baseline model.
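The projection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `project_ne_tags` and `token_features`, the tag set, the transliterated example sentence, and the rule for resolving multiple alignment links are all assumptions made for illustration. The sketch assumes the translation, the word alignment, and the English named entity tags have already been produced by external tools.

```python
from typing import Dict, List, Tuple


def project_ne_tags(
    hindi_tokens: List[str],
    english_tags: List[str],
    alignment: List[Tuple[int, int]],
    default_tag: str = "O",
) -> List[str]:
    """Give each Hindi token the NE tag of an English token it aligns to.

    `alignment` holds (hindi_index, english_index) pairs. Unaligned Hindi
    tokens keep `default_tag`. This tie-breaking rule is an assumption.
    """
    projected = [default_tag] * len(hindi_tokens)
    for hi_idx, en_idx in alignment:
        in_range = (0 <= hi_idx < len(hindi_tokens)
                    and 0 <= en_idx < len(english_tags))
        if not in_range:
            continue  # ignore malformed alignment links
        # Keep the first entity tag seen; only overwrite the default tag.
        if projected[hi_idx] == default_tag:
            projected[hi_idx] = english_tags[en_idx]
    return projected


def token_features(
    hindi_tokens: List[str], projected_tags: List[str]
) -> List[Dict[str, str]]:
    """Build one feature dict per token, adding the cross-lingual tag.

    A real system would add many monolingual features as well; here the
    word itself stands in for them.
    """
    return [
        {"word": word, "xl_tag": tag}
        for word, tag in zip(hindi_tokens, projected_tags)
    ]


# Hypothetical example: transliterated Hindi, its English translation
# "Sachin lives in Mumbai", English NE tags, and word alignment links.
hindi = ["sachin", "mumbai", "mein", "rahte", "hain"]
english_ne = ["PER", "O", "O", "LOC"]
links = [(0, 0), (1, 3), (2, 2), (3, 1)]  # "hain" is left unaligned

projected = project_ne_tags(hindi, english_ne, links)
features = token_features(hindi, projected)
```

Feature dictionaries of this kind could then be vectorized and passed to a support vector machine classifier alongside the baseline features.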
Metadata
| Item Type: | Article (Published) |
|---|---|
| Refereed: | Yes |
| Subjects: | Computer Science > Machine translating |
| DCU Faculties and Centres: | DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; Research Institutes and Centres > ADAPT |
| Publisher: | Instituto Politécnico Nacional |
| Official URL: | http://dx.doi.org/10.13053/CyS-20-3-2468 |
| Copyright Information: | © 2016 Instituto Politécnico Nacional |
| ID Code: | 23236 |
| Deposited On: | 02 May 2019 13:39 by Thomas Murtagh. Last Modified 02 May 2019 13:39 |
Documents
Full text available as: PDF (125kB)