Improved named entity recognition using machine translation-based cross-lingual information

Dandapat, Sandipan; Way, Andy

Dandapat, Sandipan and Way, Andy ORCID: 0000-0001-5736-5930 (2016) Improved named entity recognition using machine translation-based cross-lingual information. Computación y Sistemas, 20 (3). pp. 495-504. ISSN 1405-5546

Abstract

In this paper, we describe a technique to improve named entity recognition in a resource-poor language (Hindi) by using cross-lingual information. We use an on-line machine translation system and a separate word alignment phase to find the projection of each Hindi word into the translated English sentence. We estimate the cross-lingual features using an English named entity recognizer and the alignment information. We use these cross-lingual features in a support vector machine-based classifier. The use of cross-lingual features improves F1 score by 2.1 points absolute (2.9% relative) over a good-performing baseline model.
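The projection step described in the abstract can be illustrated with a minimal sketch. It assumes the word alignment is already available as (source index, English index) pairs and that the English recognizer has already produced one tag per English token. The function name `project_ne_tags`, the tag labels, the tie-breaking rule, and the toy sentence are all hypothetical and chosen for illustration; they are not taken from the paper.

```python
from typing import List, Tuple


def project_ne_tags(
    source_tokens: List[str],
    english_tags: List[str],
    alignment: List[Tuple[int, int]],
    default_tag: str = "O",
) -> List[str]:
    """Return one projected English NE tag per source-language token.

    alignment holds (source_index, english_index) pairs. Unaligned
    source tokens keep default_tag.
    """
    projected = [default_tag] * len(source_tokens)
    for src_idx, en_idx in alignment:
        tag = english_tags[en_idx]
        # Assumption: when a source token aligns to several English
        # tokens, the first non-default tag wins.
        if projected[src_idx] == default_tag and tag != default_tag:
            projected[src_idx] = tag
    return projected


# Toy example (transliterated Hindi, invented for illustration).
hindi_tokens = ["sachin", "mumbai", "mein", "rahte", "hain"]
english_tokens = ["Sachin", "lives", "in", "Mumbai"]
english_tags = ["PERSON", "O", "O", "LOCATION"]
alignment = [(0, 0), (1, 3), (2, 2), (3, 1), (4, 1)]

cross_lingual_features = project_ne_tags(hindi_tokens, english_tags, alignment)
print(list(zip(hindi_tokens, cross_lingual_features)))
```

Each projected tag could then be added as one extra feature for the corresponding Hindi token alongside the monolingual features given to the classifier. How the paper actually encodes these features is not stated in the abstract.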

Metadata

Item Type:	Article (Published)
Refereed:	Yes
Subjects:	Computer Science > Machine translating
DCU Faculties and Centres:	DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; Research Institutes and Centres > ADAPT
Publisher:	Instituto Politécnico Nacional
Official URL:	http://dx.doi.org/10.13053/CyS-20-3-2468
Copyright Information:	© 2016 Instituto Politécnico Nacional
ID Code:	23236
Deposited On:	02 May 2019 13:39. Last Modified: 02 May 2019 13:39

Documents

Full text available as:

PDF - Requires a PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader
125kB
