Improved named entity recognition using machine
translation-based cross-lingual information
Dandapat, Sandipan and Way, Andy (ORCID: 0000-0001-5736-5930)
(2016) Improved named entity recognition using machine translation-based cross-lingual information. Computación y Sistemas, 20 (3), pp. 495-504. ISSN 1405-5546
In this paper, we describe a technique to improve named entity recognition in a resource-poor language (Hindi) by using cross-lingual information. We
use an online machine translation system and a separate word alignment phase
to find the projection of each Hindi word in the translated English sentence. We
estimate the cross-lingual features using an English named entity recognizer and
the alignment information, and use these features in a support vector
machine-based classifier. The cross-lingual features improve the F1 score by
2.1 points absolute (2.9% relative) over a strong baseline model.
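
The projection step described above can be illustrated with a minimal sketch. It assumes the word alignment is available as (Hindi index, English index) pairs and that the English recognizer emits one tag per token with "O" for non-entities; the function name `project_tags`, the tag labels, and the transliterated example sentence are hypothetical and not taken from the paper, whose actual feature estimation may differ.

```python
from typing import List, Tuple


def project_tags(
    num_hindi_tokens: int,
    english_tags: List[str],
    alignment: List[Tuple[int, int]],
) -> List[str]:
    """Project English NE tags onto Hindi tokens through word alignments.

    alignment holds (hindi_index, english_index) pairs (assumed format).
    Unaligned Hindi tokens keep the default tag "O".
    """
    projected = ["O"] * num_hindi_tokens
    for hindi_idx, english_idx in alignment:
        tag = english_tags[english_idx]
        # Keep the first entity tag found for a Hindi token.
        if tag != "O" and projected[hindi_idx] == "O":
            projected[hindi_idx] = tag
    return projected


# Hypothetical example (Hindi shown transliterated).
hindi_tokens = ["Sachin", "Mumbai", "mein", "rahte", "hain"]
english_tokens = ["Sachin", "lives", "in", "Mumbai"]
english_tags = ["PERSON", "O", "O", "LOCATION"]
alignment = [(0, 0), (1, 3), (2, 2), (3, 1), (4, 1)]

cross_lingual_features = project_tags(len(hindi_tokens), english_tags, alignment)
for token, feature in zip(hindi_tokens, cross_lingual_features):
    print(token, feature)
```

Each projected tag could then be supplied as one additional feature per Hindi token, alongside the monolingual features, to the classifier.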