English-Hindi transliteration using context-informed PB-SMT: the DCU system for NEWS 2009
Haque, Rejwanul and Dandapat, Sandipan and Srivastava, Ankit Kumar and Naskar, Sudip Kumar and Way, Andy (2009) English-Hindi transliteration using context-informed PB-SMT: the DCU system for NEWS 2009. In: NEWS 2009 - Named Entities Workshop, 7 August 2009, Singapore.
This paper presents our English-Hindi transliteration system for the NEWS 2009 Machine Transliteration Shared Task, which adds source context modeling to state-of-the-art log-linear phrase-based statistical machine translation (PB-SMT). Source context features enable us to exploit source similarity in addition to target similarity, as modelled by the language model. We use a memory-based classification framework that enables efficient estimation of these features while avoiding data sparseness problems. We carried out experiments at both the character level and the transliteration unit (TU) level. Position-dependent source context features produce significant improvements across all evaluation metrics.
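
As an illustration only, and not the authors' implementation, the following Python sketch shows one way position-dependent source context features could be collected for each source unit (a character or a TU) before being passed to a classifier. The function names, the window width of 2, and the padding symbol are assumptions made for this example; none of them come from the paper.

```python
def context_features(source, index, width=2, pad="_"):
    """Collect position-dependent context features for source[index].

    Each neighbour is keyed by its signed offset from the focus unit, so the
    same symbol at offset -1 and at offset +1 yields two distinct features.
    The window width and padding symbol are illustrative assumptions.
    """
    features = {}
    for offset in range(-width, width + 1):
        if offset == 0:
            continue  # skip the focus unit itself
        pos = index + offset
        inside = 0 <= pos < len(source)
        features[f"ctx[{offset:+d}]"] = source[pos] if inside else pad
    return features


def training_instances(source, width=2):
    """Pair every source unit with its context features."""
    return [
        (unit, context_features(source, i, width))
        for i, unit in enumerate(source)
    ]


if __name__ == "__main__":
    # Character-level example; a TU-level run would pass a list of TUs instead.
    for unit, feats in training_instances(list("delhi")):
        print(unit, feats)
```

Because the function accepts any sequence, the same code works for character-level input (a list of characters) and TU-level input (a list of multi-character units).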