Domain-specific query translation for multilingual information access using machine translation augmented with dictionaries mined from Wikipedia
Jones, Gareth J.F. and Fantino, Fabio and Newman, Eamonn and Zhang, Ying (2008) Domain-specific query translation for multilingual information access using machine translation augmented with dictionaries mined from Wikipedia. In: CLIA 2008 - 2nd International Workshop on Cross Lingual Information Access: Addressing the Information Need of Multilingual Societies, 11 Jan 2008, Hyderabad, India.
Accurate, high-coverage translation is a vital component of reliable cross-language information access (CLIA) systems. While machine translation (MT) has been shown to be effective for CLIA tasks in previous evaluation workshops, it is not well suited to specialized tasks where domain-specific translations are required. We demonstrate that effective query translation for CLIA can be achieved in the domain of cultural heritage (CH). This is performed by augmenting a standard MT system with domain-specific phrase dictionaries automatically mined from the online Wikipedia. Experiments using our hybrid translation system with sample query logs from users of CH websites demonstrate a large improvement in the accuracy of domain-specific phrase detection and translation.
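
The hybrid approach summarized in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the phrase dictionaries and the MT system are combined, so everything here is an assumption made for illustration: the greedy longest-match phrase detection, the names `PHRASE_DICT`, `placeholder_mt`, and `translate_query`, and the two English-to-Italian dictionary entries. The MT step is a placeholder that only tags the text it receives. It is not a real MT system.

```python
from typing import Callable, Dict, List

# Hypothetical domain phrase dictionary (English -> Italian), for illustration.
# In the setting the abstract describes, such entries would be mined
# automatically from Wikipedia rather than written by hand.
PHRASE_DICT: Dict[str, str] = {
    "last supper": "Ultima Cena",
    "mona lisa": "Gioconda",
}


def placeholder_mt(text: str) -> str:
    """Stand-in for a general-purpose MT system: only tags its input."""
    return f"<MT:{text}>"


def translate_query(
    query: str,
    phrase_dict: Dict[str, str],
    mt: Callable[[str], str],
    max_phrase_len: int = 4,
) -> str:
    """Translate dictionary phrases directly; send the rest of the query to MT."""
    tokens = query.lower().split()
    output: List[str] = []
    pending: List[str] = []  # tokens not covered by the dictionary
    i = 0
    while i < len(tokens):
        # Greedy longest match: try the longest candidate phrase first.
        match_len = 0
        for n in range(min(max_phrase_len, len(tokens) - i), 0, -1):
            if " ".join(tokens[i:i + n]) in phrase_dict:
                match_len = n
                break
        if match_len:
            if pending:
                # Flush the uncovered tokens through MT as one chunk.
                output.append(mt(" ".join(pending)))
                pending = []
            output.append(phrase_dict[" ".join(tokens[i:i + match_len])])
            i += match_len
        else:
            pending.append(tokens[i])
            i += 1
    if pending:
        output.append(mt(" ".join(pending)))
    return " ".join(output)


if __name__ == "__main__":
    print(translate_query("the Last Supper painting", PHRASE_DICT, placeholder_mt))
```

In this sketch a query such as "the Last Supper painting" keeps the dictionary translation of the domain phrase intact, and only the surrounding words go to the general-purpose MT step. This reflects the motivation stated in the abstract, that a standard MT system alone is not well suited to domain-specific phrases.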