Domain-specific query translation for multilingual information access using machine translation augmented with dictionaries mined from Wikipedia
Jones, Gareth J.F. (ORCID: 0000-0003-2923-8365), Fantino, Fabio, Newman, Eamonn (ORCID: 0000-0002-0310-0539) and Zhang, Ying
(2008)
Domain-specific query translation for multilingual information access using machine translation augmented with dictionaries mined from Wikipedia.
In: CLIA 2008 - 2nd International Workshop on Cross Lingual Information Access: Addressing the Information Need of Multilingual Societies, 11 Jan 2008, Hyderabad, India.
Accurate high-coverage translation is a vital component of reliable cross-language information access (CLIA) systems. While machine translation (MT) has been shown to be effective for CLIA tasks in previous evaluation workshops, it is not well suited to specialized tasks where domain-specific translations are required. We demonstrate that effective query translation for CLIA can be achieved in the domain of cultural heritage (CH). This is performed by augmenting a standard MT system with domain-specific phrase dictionaries automatically mined from the online Wikipedia. Experiments using our hybrid translation system with sample query logs from users of CH websites demonstrate a large improvement in the accuracy of domain-specific phrase detection and translation.
Metadata
Item Type: Conference or Workshop Item (Paper)
Event Type: Workshop
Refereed: Yes
Additional Information: Workshop in conjunction with IJCNLP 2008 - The Third International Joint Conference on Natural Language Processing, 7-12 Jan 2008, Hyderabad, India.