Jones, Gareth J.F. ORCID: 0000-0003-2923-8365, Fantino, Fabio, Fuller, Marguerite, Newman, Eamonn ORCID: 0000-0002-0310-0539 and Zhang, Ying (2009) Domain-specific query translation for multilingual access to digital libraries. In: First Natural Language Processing for Digital Libraries (NLP4DL) Workshop, 15 June 2009, Viareggio, Italy.
Abstract
Accurate high-coverage translation is a vital component of reliable cross-language information access (CLIR) systems. This is particularly true of access to archives such as Digital Libraries, which are often specific to certain domains. While general machine translation (MT) has been shown to be effective for CLIR tasks in information retrieval evaluation workshops, it is not well suited to specialized tasks where domain-specific translations are required. We demonstrate that effective query translation in the domain of cultural heritage (CH) can be achieved by augmenting a standard MT system with domain-specific phrase dictionaries automatically mined from the online Wikipedia. Experiments using our hybrid translation system with sample query logs from users of CH websites demonstrate a large improvement in the accuracy of domain-specific phrase detection and translation.
Metadata
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Event Type: | Workshop |
Refereed: | Yes |
Uncontrolled Keywords: | cross language information retrieval; CLIR; machine translation; MT |
Subjects: | Computer Science > Information retrieval |
DCU Faculties and Centres: | Research Institutes and Centres > Centre for Digital Video Processing (CDVP) |
Use License: | This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License. |
ID Code: | 16503 |
Deposited On: | 18 Aug 2011 14:51 by Shane Harper. Last Modified 25 Oct 2018 11:16 |