DORAS | DCU Research Repository

Identifying effective translations for cross-lingual Arabic-to-English user-generated speech search

Khwileh, Ahmad, Afli, Haithem (ORCID: 0000-0002-7449-4707), Jones, Gareth J.F. (ORCID: 0000-0003-2923-8365) and Way, Andy (ORCID: 0000-0001-5736-5930) (2017) Identifying effective translations for cross-lingual Arabic-to-English user-generated speech search. In: Proceedings of The Third Arabic Natural Language Processing Workshop (WANLP), 3-4 Apr 2017, Valencia, Spain.

Cross Language Information Retrieval (CLIR) systems are a valuable tool to enable speakers of one language to search for content of interest expressed in a different language. A group for whom this is of particular interest is bilingual Arabic speakers who wish to search for English language content using information needs expressed in Arabic queries. A key challenge in CLIR is crossing the language barrier between the query and the documents. The most common approach to bridging this gap is automated query translation, which can be unreliable for vague or short queries. In this work, we examine the potential for improving CLIR effectiveness by predicting the translation effectiveness using Query Performance Prediction (QPP) techniques. We propose a novel QPP method to estimate the quality of translation for an Arabic-English Cross-lingual User-generated Speech Search (CLUGS) task. We present an empirical evaluation that demonstrates the quality of our method on alternative translation outputs extracted from an Arabic-to-English Machine Translation system developed for this task. Finally, we show how this framework can be integrated in CLUGS to find relevant translations for improved retrieval performance.
Item Type: Conference or Workshop Item (Paper)
Event Type: Conference
Subjects: Computer Science > Machine translating
DCU Faculties and Centres: DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; Research Institutes and Centres > ADAPT
Published in: Proceedings of The Third Arabic Natural Language Processing Workshop (WANLP). Association for Computational Linguistics.
Publisher: Association for Computational Linguistics
Official URL: http://dx.doi.org/10.18653/v1/W17-1313
Copyright Information: © 2017 Association for Computational Linguistics
Use License: This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
Funders: Science Foundation Ireland in the ADAPT Centre (Grant 13/RC/2106) (www.adaptcentre.ie) at Dublin City University.
ID Code: 23341
Deposited On: 22 May 2019 10:42 by Thomas Murtagh. Last Modified 31 Jul 2019 08:48

Full text available as: PDF (Finding_Relevant_Translations_for_Cross-lingual_User-generated_Speech_Search[1].pdf)

