Skip to main content
DORAS
DCU Online Research Access Service
Login (DCU Staff Only)
Identifying effective translations for cross-lingual Arabic-to-English user-generated speech search

Khwileh, Ahmad, Afli, Haithem ORCID: 0000-0002-7449-4707, Jones, Gareth J.F. ORCID: 0000-0003-2923-8365 and Way, Andy ORCID: 0000-0001-5736-5930 (2017) Identifying effective translations for cross-lingual Arabic-to-English user-generated speech search. In: Third Arabic Natural Language Processing Workshop (WANLP), 3 Apr 2017, Valencia, Spain.

Full text available as:

[img]
Preview
PDF - Requires a PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader
293kB

Abstract

Cross Language Information Retrieval (CLIR) systems are a valuable tool to enable speakers of one language to search for content of interest expressed in a different language. A group for whom this is of particular interest is bilingual Arabic speakers who wish to search for English language content using information needs expressed in Arabic queries. A key challenge in CLIR is crossing the language barrier between the query and the documents. The most common approach to bridging this gap is automated query translation, which can be unreliable for vague or short queries. In this work, we examine the potential for improving CLIR effectiveness by predicting the translation effectiveness using Query Performance Prediction (QPP) techniques. We propose a novel QPP method to estimate the quality of translation for an Arabic-Engish Cross-lingual User-generated Speech Search (CLUGS) task. We present an empirical evaluation that demonstrates the quality of our method on alternative translation outputs extracted from an Arabic-to-English Machine Translation system developed for this task. Finally, we show how this framework can be integrated in CLUGS to find relevant translations for improved retrieval performance.

Item Type:Conference or Workshop Item (Paper)
Event Type:Conference
Refereed:Yes
Subjects:Computer Science > Machine translating
DCU Faculties and Centres:DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing
Research Initiatives and Centres > ADAPT
Published in: Habash, Nizar and Diab, Mona and Darwish, Kareem and El-Hajj, Wassim, (eds.) Proceedings of The Third Arabic Natural Language Processing Workshop (WANLP). . Association for Computational Linguistics (ACL).
Publisher:Association for Computational Linguistics (ACL)
Official URL:http://dx.doi.org/10.18653/v1/W17-1313
Copyright Information:© Association for Computational Linguistics (ACL)
Use License:This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License. View License
Funders:Science Foundation Ireland in the ADAPT Centre (Grant 13/RC/2106) (www.adaptcentre.ie) at Dublin City University
ID Code:23385
Deposited On:30 May 2019 15:31 by Thomas Murtagh . Last Modified 31 Jul 2019 08:49

Downloads

Downloads per month over past year

Archive Staff Only: edit this record

Altmetric
- Altmetric
+ Altmetric
  • Student Email
  • Staff Email
  • Student Apps
  • Staff Apps
  • Loop
  • Disclaimer
  • Privacy
  • Contact Us