Investigating cross-language speech retrieval for a spontaneous conversational speech collection

Inkpen, Diana and Alzghool, Muath and Jones, Gareth J.F. and Oard, Douglas W. (2006) Investigating cross-language speech retrieval for a spontaneous conversational speech collection. In: HLT-NAACL 2006 - The Human Language Technology Conference - North American Chapter of the Association for Computational Linguistics Annual Meeting, 8-9 June 2006, New York, USA.

Cross-language retrieval of spontaneous speech combines the challenges of working with noisy automated transcription and language translation. The CLEF 2005 Cross-Language Speech Retrieval (CL-SR) task provides a standard test collection to investigate these challenges. We show that we can improve retrieval performance: by careful selection of the term weighting scheme; by decomposing automated transcripts into phonetic substrings to help ameliorate transcription errors; and by combining automatic transcriptions with manually-assigned metadata. We further show that topic translation with online machine translation resources yields effective CL-SR.

Item Type: Conference or Workshop Item (Paper)
Event Type: Conference
Subjects: Computer Science > Information retrieval
DCU Faculties and Centres: Research Initiatives and Centres > Centre for Digital Video Processing (CDVP)
    DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing
Publisher: Association for Computational Linguistics
Use License: This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
ID Code: 358
Deposited On: 19 Mar 2008 by DORAS Administrator. Last Modified: 04 Feb 2009 11:05