Investigating cross-language speech retrieval for a spontaneous conversational speech collection
Inkpen, Diana and Alzghool, Muath and Jones, Gareth J.F. and Oard, Douglas W. (2006) Investigating cross-language speech retrieval for a spontaneous conversational speech collection. In: HLT-NAACL 2006 - The Human Language Technology Conference - North American Chapter of the Association for Computational Linguistics Annual Meeting, 8-9 June 2006, New York, USA.
Full text available as:
Cross-language retrieval of spontaneous speech combines the challenges of working with noisy automated transcription and language translation. The CLEF 2005 Cross-Language Speech Retrieval (CL-SR) task provides a standard test collection to investigate these challenges. We show that we can improve retrieval performance: by careful selection of the term weighting scheme; by decomposing automated transcripts into
phonetic substrings to help ameliorate transcription
errors; and by combining automatic transcriptions with manually-assigned metadata. We further show that topic translation with online machine translation resources
yields effective CL-SR.
Archive Staff Only: edit this record