Inkpen, Diana, Alzghool, Muath, Jones, Gareth J.F. ORCID: 0000-0003-2923-8365 and Oard, Douglas W. (2006) Investigating cross-language speech retrieval for a spontaneous conversational speech collection. In: HLT-NAACL 2006 - The Human Language Technology Conference - North American Chapter of the Association for Computational Linguistics Annual Meeting, 8-9 June 2006, New York, USA.
Abstract
Cross-language retrieval of spontaneous speech combines the challenges of working with noisy automated transcription and language translation. The CLEF 2005 Cross-Language Speech Retrieval (CL-SR) task provides a standard test collection to investigate these challenges. We show that we can improve retrieval performance: by careful selection of the term weighting scheme; by decomposing automated transcripts into phonetic substrings to help ameliorate transcription errors; and by combining automatic transcriptions with manually-assigned metadata. We further show that topic translation with online machine translation resources yields effective CL-SR.
Metadata
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Event Type: | Conference |
Refereed: | Yes |
Subjects: | Computer Science > Information retrieval |
DCU Faculties and Centres: | Research Institutes and Centres > Centre for Digital Video Processing (CDVP); DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing |
Publisher: | Association for Computational Linguistics |
Official URL: | http://nlp.cs.nyu.edu/hlt-naacl06/ |
Use License: | This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License. |
ID Code: | 358 |
Deposited On: | 19 Mar 2008 by DORAS Administrator. Last Modified 25 Oct 2018 12:48 |
Documents
Full text available as: PDF (40kB)