Examining the contributions of automatic speech transcriptions and metadata sources for searching spontaneous conversational speech

Jones, Gareth J.F.; Zhang, Ke; Newman, Eamonn; Lam-Adesina, Adenike M.

Jones, Gareth J.F. ORCID: 0000-0003-2923-8365, Zhang, Ke, Newman, Eamonn ORCID: 0000-0002-0310-0539 and Lam-Adesina, Adenike M. (2007) Examining the contributions of automatic speech transcriptions and metadata sources for searching spontaneous conversational speech. In: ACM SIGIR 2007 Workshop - Searching Spontaneous Conversational Speech, 27 July 2007, Amsterdam, The Netherlands.

Abstract

Searching spontaneous speech can be enhanced by combining automatic speech transcriptions with semantically related metadata. An important question is what retrieval effectiveness can be expected from searching such transcriptions and the different sources of related metadata. The Cross-Language Speech Retrieval (CL-SR) track at recent CLEF workshops provides a spontaneous speech test collection with manually and automatically derived metadata fields. Using this collection, we investigate the comparative search effectiveness of the individual fields, comprising automatic transcriptions and the available metadata. A further important question is how transcriptions and metadata should be combined for the greatest benefit to search accuracy. We compare simple merging of individual fields with the extended BM25 model for weighted field combination (BM25F). Results indicate that BM25F can produce improved search accuracy, but that it is currently important to set its parameters using a suitable training set.
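
To make the contrast between field merging and weighted field combination concrete, the following is a minimal sketch and not the authors' implementation. It assumes the commonly cited BM25F formulation, in which per-field term frequencies are length-normalised, weighted, and summed before BM25 saturation is applied. The field names (`asr`, `keywords`), the weights, the `b` values, `k1`, and the toy documents are all hypothetical and chosen only for illustration.

```python
import math


def idf(term, docs):
    """Smoothed inverse document frequency; a document matches if any field contains the term."""
    n = sum(1 for d in docs if any(term in tokens for tokens in d.values()))
    return math.log(1 + (len(docs) - n + 0.5) / (n + 0.5))


def bm25f_score(query, doc, docs, weights, b, k1=1.2):
    """BM25F-style score: weight and length-normalise each field's tf, then saturate once."""
    avg_len = {f: sum(len(d[f]) for d in docs) / len(docs) for f in weights}
    score = 0.0
    for term in query:
        pseudo_tf = 0.0
        for f, w in weights.items():
            tf = doc[f].count(term)
            if tf == 0:
                continue  # also avoids dividing by a zero average length
            norm = 1 - b[f] + b[f] * len(doc[f]) / avg_len[f]
            pseudo_tf += w * tf / norm
        score += idf(term, docs) * pseudo_tf / (k1 + pseudo_tf)
    return score


def merged_bm25_score(query, doc, docs, fields, b=0.75, k1=1.2):
    """Simple field merging: concatenate all fields, then score as a single-field document."""
    def merge(d):
        return {"all": [t for f in fields for t in d[f]]}
    return bm25f_score(query, merge(doc), [merge(d) for d in docs],
                       {"all": 1.0}, {"all": b}, k1)


# Hypothetical toy collection: an ASR transcript field and a keyword metadata field.
docs = [
    {"asr": ["we", "walked", "to", "the", "camp"], "keywords": ["deportation"]},
    {"asr": ["the", "deportation", "began", "in", "spring"], "keywords": ["family"]},
    {"asr": ["my", "family", "lived", "in", "town"], "keywords": ["school"]},
]
query = ["deportation"]
weights = {"asr": 1.0, "keywords": 5.0}   # illustrative values, not tuned
b = {"asr": 0.75, "keywords": 0.5}        # illustrative values, not tuned

for i, d in enumerate(docs):
    print(i,
          round(bm25f_score(query, d, docs, weights, b), 4),
          round(merged_bm25_score(query, d, docs, ["asr", "keywords"]), 4))
```

In this toy data, merging gives the first two documents the same score because each contains the query term once in a merged text of equal length, whereas the weighted combination ranks the keyword match higher. The weights and `b` values are exactly the parameters that would need to be set on training data.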

Metadata

Item Type:	Conference or Workshop Item (Paper)
Event Type:	Workshop
Refereed:	Yes
Additional Information:	Workshop held in conjunction with the 30th Annual International ACM SIGIR Conference, 27 July 2007, Amsterdam
Uncontrolled Keywords:	searching spontaneous speech transcriptions; metadata; data fusion; field combination
Subjects:	Computer Science > Information retrieval
DCU Faculties and Centres:	Research Institutes and Centres > Centre for Digital Video Processing (CDVP)
Publisher:	Centre for Telematics and Information Technology, Enschede, The Netherlands
Official URL:	http://hmi.ewi.utwente.nl/sscs
Use License:	This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
ID Code:	383
Deposited On:	31 Mar 2008 by DORAS Administrator. Last Modified 25 Oct 2018 12:15

Documents

Full text available as:

PDF (172kB) - Requires a PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader
