DORAS | DCU Research Repository

Towards methods for efficient access to spoken content in the AMI corpus

Jones, Gareth J.F. (ORCID: 0000-0003-2923-8365), Eskevich, Maria (ORCID: 0000-0002-1242-0753) and Gyarmati, Ágnes (2010) Towards methods for efficient access to spoken content in the AMI corpus. In: the Workshop on Searching Spontaneous Conversational Speech at ACM Multimedia 2010 (SSCS '10), 29 Oct 2010, Florence, Italy.

Abstract
Increasing amounts of informal spoken content are being collected. This material, e.g. recordings of meetings, lectures and personal data sources, does not have clearly defined document forms in terms of either structure or topical content. Automated search of this content poses challenges beyond retrieval of defined documents, including definition of search items and location of relevant content within them. While most existing work on speech search has focused on clearly defined document units, in this paper we describe our initial investigation into search of meeting content using the AMI meeting collection. Manual and automated transcripts of meetings are first automatically segmented into topical units. A known-item search task is then performed using presentation slides from the meetings as search queries to locate relevant sections of the meetings. Query slides were selected to correspond to well-recognised and poorly recognised spoken content, together with randomly selected slides. Experimental results show that relevant items can be located with reasonable accuracy using a standard information retrieval approach, and that there is a clear relationship between automatic transcription accuracy and retrieval effectiveness.
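
The known-item search described in the abstract can be illustrated with a minimal sketch. The abstract says only that a "standard information retrieval approach" was used, so the TF-IDF weighting, cosine scoring, reciprocal-rank measure, function names and toy transcript segments below are all assumptions made for illustration, not the authors' system or data.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase and split on whitespace (assumption: no stemming or stop-word removal).
    return text.lower().split()


def tfidf_vectors(docs):
    # docs: list of token lists. Returns one {term: weight} dict per document and the idf table.
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log((n + 1) / (c + 1)) + 1.0 for t, c in df.items()}
    vecs = [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]
    return vecs, idf


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def rank_segments(query, segments):
    # Rank segment indices by similarity to the query text, best first.
    vecs, idf = tfidf_vectors([tokenize(s) for s in segments])
    q = {t: tf * idf[t] for t, tf in Counter(tokenize(query)).items() if t in idf}
    scores = [cosine(q, v) for v in vecs]
    return sorted(range(len(segments)), key=lambda i: scores[i], reverse=True)


def reciprocal_rank(ranking, target):
    # Known-item measure: 1 / rank of the single relevant segment.
    return 1.0 / (ranking.index(target) + 1)


# Hypothetical topical segments of a meeting transcript and one slide used as the query.
segments = [
    "we should discuss the budget for the remote control project",
    "the remote control needs a rubber case and large buttons",
    "let us schedule the next meeting for friday",
]
slide = "remote control design rubber case buttons"
ranking = rank_segments(slide, segments)
```

Here the slide text shares the most weighted terms with the second segment, so that segment is ranked first; with automatic transcripts, recognition errors in those terms would lower the score of the correct segment, which is the kind of effect the abstract reports.
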
Metadata
Item Type: Conference or Workshop Item (Paper)
Event Type: Workshop
Refereed: Yes
Uncontrolled Keywords: natural language processing
Subjects: Computer Science > Information retrieval
DCU Faculties and Centres: Research Institutes and Centres > Centre for Next Generation Localisation (CNGL)
Research Institutes and Centres > National Centre for Language Technology (NCLT)
DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing
Publisher: Association for Computing Machinery
Official URL: http://portal.acm.org
Copyright Information: © 2010 ACM
Use License: This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
ID Code: 16044
Deposited On: 22 Jul 2011 13:21 by Shane Harper. Last Modified 10 Oct 2018 09:21
Documents

Full text available as:

Towards_Methods_for_Efficient_Access_to_Spoken_Content_in_the_AMI_Corpus.pdf (PDF, 375kB)