DORAS | DCU Research Repository

Voxento 3.0: A Prototype Voice-Controlled Interactive Search Engine for Lifelogs

Alateeq, Ahmed (ORCID: 0000-0001-7916-6393), Roantree, Mark (ORCID: 0000-0002-1329-2570) and Gurrin, Cathal (ORCID: 0000-0003-2903-3968) (2022) Voxento 3.0: A Prototype Voice-Controlled Interactive Search Engine for Lifelogs. In: 5th Annual on Lifelog Search Challenge, 27-30 June 2022, Newark, NJ, USA. ISBN 978-1-4503-9239-6

Abstract
Voxento is an interactive voice-based retrieval system for lifelogs that has been redeveloped and optimised to participate in the fifth Lifelog Search Challenge (LSC’22) at ACM ICMR’22. Building on our previous experience in the LSC competition, where Voxento ranked in the top 4 among 17 participants at LSC’21, we present a revised version of Voxento that addresses critical points in order to improve the efficiency of retrieval tasks over lifelog datasets. Voxento provides a spoken interface to lifelog data, allowing both expert and novice users to interact with a personal lifelog using a range of vocal commands and interactions. We made several improvements to support both content retrieval and system interaction. This latest version adds a text-based search feature, new filters based on new metadata provided in the lifelog data, rich visual information and features, and enhanced speech queries. In addition, the data preparation tasks include a new function to reduce the number of non-relevant images, and the latest CLIP model version is used to derive features from images. The long-term goal for Voxento is a lifelog retrieval system that supports speech and conversational interaction with fewer physical actions required of users, such as using a mouse. The system presented here runs on a desktop computer in order to participate in the LSC’22 competition, with the option to use voice interaction or standard text-based retrieval.
Metadata
Item Type: Conference or Workshop Item (Paper)
Event Type: Conference
Refereed: Yes
Uncontrolled Keywords: lifelog; interactive retrieval; voice interaction; speech recognition; speech synthesis
Subjects: UNSPECIFIED
DCU Faculties and Centres: DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; Research Institutes and Centres > INSIGHT Centre for Data Analytics
Published in: LSC '22: Proceedings of the 5th Annual on Lifelog Search Challenge. Association for Computing Machinery (ACM). ISBN 978-1-4503-9239-6
Publisher: Association for Computing Machinery (ACM)
Official URL: https://dx.doi.org/10.1145/3512729.3533009
Copyright Information: © 2022 The Authors. Open Access (CC-BY 4.0)
Funders: Science Foundation Ireland SFI/12/RC/2289-P2
ID Code: 27546
Deposited On: 12 Aug 2022 10:41 by Thomas Murtagh. Last Modified 03 Mar 2023 12:45
Documents

Full text available as:

3512729.3533009.pdf (PDF, 4MB)