Alateeq, Ahmed ORCID: 0000-0001-7916-6393, Roantree, Mark ORCID: 0000-0002-1329-2570 and Gurrin, Cathal ORCID: 0000-0003-2903-3968 (2022) Voxento 3.0: A Prototype Voice-Controlled Interactive Search Engine for Lifelogs. In: 5th Annual on Lifelog Search Challenge, 27-30 June 2022, Newark, NJ, USA. ISBN 978-1-4503-9239-6
Abstract
Voxento is an interactive voice-based retrieval system for lifelogs that has been redeveloped and optimised to participate in the fifth Lifelog Search Challenge (LSC’22) at ACM ICMR’22. Building on our previous experience in the LSC competition, where Voxento ranked in the top 4 among 17 participants at LSC’21, we present a revised version that addresses critical points to improve the efficiency of retrieval tasks on lifelog datasets. Voxento provides a spoken interface to lifelog data, allowing both expert and novice users to interact with a personal lifelog through a range of vocal commands and interactions. We made several improvements to support both content retrieval and system interaction. This latest version adds a text-based search feature, new filters based on new metadata provided in the lifelog data, rich visual information and features, and an enhanced speech query. In addition, the data preparation tasks include a new function to reduce the number of non-relevant images, and the latest CLIP model version is used to derive features from images. The long-term development goal for Voxento is a lifelog retrieval system that supports speech and conversational interaction and requires fewer physical actions from users, such as using a mouse. The system presented here runs on a desktop computer in order to participate in the LSC’22 competition, with the option to use either voice interaction or standard text-based retrieval.
Metadata
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Event Type: | Conference |
Refereed: | Yes |
Uncontrolled Keywords: | lifelog; interactive retrieval; voice interaction; speech recognition; speech synthesis |
Subjects: | UNSPECIFIED |
DCU Faculties and Centres: | DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; Research Institutes and Centres > INSIGHT Centre for Data Analytics |
Published in: | LSC '22: Proceedings of the 5th Annual on Lifelog Search Challenge. Association for Computing Machinery (ACM). ISBN 978-1-4503-9239-6 |
Publisher: | Association for Computing Machinery (ACM) |
Official URL: | https://dx.doi.org/10.1145/3512729.3533009 |
Copyright Information: | © 2022 The Authors. Open Access (CC-BY 4.0) |
Funders: | Science Foundation Ireland SFI/12/RC/2289-P2 |
ID Code: | 27546 |
Deposited On: | 12 Aug 2022 10:41 by Thomas Murtagh. Last Modified 03 Mar 2023 12:45 |
Documents
Full text available as: PDF (4MB)