Alateeq, Ahmed ORCID: 0000-0001-7916-6393, Roantree, Mark ORCID: 0000-0002-1329-2570 and Gurrin, Cathal ORCID: 0000-0003-2903-3968 (2024) Voxento-Pro: An Advanced Voice Lifelog Retrieval Interaction for Multimodal Lifelogs. In: ICMR '24: International Conference on Multimedia Retrieval, 10 Jun 2024, Phuket, Thailand. ISBN 979-8-4007-0550-2/24/06
Abstract
We present Voxento-Pro, an advanced version of our interactive
voice-based lifelog retrieval system. This system has been developed
to participate in the seventh ACM Lifelog Search Challenge (LSC’24)
at ICMR’24 in Thailand. In Voxento-Pro, we introduce a conversational
query methodology by utilising OpenAI’s Assistant API and employ
OpenAI’s Whisper technology for state-of-the-art speech recognition
and synthesis. This new version features a more natural interaction
mechanism, which enhances the user’s experience. In addition, the
user interface (UI) was redesigned to introduce a new chat interface
and other components. The backend retrieval API was rebuilt with a new
technology to support fast and efficient API interactions. Data
processing of the lifelog data identified about 20% of images as
non-important and filled 27% of missing data using Geocoding APIs.
Metadata
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Event Type: | Conference |
Refereed: | Yes |
Uncontrolled Keywords: | lifelog; interactive retrieval; voice interaction; speech recognition; conversational search |
Subjects: | Computer Science > Information retrieval; Computer Science > Lifelog |
DCU Faculties and Centres: | DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; Research Institutes and Centres > INSIGHT Centre for Data Analytics |
Published in: | Gurrin, Cathal and Jónsson, Björn Þór, (eds.) Proceedings of the 7th Annual Workshop on Lifelog Search Challenge (LSC'24). Association for Computing Machinery (ACM). ISBN 979-8-4007-0550-2/24/06 |
Publisher: | Association for Computing Machinery (ACM) |
Official URL: | https://doi.org/10.1145/3643489.3661130 |
Copyright Information: | © 2024 The Authors. |
Funders: | Ministry of Education in Saudi Arabia, Science Foundation Ireland and the Insight Centre for Data Analytics through the grant number SFI/12/RC/2289-P2, The Vistamilk SFI Research Centre (SFI/16/RC/3835) |
ID Code: | 30303 |
Deposited On: | 09 Sep 2024 09:03 by Ahmed Alateeq. Last Modified 09 Sep 2024 09:03 |
Documents
Full text available as:
PDF (2MB), Creative Commons: Attribution 4.0