Nguyen, Thao Nhu (2025) Moments in Focus: A Transformer-based Approach for Moment-centric Lifelog Retrieval. PhD thesis, Dublin City University.
Abstract
Lifelogging, the process of creating a personal digital record of daily activities, is gaining popularity as a form of digital diary. As it grows, so does the demand for effective methods to extract specific activities from large, multimodal datasets. This research explores novel approaches to improve information retrieval from personal lifelog data, focusing on moment-based retrieval to approximate human memory recall. The thesis begins with the development of LifeSeeker, a baseline image-based lifelog search engine utilising image concept-based indexing and retrieval techniques. Building on this foundation, the research leverages recent advancements in multimodal embedding models, particularly Contrastive Language-Image Pre-training (CLIP) from OpenAI, to enhance search capabilities. In interactive user studies, the embedding-based retrieval model demonstrates significant improvements over the conventional concept-based approach across various evaluation metrics. Recognising that humans encode and retrieve their experiences as sequences of events, I propose a shift from single-image to moment-based retrieval units in lifelog data, aligning more closely with human memory processes. For this purpose, I investigate the application of video moment retrieval techniques to lifelog data, exploiting the similarity between video and lifelog data as continuous frame sequences. This leads to the development of a novel Parallel Transformer Framework (PaTF) for moment-level retrieval from continuous visual streams. In particular, the PaTF combines the strengths of the transformer architecture with enriched multimodal embeddings (visual and semantic features) to capture the temporal context of lifelog moments. The integration of the PaTF into a comprehensive moment-based lifelog retrieval system demonstrates significant improvements in retrieval effectiveness over the baseline LifeSeeker.
In summary, the primary contribution of this thesis is the development and validation of a moment-based lifelog search engine utilising a transformer-based framework. This approach is expected to advance the field of lifelog retrieval and benefit personal information management, memory augmentation, and retrospective life analysis.
Metadata
| Item Type: | Thesis (PhD) |
|---|---|
| Date of Award: | 29 July 2025 |
| Refereed: | No |
| Supervisor(s): | Gurrin, Cathal; Zhou, Liting; Mai, Tai Tan; Caputo, Annalina |
| Subjects: | Computer Science > Computer engineering; Computer Science > Computer networks; Computer Science > Lifelog |
| DCU Faculties and Centres: | DCU Faculties and Schools > Faculty of Engineering and Computing; DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; Research Institutes and Centres > ADAPT |
| Use License: | This item is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 License. |
| Funders: | Research Ireland |
| ID Code: | 31351 |
| Deposited On: | 21 Nov 2025 13:56 by Cathal Gurrin. Last Modified 21 Nov 2025 13:56 |
Documents
Full text available as: PDF (46MB)
Creative Commons: Attribution-Noncommercial-No Derivative Works 3.0