The growing attention to lifelogging research has led to the creation of many retrieval systems, most of which employ event segmentation as a core functionality. While previous literature focused on splitting lifelog data into broad segments of daily living activities, less attention was paid to micro-activities, which last for short periods of time yet carry valuable information for building a high-precision retrieval engine. In this paper, we present our efforts in addressing the NTCIR-15 MART challenge, in which participants were asked to retrieve micro-activities from a multi-modal dataset. We proposed five models that investigate imagery and sensory data, both jointly and separately, using various Deep Learning and Machine Learning techniques, achieved a maximum mAP score of 0.901 with an Image Tabular Pair-wise Similarity model, and ranked second overall in the competition. Our model not only captures the information coming from the temporal visual data combined with the sensor signal, but also works as a Siamese network to discriminate micro-activities.
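The paper itself details the full architecture; purely as a conceptual sketch (all names, dimensions, and the fusion scheme here are our illustrative assumptions, not taken from the paper), the pairwise-similarity idea can be illustrated in plain NumPy: each micro-activity segment is embedded by fusing an image embedding with a sensor (tabular) embedding through a shared encoder, and two segments are compared by cosine similarity, Siamese-style.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shared projection weights. Sharing the encoder between the
# two inputs is what makes the comparison Siamese-style.
W_img = rng.standard_normal((512, 64))   # image-embedding projection
W_tab = rng.standard_normal((16, 64))    # sensor/tabular projection

def encode(img_feat, tab_feat):
    """Embed one micro-activity segment by fusing image and sensor features."""
    z = img_feat @ W_img + tab_feat @ W_tab   # simple additive late fusion
    return z / np.linalg.norm(z)              # unit-normalise for cosine sim

def similarity(seg_a, seg_b):
    """Cosine similarity between two segments under the shared encoder."""
    return float(encode(*seg_a) @ encode(*seg_b))

# Two toy segments, each a (image features, sensor features) pair.
seg1 = (rng.standard_normal(512), rng.standard_normal(16))
seg2 = (rng.standard_normal(512), rng.standard_normal(16))

s_same = similarity(seg1, seg1)   # identical segments give similarity 1.0
s_diff = similarity(seg1, seg2)   # lies in [-1, 1]
```

A trained model would learn the projections (and likely a deeper image backbone) so that segments of the same micro-activity score higher than segments of different ones; the sketch only shows the shared-encoder comparison structure.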
This item is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License.
Funders:
Science Foundation Ireland under grant numbers SFI/12/RC/2289_2, SFI/13/RC/2106, 18/CRT/6223, and 18/CRT/6224, Dublin City University’s Research Committee
ID Code:
26443
Deposited On:
05 Nov 2021 11:36 by Manh Duy Nguyen
Last Modified:
21 Jul 2022 11:28