Combining image descriptors to effectively retrieve events from visual lifelogs
Doherty, Aiden R. and Ó Conaire, Ciarán and Blighe, Michael and Smeaton, Alan F. and O'Connor, Noel E. (2008) Combining image descriptors to effectively retrieve events from visual lifelogs. In: MIR 2008 - ACM International Conference on Multimedia Information Retrieval, 30-31 October, Vancouver, Canada.
The SenseCam is a wearable camera that passively captures approximately 3,000 images per day, which equates to almost one million images per year. It is used to create a personal visual recording of the wearer's life and generates information that can be helpful as a human memory aid. For such a large amount of visual information to be of any use, it is accepted that it should be structured into "events", of which there are about 8,000 in a wearer's average year. Once SenseCam images have been automatically segmented into events, it is useful for users to be able to locate other events similar to a given event, e.g. "what other times was I walking in the park?" or "show me other events when I was in a restaurant". On two datasets of 240k and 1.8M images, containing topics with a variety of information needs, we evaluate the fusion of MPEG-7, SIFT, and SURF content-based retrieval techniques to address the event search problem. We have found that our proposed fusion of MPEG-7 and SURF improves on using either of those sources, or SIFT, individually, and we have also shown that how a lifelog event is modeled has a large effect on retrieval performance.
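As an illustration of what fusing two retrieval sources can look like, the following is a minimal sketch of weighted late fusion over per-event similarity scores. The abstract does not specify the fusion method used in the paper, so the min-max normalization, the weighted sum, the function names, the weight parameter, and the example scores are all assumptions made for illustration only.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; a constant score list maps to all zeros."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {event: 0.0 for event in scores}
    return {event: (s - lo) / (hi - lo) for event, s in scores.items()}


def fuse_scores(
    mpeg7: Dict[str, float],
    surf: Dict[str, float],
    weight_mpeg7: float = 0.5,  # assumed weight, not taken from the paper
) -> List[Tuple[str, float]]:
    """Combine two score lists with a weighted sum and rank the events."""
    a = min_max_normalize(mpeg7)
    b = min_max_normalize(surf)
    # An event missing from one source contributes 0 for that source.
    fused = {
        event: weight_mpeg7 * a.get(event, 0.0)
        + (1.0 - weight_mpeg7) * b.get(event, 0.0)
        for event in set(a) | set(b)
    }
    # Highest fused score first; ties broken by event id for determinism.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical similarity scores of three candidate events to a query event.
mpeg7_scores = {"event_a": 0.9, "event_b": 0.4, "event_c": 0.1}
surf_scores = {"event_a": 12.0, "event_b": 30.0, "event_c": 3.0}

ranking = fuse_scores(mpeg7_scores, surf_scores)
for event, score in ranking:
    print(f"{event}: {score:.3f}")
```

Normalizing before combining matters here because the two hypothetical sources report scores on different scales; without it, the source with the larger raw values would dominate the fused ranking.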