Mohedano, Eva, Salvador, Amaia, McGuinness, Kevin ORCID: 0000-0003-1336-6477, Giró-i-Nieto, Xavier ORCID: 0000-0002-9935-5332, O'Connor, Noel E. ORCID: 0000-0002-4033-9135 and Marqués, Ferran
(2017)
Object retrieval with deep convolutional features.
In: Hemanth, D. Jude and Estrela, Vania Vieira, (eds.)
Deep Learning for Image Processing Applications.
Advances in Parallel Computing, 31.
IOS Press Ebooks, Amsterdam, The Netherlands, pp. 137-163.
ISBN 978-1-61499-821-1
Image representations extracted from convolutional neural networks (CNNs) outperform hand-crafted features in several computer vision tasks, such as visual image retrieval. This chapter proposes a simple pipeline for encoding the local activations of a convolutional layer of a pre-trained CNN using the well-known Bag of Words (BoW) aggregation scheme, which we call the bag of local convolutional features (BLCF). Matching each local array of activations in a convolutional layer to a visual word results in an assignment map, a compact representation that relates regions of an image with a visual word. We use the assignment map for fast spatial reranking, obtaining object localizations that are then used for query expansion. We show the suitability of the BoW representation based on local CNN features for image retrieval, achieving state-of-the-art performance on the Oxford and Paris buildings benchmarks. We demonstrate that the BLCF system outperforms recent approaches based on sum pooling on a subset of the challenging TRECVid INS benchmark according to the mean Average Precision (mAP) metric.
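As a rough illustration of the encoding step described in the abstract, the sketch below quantizes each local activation of a convolutional layer against a pre-learned codebook, producing both the assignment map and the BoW histogram. This is a minimal sketch under assumed shapes and names (blcf_encode, codebook, etc.), not the chapter's own implementation.

```python
# Minimal sketch of a BoW encoding of local convolutional features,
# assuming a pre-computed codebook of K visual words (e.g. learned
# with k-means) and a conv-layer activation tensor of shape (H, W, D).
# Names and normalization choices here are illustrative assumptions.
import numpy as np

def blcf_encode(activations: np.ndarray, codebook: np.ndarray):
    """Quantize local conv features into visual words.

    activations: (H, W, D) local activations from one convolutional layer.
    codebook:    (K, D) visual-word centroids.
    Returns the (H, W) assignment map and the K-dim BoW histogram.
    """
    H, W, D = activations.shape
    feats = activations.reshape(-1, D).astype(np.float32)
    # L2-normalize each local descriptor before assignment.
    feats /= np.linalg.norm(feats, axis=1, keepdims=True) + 1e-12

    # Nearest-centroid assignment: each spatial location maps to one visual word.
    dists = ((feats[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    words = dists.argmin(axis=1)                     # (H*W,)

    assignment_map = words.reshape(H, W)             # compact spatial representation
    bow = np.bincount(words, minlength=codebook.shape[0]).astype(np.float32)
    bow /= np.linalg.norm(bow) + 1e-12               # L2-normalized BoW vector
    return assignment_map, bow
```

Because the assignment map retains the spatial layout of the visual words, it can be reused for the fast spatial reranking and query expansion mentioned in the abstract without recomputing CNN features.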
Item Type:
Book Section
Refereed:
Yes
Uncontrolled Keywords:
Information Storage and Retrieval; Content Analysis and Indexing; Image Processing and Computer Vision; Feature Representation; Convolutional Neural Networks; Deep Learning; Bag of Words