Exploring the optimal visual vocabulary sizes for semantic concept detection

Guo, Jinlin, Qiu, Zhengwei and Gurrin, Cathal ORCID: 0000-0003-4395-7702 (2013) Exploring the optimal visual vocabulary sizes for semantic concept detection. In: 11th International Workshop on Content-Based Multimedia Indexing, 17-19 June 2013, Veszprém, Hungary.

Abstract

The framework that combines the Bag-of-Visual-Words (BoVW) feature representation with SVM classification is widely used for generic content-based concept detection and visual categorization. However, the visual vocabulary (VV) size, an important factor in this framework, has been chosen differently and often arbitrarily in previous work. In this paper, we investigate how the optimal VV size depends on the other components of the framework that also govern performance. The findings are useful for choosing a default VV size that reduces computation cost. Using unsupervised clustering, we build a series of VVs covering a wide range of sizes and evaluate them with two popular local features, three assignment modes, and four kernels on two benchmark datasets of different scales. These factors are themselves evaluated as well. Experimental results show that the best VV size varies as these factors change. However, concept detection performance usually improves at first as the VV size increases, then gains less, or even deteriorates with larger VVs because overfitting occurs. Overall, VVs with sizes from 1024 to 4096 are more likely to achieve the best performance than VVs of other sizes. Regarding the other factors, the OpponentSIFT descriptor outperforms the SURF feature, and soft assignment yields better performance than binary and hard assignment. In addition, generalized RBF kernels such as the Chi-square and Laplace RBF kernels are more appropriate for semantic concept detection with SVM classification.
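
The assignment modes and generalized RBF kernels named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the Gaussian weighting used for soft assignment, the reading of "binary" as word presence or absence, and the exact kernel forms (including the `gamma` and `sigma` parameters) are all assumptions made for illustration; the paper may define them differently.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def bovw_histogram(descriptors, vocabulary, mode="hard", sigma=1.0):
    """Build a BoVW vector of length len(vocabulary) for one image.

    mode="hard":   each descriptor votes for its nearest visual word.
    mode="binary": 1 if a word is the nearest word of any descriptor, else 0
                   (assumed meaning of "binary assignment").
    mode="soft":   each descriptor spreads one unit of weight over all words
                   using Gaussian weights (assumed weighting scheme).
    """
    if mode not in ("hard", "binary", "soft"):
        raise ValueError("unknown assignment mode: " + mode)
    hist = [0.0] * len(vocabulary)
    for d in descriptors:
        d2 = [sq_dist(d, w) for w in vocabulary]
        if mode == "soft":
            # Subtracting the minimum keeps exp() stable; it cancels
            # out once the weights are normalized.
            m = min(d2)
            weights = [math.exp(-(v - m) / (2.0 * sigma ** 2)) for v in d2]
            total = sum(weights)
            for j, wj in enumerate(weights):
                hist[j] += wj / total
        else:
            hist[d2.index(min(d2))] += 1.0
    if mode == "binary":
        return [1.0 if h > 0 else 0.0 for h in hist]
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist


def chi2_rbf(x, y, gamma=1.0):
    """Chi-square RBF kernel, assumed form exp(-gamma * sum((x-y)^2 / (x+y)))."""
    # Bins where both histograms are zero contribute nothing.
    dist = sum((a - b) ** 2 / (a + b) for a, b in zip(x, y) if a + b > 0)
    return math.exp(-gamma * dist)


def laplacian_rbf(x, y, gamma=1.0):
    """Laplacian RBF kernel, assumed form exp(-gamma * sum(|x-y|))."""
    return math.exp(-gamma * sum(abs(a - b) for a, b in zip(x, y)))
```

In a pipeline of the kind the abstract describes, `vocabulary` would hold the cluster centres produced by unsupervised clustering of local descriptors, and the kernel values between image histograms would be passed to an SVM as a precomputed kernel matrix.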

Item Type:	Conference or Workshop Item (Speech)
Event Type:	Workshop
Refereed:	Yes
Subjects:	Computer Science > Machine learning; Computer Science > Information retrieval; Computer Science > Digital video; Computer Science > Algorithms; Computer Science > Image processing
DCU Faculties and Centres:	DCU Faculties and Schools > Faculty of Engineering and Computing > School of Electronic Engineering; DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing
Use License:	This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
ID Code:	18078
Deposited On:	17 Jun 2013 10:01 by Jinlin Guo. Last Modified 02 Nov 2018 15:31

Full text available as: PDF (180kB). Creative Commons: Attribution 3.0

DORAS | DCU Research Repository