Guo, Jinlin, Foley, Colum, Gurrin, Cathal ORCID: 0000-0002-5023-4089 and Lao, Songyang (2011) Semantic concept detection in imbalanced datasets based on different under-sampling strategies. In: International Conference on Multimedia and Expo (ICME) 2011, 11-15 July 2011, Barcelona, Spain.
Abstract
Semantic concept detection is a very useful technique for developing powerful retrieval or filtering systems for multimedia data. To date, the methods for concept detection have been converging on generic classification schemes. However, classification often suffers from imbalanced-dataset or rare-class problems, which degrade the performance of many classifiers. In this paper, we adopt three “under-sampling” strategies to handle this imbalanced dataset issue in an SVM classification framework and evaluate their performance on the TRECVid 2007 dataset and additional positive samples from the TRECVid 2010 development set. Experimental results show that our well-designed “under-sampling” methods (method SAK) increase the performance of concept detection by about 9.6% overall. In cases of extreme imbalance in the collection, the proposed methods perform worse than a baseline sampling method (method SI); however, in the majority of cases, our proposed methods increase the performance of concept detection substantially. We also conclude that method SAK is a promising solution for SVM classification on datasets that are not extremely imbalanced.
Metadata
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Event Type: | Conference |
Refereed: | Yes |
Uncontrolled Keywords: | machine translation |
Subjects: | Computer Science > Machine learning; Computer Science > Information retrieval |
DCU Faculties and Centres: | UNSPECIFIED |
Published in: | Multimedia and Expo (ICME), 2011 IEEE International Conference on. IEEE. |
Publisher: | IEEE |
Copyright Information: | © 2011 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works. |
Use License: | This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License. |
ID Code: | 16487 |
Deposited On: | 07 Oct 2011 10:56 by Jinlin Guo. Last Modified 02 Nov 2018 15:38 |