DORAS | DCU Research Repository

Semantic concept detection in imbalanced datasets based on different under-sampling strategies

Guo, Jinlin, Foley, Colum, Gurrin, Cathal (ORCID: 0000-0002-5023-4089) and Lao, Songyang (2011) Semantic concept detection in imbalanced datasets based on different under-sampling strategies. In: International Conference on Multimedia and Expo (ICME) 2011, 11-15 July 2011, Barcelona, Spain.

Abstract
Semantic concept detection is a very useful technique for developing powerful retrieval or filtering systems for multimedia data. To date, the methods for concept detection have been converging on generic classification schemes. However, classification often suffers from imbalanced-dataset or rare-class problems, which degrade the performance of many classifiers. In this paper, we adopt three “under-sampling” strategies to handle this imbalanced dataset issue in an SVM classification framework and evaluate their performance on the TRECVid 2007 dataset and additional positive samples from the TRECVid 2010 development set. Experimental results show that our well-designed “under-sampling” methods (method SAK) increase the performance of concept detection by about 9.6% overall. In cases of extreme imbalance in the collection, the proposed methods perform worse than a baseline sampling method (method SI); however, in the majority of cases, our proposed methods increase the performance of concept detection substantially. We also conclude that method SAK is a promising solution for SVM classification on datasets that are not extremely imbalanced.
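To illustrate the general idea of under-sampling referred to in the abstract, the following is a minimal sketch of plain random under-sampling of the negative (majority) class before classifier training. It is not the paper's SI or SAK method, which the abstract does not describe; the function name `random_undersample` and its `ratio` and `seed` parameters are hypothetical and chosen for illustration only.

```python
import random
from typing import List, Sequence, Tuple


def random_undersample(
    features: Sequence[Sequence[float]],
    labels: Sequence[int],
    ratio: float = 1.0,
    seed: int = 0,
) -> Tuple[List[Sequence[float]], List[int]]:
    """Keep every positive sample and a random subset of the negatives.

    `ratio` is the number of negatives kept per positive (hypothetical
    parameter; generic random under-sampling, not the paper's methods).
    """
    rng = random.Random(seed)
    # Indices of the rare (positive, label 1) and common (negative) classes.
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y != 1]
    # Never ask for more negatives than exist.
    n_keep = min(len(neg), int(ratio * len(pos)))
    kept = sorted(pos + rng.sample(neg, n_keep))
    return [features[i] for i in kept], [labels[i] for i in kept]


# Toy imbalanced set: 5 positives, 95 negatives.
X = [[float(i)] for i in range(100)]
y = [1] * 5 + [0] * 95
X_bal, y_bal = random_undersample(X, y, ratio=1.0)
print(len(y_bal), sum(y_bal))
```

The balanced subset could then be passed to any classifier, such as an SVM, in place of the full training set. Discarding negatives trades information for balance, which is one plausible reason such strategies may behave differently under extreme imbalance.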
Metadata
Item Type: Conference or Workshop Item (Paper)
Event Type: Conference
Refereed: Yes
Uncontrolled Keywords: machine translation
Subjects: Computer Science > Machine learning; Computer Science > Information retrieval
DCU Faculties and Centres: UNSPECIFIED
Published in: Multimedia and Expo (ICME), 2011 IEEE International Conference on. IEEE.
Publisher: IEEE
Copyright Information: © 2011 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.
Use License: This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
ID Code: 16487
Deposited On: 07 Oct 2011 10:56 by Jinlin Guo. Last Modified: 02 Nov 2018 15:38
Documents

Full text available as:

390.pdf (PDF, 653kB)