DORAS | DCU Research Repository

Addressing out-of-distribution label noise in webly-labelled data

Albert, Paul, Ortego, Diego (ORCID: 0000-0002-1011-3610), Arazo, Eric, O'Connor, Noel E. (ORCID: 0000-0002-4033-9135) and McGuinness, Kevin (ORCID: 0000-0003-1336-6477) (2022) Addressing out-of-distribution label noise in webly-labelled data. In: Winter Conference on Applications of Computer Vision (WACV), 4-8 Jan 2022, Kona, Hawaii.

Abstract
A recurring focus of the deep learning community is towards reducing the labeling effort. Data gathering and annotation using a search engine is a simple alternative to generating a fully human-annotated and human-gathered dataset. Although web crawling is very time efficient, some of the retrieved images are unavoidably noisy, i.e. incorrectly labeled. Designing robust algorithms for training on noisy data gathered from the web is an important research perspective that would render the building of datasets easier. In this paper we conduct a study to understand the type of label noise to expect when building a dataset using a search engine. We review the current limitations of state-of-the-art methods for dealing with noisy labels for image classification tasks in the case of web noise distribution. We propose a simple solution to bridge the gap with a fully clean dataset using Dynamic Softening of Out-of-distribution Samples (DSOS), which we design on corrupted versions of the CIFAR-100 dataset, and compare against state-of-the-art algorithms on the web noise perturbated MiniImageNet and Stanford datasets and on real label noise datasets: WebVision 1.0 and Clothing1M. Our work is fully reproducible https://git.io/JKGcj.
Metadata
Item Type: Conference or Workshop Item (Paper)
Event Type: Conference
Refereed: Yes
Uncontrolled Keywords: Computer Vision; Web crawled dataset; Webvision; Neural networks; Out-of-distribution images; Noisy datasets
Subjects: Computer Science > Image processing; Computer Science > Machine learning
DCU Faculties and Centres: DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; DCU Faculties and Schools > Faculty of Engineering and Computing > School of Electronic Engineering
Published in: 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). IEEE.
Publisher: IEEE
Official URL: https://dx.doi.org/10.1109/WACV51458.2022.00245
Copyright Information: © 2022 The Authors
Use License: This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
ID Code: 26405
Deposited On: 06 Jan 2022 17:16 by Paul Albert. Last Modified 25 Apr 2022 14:32
Documents

Full text available as:

PDF: 2110.13699.pdf (4MB)