Code for paper: DivideMix: Learning with Noisy Labels as Semi-supervised Learning

reactivetype/DivideMix

DivideMix: Learning with Noisy Labels as Semi-supervised Learning

PyTorch code for the following paper:
Title: DivideMix: Learning with Noisy Labels as Semi-supervised Learning [pdf]
Authors: Junnan Li, Steven C.H. Hoi, Richard Socher
Institute: Salesforce Research

Abstract
Deep neural networks are known to be annotation-hungry. Numerous efforts have been devoted to reducing the annotation cost when learning with deep networks. Two prominent directions include learning with noisy labels and semi-supervised learning by exploiting unlabeled data. In this work, we propose DivideMix, a novel framework for learning with noisy labels by leveraging semi-supervised learning techniques. In particular, DivideMix models the per-sample loss distribution with a mixture model to dynamically divide the training data into a labeled set with clean samples and an unlabeled set with noisy samples, and trains the model on both the labeled and unlabeled data in a semi-supervised manner. To avoid confirmation bias, we simultaneously train two diverged networks where each network uses the dataset division from the other network. During the semi-supervised training phase, we improve the MixMatch strategy by performing label co-refinement and label co-guessing on labeled and unlabeled samples, respectively. Experiments on multiple benchmark datasets demonstrate substantial improvements over state-of-the-art methods.
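
The division step described in the abstract can be illustrated with a small, self-contained sketch. It is not the repository's implementation: it assumes a two-component one-dimensional Gaussian mixture fitted by EM, a 0.5 probability threshold, and synthetic loss values, and the names fit_two_gaussians and divide are hypothetical. Samples that most likely belong to the low-loss component are treated as clean (labeled); the rest are treated as noisy (unlabeled).

```python
import math
import random


def fit_two_gaussians(losses, iters=100):
    """Fit a two-component 1D Gaussian mixture with EM.

    Returns, for each loss, the posterior probability that it belongs to
    the component with the lower mean (assumed here to be the clean one).
    """
    lo, hi = min(losses), max(losses)
    means = [lo, hi]  # start the components at the two extremes
    var0 = max((hi - lo) ** 2 / 4.0, 1e-6)
    variances = [var0, var0]
    weights = [0.5, 0.5]

    def pdf(x, mean, var):
        return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

    def e_step():
        # Posterior responsibility of each component for each sample.
        resp = []
        for x in losses:
            p = [weights[k] * pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(p) or 1e-300  # guard against underflow to zero
            resp.append([p[0] / total, p[1] / total])
        return resp

    for _ in range(iters):
        resp = e_step()
        # M-step: re-estimate weight, mean and variance of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp) + 1e-12
            means[k] = sum(r[k] * x for r, x in zip(resp, losses)) / nk
            spread = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, losses))
            variances[k] = max(spread / nk, 1e-6)
            weights[k] = nk / len(losses)

    resp = e_step()
    clean = 0 if means[0] <= means[1] else 1
    return [r[clean] for r in resp]


def divide(losses, threshold=0.5):
    """Split sample indices into (labeled, unlabeled) by clean probability."""
    probs = fit_two_gaussians(losses)
    labeled = [i for i, p in enumerate(probs) if p > threshold]
    unlabeled = [i for i, p in enumerate(probs) if p <= threshold]
    return labeled, unlabeled, probs


# Synthetic per-sample losses (illustrative only): 80 low-loss samples
# followed by 20 high-loss samples.
rng = random.Random(0)
losses = [rng.gauss(0.2, 0.05) for _ in range(80)]
losses += [rng.gauss(2.0, 0.3) for _ in range(20)]

labeled, unlabeled, probs = divide(losses)
print(len(labeled), "samples treated as clean,", len(unlabeled), "as noisy")
```

In the two-network setup the abstract describes, the division computed from one network's losses would be used to train the other network, so that neither network confirms its own mistakes.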

Illustration

Experiments
First, create a folder named checkpoint to store the results.
Next, run the training script:
python Train_xx.py --data_path path-to-your-data
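
For scripted runs, the two steps above can be wrapped in a small helper. This is only a sketch: prepare_run is a hypothetical name that is not part of the repository, and Train_xx.py and path-to-your-data are the placeholders from the instructions above, left unfilled.

```python
import os
import sys


def prepare_run(script, data_path, root="."):
    """Create the checkpoint folder under `root` and build the launch command.

    Returns the checkpoint directory and the command as an argument list.
    The command is only built here, not executed.
    """
    checkpoint_dir = os.path.join(root, "checkpoint")
    os.makedirs(checkpoint_dir, exist_ok=True)  # no error if it already exists
    command = [sys.executable, script, "--data_path", data_path]
    return checkpoint_dir, command
```

Calling prepare_run("Train_xx.py", "path-to-your-data") from the repository root creates the folder; the returned command can then be passed to subprocess.run(command, check=True) to start training.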

Cite DivideMix
If you find the code useful in your research, please consider citing our paper:

@inproceedings{
li2020dividemix,
title={DivideMix: Learning with Noisy Labels as Semi-supervised Learning},
author={Junnan Li and Steven C.H. Hoi and Richard Socher},
booktitle={International Conference on Learning Representations},
year={2020},
}
