Training on noisy Flickr labels, collected without annotators, for better classification performance on the ImageNet classification problem.

MLI-lab/imagenet_candidate_training

Flickr Image Recognition

This repository provides code for reproducing the figures in the paper:

"Image recognition from noisy labels collected without annotators", by Fatih Furkan Yilmaz and Reinhard Heckel.

Downloading the dataset

The original ImageNet dataset can be downloaded from the official ILSVRC2012 source or from the Kaggle mirror.

The Flickr candidate dataset is currently not hosted online due to size constraints. However, it can be reconstructed by passing this file as the positional argument to the code for searching and downloading Flickr images.

Training on the candidate dataset

The results in the paper can be reproduced from the included log files. They can also be reproduced by training on the candidate dataset with a wide range of hyperparameters.

In order to train on either the candidate dataset constructed from Flickr or the original ImageNet, the training script can be run with the following parameters:

  • root: root directory (pathlib) where the training set is stored.
  • main: name of the folder where either the original ImageNet training set or the candidate dataset is stored.
  • test: name of the folder where the ImageNet validation set is stored.

The rest of the available CLI parameters can be queried by running python train.py --help.
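The relationship between the three parameters above can be sketched as follows. This is a minimal illustration, not the repository's code: it assumes that `main` and `test` are folder names located directly under `root`, and the helper names and example folder names are hypothetical.

```python
from pathlib import Path


def resolve_dataset_dirs(root, main, test):
    """Join the root directory with the training and test folder names.

    Assumption: both folders live directly under `root`.
    """
    root = Path(root).expanduser()
    train_dir = root / main
    test_dir = root / test
    return train_dir, test_dir


def missing_dirs(*dirs):
    """Return the directories that do not exist, to fail early before training."""
    return [d for d in dirs if not Path(d).is_dir()]


# Hypothetical folder names, chosen only for illustration.
train_dir, test_dir = resolve_dataset_dirs("/data", "flickr_candidates", "imagenet_val")
print(train_dir, test_dir)
```

Checking `missing_dirs(train_dir, test_dir)` before starting a long run is a cheap way to catch a mistyped folder name.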

It is also possible to train the model with the exact setup used to generate the figures in the paper by using the provided configuration files. For example, the results of the main figure for the candidate dataset can be reproduced from scratch by running:

python train.py --config "./results/flickr_cls100/config.json"
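For readers unfamiliar with configuration-file driven training, the sketch below shows one common way a `--config` option can be combined with `argparse`: values from the JSON file replace the parser defaults, while flags given explicitly on the command line still take precedence. It is an assumption about the general pattern, not a description of how `train.py` is implemented; the `--lr` and `--epochs` options and their defaults are hypothetical.

```python
import argparse
import json


def build_parser():
    parser = argparse.ArgumentParser(description="Hypothetical training CLI")
    parser.add_argument("--config", default=None, help="Path to a JSON file of settings")
    parser.add_argument("--lr", type=float, default=0.1)    # hypothetical option
    parser.add_argument("--epochs", type=int, default=90)   # hypothetical option
    return parser


def parse_with_config(argv=None):
    """Parse CLI arguments, using a JSON config file (if given) as the defaults."""
    parser = build_parser()
    # First pass: only find out whether a config file was supplied.
    known, _ = parser.parse_known_args(argv)
    if known.config is not None:
        with open(known.config) as f:
            # Values from the file become the new defaults.
            parser.set_defaults(**json.load(f))
    # Second pass: explicit command-line flags override the file.
    return parser.parse_args(argv)
```

With this pattern, a stored configuration reproduces a run exactly, and a single setting can still be changed from the command line without editing the file.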

Visualizing training results

The code for plotting the training results from the log files is given in the visualize_train_logs notebook.

This notebook contains the code for reproducing Figure 2 and Figure 7 (Appendix) from the paper.

Error Analysis

The code for the analysis of the test classification error for the 135-class problem is given in the error_analysis notebook.

This notebook contains the code for reproducing Figure 5-a, 5-b and Figure 8 (Appendix) from the paper.
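As a rough illustration of what a test-error analysis involves, the sketch below computes the per-class error rate and the most frequent confusions from lists of true and predicted labels. It is a generic example under the assumption that labels are available as integer class indices; it is not taken from the error_analysis notebook, and the function names are hypothetical.

```python
from collections import Counter


def per_class_error(labels, predictions):
    """Fraction of misclassified test examples for each true class."""
    totals = Counter(labels)
    wrong = Counter(y for y, p in zip(labels, predictions) if y != p)
    return {c: wrong[c] / totals[c] for c in totals}


def top_confusions(labels, predictions, k=3):
    """Most frequent (true class, predicted class) pairs among the errors."""
    pairs = Counter((y, p) for y, p in zip(labels, predictions) if y != p)
    return pairs.most_common(k)


# Toy data for illustration only.
labels = [0, 0, 1, 1, 2]
predictions = [0, 1, 1, 1, 0]
print(per_class_error(labels, predictions))
print(top_confusions(labels, predictions))
```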

Source code

Acknowledgements for the PyTorch implementation are as follows:

Training code
