This is a PyTorch implementation of Lan et al., "Knowledge Distillation by On-the-Fly Native Ensemble" (ONE), NeurIPS 2018. You may refer to our Video and Poster for a quick overview.
- Datasets: CIFAR100, CIFAR10
- Python 2.7
- PyTorch 0.2.0
You may need to change the GPU ID in the scripts via `--gpu-id`; the default is 0.
For example, to train the ONE model using ResNet-32 or ResNet-110 on CIFAR100, run the following scripts:
```
bash scripts/ONE_ResNet32.sh
bash scripts/ONE_ResNet110.sh
```
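For reference, ONE trains several branches on a shared backbone and distills a gated ensemble of their logits back into each branch. The sketch below is illustrative only: the function name, branch handling, and temperature are assumptions, and it is written against a modern PyTorch API rather than the 0.2.0 release used by this repo.

```python
import torch
import torch.nn.functional as F

def one_loss(branch_logits, ensemble_logits, targets, T=3.0):
    """Illustrative ONE objective: per-branch CE + CE on the gated
    ensemble + temperature-softened KL from the ensemble (teacher)
    to each branch (student).

    branch_logits:   list of [batch, classes] tensors, one per branch
    ensemble_logits: [batch, classes] gated combination of the branches
    T:               distillation temperature; T**2 rescales the KL term
    """
    ce = sum(F.cross_entropy(logits, targets) for logits in branch_logits)
    ce = ce + F.cross_entropy(ensemble_logits, targets)
    # The teacher is detached so distillation gradients flow only to the branches.
    teacher = F.softmax(ensemble_logits.detach() / T, dim=1)
    kl = sum(F.kl_div(F.log_softmax(logits / T, dim=1), teacher,
                      reduction='batchmean')
             for logits in branch_logits)
    return ce + (T ** 2) * kl

# Illustrative use: logits from three ONE branches plus their gated combination.
# loss = one_loss([b1, b2, b3], gated, targets)
```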
To train the baseline model using ResNet-32 or ResNet-110 on CIFAR100, run the following scripts:
```
bash scripts/Baseline_ResNet32.sh
bash scripts/Baseline_ResNet110.sh
```
It may help to [ramp up](https://arxiv.org/abs/1703.01780) the KL cost over the first few epochs, until the teacher (ensemble) branch starts giving good predictions.
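One common choice is the sigmoid ramp-up from the linked paper. The sketch below is a minimal example of that schedule; `rampup_epochs` and `max_kl_weight` are hypothetical knobs for illustration, not flags provided by this repo.

```python
import math

def sigmoid_rampup(epoch, rampup_epochs=10):
    """Sigmoid ramp-up schedule (https://arxiv.org/abs/1703.01780):
    rises smoothly from ~0 to 1 over the first `rampup_epochs` epochs."""
    if rampup_epochs == 0:
        return 1.0
    t = max(0.0, min(float(epoch), float(rampup_epochs))) / rampup_epochs
    return math.exp(-5.0 * (1.0 - t) ** 2)

# Illustrative use inside the training loop:
# kl_weight = sigmoid_rampup(epoch) * max_kl_weight
# loss = ce_loss + kl_weight * kl_loss
```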
Please cite the following paper if this repository is useful for your research:
```
@inproceedings{lan2018knowledge,
  title={Knowledge Distillation by On-the-Fly Native Ensemble},
  author={Lan, Xu and Zhu, Xiatian and Gong, Shaogang},
  booktitle={Advances in Neural Information Processing Systems},
  pages={7527--7537},
  year={2018}
}
```
This project is licensed under the MIT License - see the LICENSE.md file for details.
This repository is partially built upon the bearpaw/pytorch-classification repository.