ResNeXt: Aggregated Residual Transformations for Deep Neural Networks

By Saining Xie, Ross Girshick, Piotr Dollár, Zhuowen Tu, Kaiming He

UC San Diego, Facebook AI Research

Congrats to the ILSVRC 2017 classification challenge winner WMW. ResNeXt is the foundation of their new SENet architecture (a ResNeXt-152 (64 x 4d) with the Squeeze-and-Excitation module)!
Check out Figure 6 in the new Memory-Efficient Implementation of DenseNets paper for a comparison between ResNeXts and DenseNets. (DenseNet cosine is DenseNet trained with a cosine learning rate schedule.)

Introduction

This repository contains a Torch implementation of the ResNeXt algorithm for image classification. The code is based on fb.resnet.torch.

ResNeXt is a simple, highly modularized network architecture for image classification. Our network is constructed by repeating a building block that aggregates a set of transformations with the same topology. Our simple design results in a homogeneous, multi-branch architecture that has only a few hyper-parameters to set. This strategy exposes a new dimension, which we call “cardinality” (the size of the set of transformations), as an essential factor in addition to the dimensions of depth and width.
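As a rough illustration of the aggregation described above, the block output can be viewed as the input plus the sum of the outputs of a set of same-topology branches, where the number of branches is the cardinality. The sketch below uses plain Python lists and a toy branch function. `aggregated_residual` and `double` are hypothetical names made up for this example; the real branches are bottleneck convolution stacks, which this sketch does not implement.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Branch = Callable[[Sequence[float]], Vector]


def aggregated_residual(x: Sequence[float], branches: Sequence[Branch]) -> Vector:
    """Return x plus the sum of branch(x) over all branches.

    The cardinality is len(branches).
    """
    total = [0.0] * len(x)
    for branch in branches:
        out = branch(x)
        total = [t + o for t, o in zip(total, out)]
    # Residual (shortcut) connection: add the input back.
    return [xi + ti for xi, ti in zip(x, total)]


def double(v: Sequence[float]) -> Vector:
    """Toy stand-in for one transformation; every branch shares this topology."""
    return [2 * e for e in v]


cardinality = 3
y = aggregated_residual([1.0, 2.0], [double] * cardinality)
print(y)
```

Changing `cardinality` changes how many branches are summed without touching the depth or the width of any single branch.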

Figure: Training curves on ImageNet-1K. (Left): ResNet/ResNeXt-50 with the same complexity (~4.1 billion FLOPs, ~25 million parameters); (Right): ResNet/ResNeXt-101 with the same complexity (~7.8 billion FLOPs, ~44 million parameters).

Citation

If you use ResNeXt in your research, please cite the paper:

@article{Xie2016,
  title={Aggregated Residual Transformations for Deep Neural Networks},
  author={Saining Xie and Ross Girshick and Piotr Dollár and Zhuowen Tu and Kaiming He},
  journal={arXiv preprint arXiv:1611.05431},
  year={2016}
}

Requirements and Dependencies

See the fb.resnet.torch installation instructions for a step-by-step guide.

Install Torch on a machine with a CUDA GPU
Install cuDNN v4 or v5 and the Torch cuDNN bindings
Download the ImageNet dataset and move validation images to labeled subfolders

Training

Please follow fb.resnet.torch for the general usage of the code, including how to use pretrained ResNeXt models for your own task.

Two new hyperparameters must be specified to determine the bottleneck template:

-baseWidth and -cardinality

1x Complexity Configurations Reference Table

baseWidth	cardinality
64	1
40	2
24	4
14	8
4	32
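The pairs in this table keep the cost of a bottleneck block roughly constant. A small sketch of the parameter count for one bottleneck template with 256 input and output channels, using the approximation from the paper, C · (256·d + 3·3·d·d + d·256), which ignores batch normalization and biases. The function name `bottleneck_params` is made up for this example and is not part of this repository.

```python
def bottleneck_params(cardinality: int, base_width: int, channels: int = 256) -> int:
    """Approximate weights in one bottleneck block (1x1 -> 3x3 -> 1x1 per branch)."""
    d = base_width
    per_branch = channels * d + 3 * 3 * d * d + d * channels
    return cardinality * per_branch


# (cardinality, baseWidth) pairs from the reference table above.
configs = [(1, 64), (2, 40), (4, 24), (8, 14), (32, 4)]
for c, d in configs:
    print(f"cardinality={c:2d} baseWidth={d:2d} params={bottleneck_params(c, d)}")
```

All five settings come out at roughly 70k parameters per block, which is why they are listed together as 1x complexity configurations.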

To train ResNeXt-50 (32x4d) on 8 GPUs for ImageNet:

th main.lua -dataset imagenet -bottleneckType resnext_C -depth 50 -baseWidth 4 -cardinality 32 -batchSize 256 -nGPU 8 -nThreads 8 -shareGradInput true -data [imagenet-folder]

To reproduce CIFAR results (e.g. ResNeXt 16x64d for cifar10) on 8 GPUs:

th main.lua -dataset cifar10 -bottleneckType resnext_C -depth 29 -baseWidth 64 -cardinality 16 -weightDecay 5e-4 -batchSize 128 -nGPU 8 -nThreads 8 -shareGradInput true

To get comparable results using 2/4 GPUs, you should change the batch size and the corresponding learning rate:

th main.lua -dataset cifar10 -bottleneckType resnext_C -depth 29 -baseWidth 64 -cardinality 16 -weightDecay 5e-4 -batchSize 64 -nGPU 4 -LR 0.05 -nThreads 8 -shareGradInput true
th main.lua -dataset cifar10 -bottleneckType resnext_C -depth 29 -baseWidth 64 -cardinality 16 -weightDecay 5e-4 -batchSize 32 -nGPU 2 -LR 0.025 -nThreads 8 -shareGradInput true
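The two commands above appear to follow a linear scaling rule: the per-GPU batch stays at 16 images and the learning rate shrinks in proportion to the total batch size. A minimal sketch, assuming the 8-GPU run uses a learning rate of 0.1 at batch size 128 (inferred from the values above, not stated in this README). `scaled_lr` is a hypothetical helper, not part of main.lua.

```python
def scaled_lr(batch_size: int, base_lr: float = 0.1, base_batch: int = 128) -> float:
    """Scale the learning rate linearly with the total batch size.

    base_lr and base_batch are assumptions inferred from the commands above.
    """
    return base_lr * batch_size / base_batch


# Keep 16 images per GPU and derive the matching flags.
for n_gpu in (8, 4, 2):
    batch = 16 * n_gpu
    print(f"-nGPU {n_gpu} -batchSize {batch} -LR {scaled_lr(batch):g}")
```

For other GPU counts the same rule gives a starting point only; the README reports results for the 8, 4, and 2 GPU settings shown above.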

Note: CIFAR datasets are downloaded and processed automatically on first use. The CIFAR results in the arXiv paper are based on pre-activated bottleneck blocks and a batch size of 256. We found that better CIFAR test accuracy can be achieved using original bottleneck blocks and a batch size of 128.

ImageNet Pretrained Models

ImageNet pretrained models are licensed under CC BY-NC 4.0.

Single-crop (224x224) validation error rate

Network	GFLOPS	Top-1 Error	Download
ResNet-50 (1x64d)	~4.1	23.9	Original ResNet-50
ResNeXt-50 (32x4d)	~4.1	22.2	Download (191MB)
ResNet-101 (1x64d)	~7.8	22.0	Original ResNet-101
ResNeXt-101 (32x4d)	~7.8	21.2	Download (338MB)
ResNeXt-101 (64x4d)	~15.6	20.4	Download (638MB)

Third-party re-implementations

Besides our Torch implementation, we also recommend the following third-party re-implementations and extensions:

Training code in PyTorch (code)
Converting ImageNet pretrained model to PyTorch model and source (code)
Training code in MXNet and pretrained ImageNet models (code)
Caffe prototxt, pretrained ImageNet models (with ResNeXt-152), curves (code, code)
