This repository contains the code for the CNN experiments presented in the paper, along with additional functionality. The codebase builds on the STR codebase, modified to support STR-BN.
- Clone this repository.
- Using Python 3.6, create a virtual environment with `python -m venv myenv` and activate it with `source myenv/bin/activate`. You can also use `conda` to create a virtual environment.
- Install requirements with `pip install -r requirements.txt` for `venv`, or with the appropriate `conda` commands for a `conda` environment.
- Create a data directory `<data-dir>`. To run the ImageNet experiments, there must be a folder `<data-dir>/imagenet` containing the ImageNet `train` and `val` splits.
Users can take `STR-BN` and use it in most PyTorch-based models, as it inherits from `nn.BatchNorm2d` (also referred to here as `LearnedBatchNorm`). The hyperparameters of `STR-BN`, which include the `sparseFunction`, have not been explored thoroughly enough to provide users with default settings. This is experimental code and contributions are welcome.
This codebase contains model architectures for ResNet18, ResNet50, and MobileNetV1, with support for training them on ImageNet-1K. We have provided config files for training ResNet50 and MobileNetV1, which can be modified for other architectures and datasets. To support more datasets, please add new dataloaders to the `data` folder.
Training across multiple GPUs is supported; however, the user should check the minimum number of GPUs required to scale to ImageNet-1K.

Train dense models on ImageNet-1K:

ResNet50: `python main.py --config configs/largescale/resnet50-dense.yaml --multigpu 0,1,2,3`

MobileNetV1: `python main.py --config configs/largescale/mobilenetv1-dense.yaml --multigpu 0,1,2,3`
Train models with STR-BN on ImageNet-1K:

ResNet50: `python main.py --config configs/largescale/resnet50-str-bn.yaml --multigpu 0,1,2,3`

MobileNetV1: `python main.py --config configs/largescale/mobilenetv1-str-bn.yaml --multigpu 0,1,2,3`
The user can explore and search for the right hyperparameters of `STR-BN` through the config files in `configs`.
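For a sense of what such a sweep might touch, the fragment below sketches STR-BN-related settings in a config. The key names here are hypothetical stand-ins; match them against the fields actually used in the provided `configs/largescale` YAML files before editing:

```yaml
# Hypothetical STR-BN config fragment; real key names may differ.
sInit_value: -200        # initial value of the learned threshold parameter s
sparse_function: sigmoid # g(s) used inside the soft thresholding
lr: 0.1
weight_decay: 0.0001
```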
The `budgets` folder contains CSV files with all the non-uniform sparsity budgets STR learnt for ResNet50 on ImageNet-1K across all sparsity regimes, along with baseline budgets for 90% sparse ResNet50 on ImageNet-1K. If you are not able to use the pretrained models to extract sparsity budgets, you can import the same budgets directly from these files. Structured sparsity methods that take a layer-wise sparsity budget could potentially utilize these budgets, learnt through STR for unstructured sparsity.
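As a minimal sketch of importing such a budget, assuming each CSV row maps a layer name to its learned sparsity (the actual column layout of the files in `budgets` may differ, so check them first):

```python
import csv
import io

def load_budget(csv_text):
    """Parse a layer-wise sparsity budget from CSV text into a dict.
    Assumed (hypothetical) format: one 'layer_name,sparsity' row per layer."""
    budget = {}
    for layer, sparsity in csv.reader(io.StringIO(csv_text)):
        budget[layer] = float(sparsity)
    return budget

# Hypothetical two-layer budget in the assumed format.
sample = "layer1.conv1,0.85\nlayer1.conv2,0.92"
print(load_budget(sample))  # {'layer1.conv1': 0.85, 'layer1.conv2': 0.92}
```

The resulting dict can then be handed to whatever pruning routine consumes per-layer sparsity targets.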
If you find this project useful in your research, please consider citing:

```
@article{Kusupati20a,
  author  = {Kusupati, Aditya},
  title   = {Adapting Unstructured Sparsity Techniques for Structured Sparsity},
  journal = {Technical Report},
  year    = {2020},
}
```