The PyTorch implementation of Att-DARTS: Differentiable Neural Architecture Search for Attention.
The code is based on https://github.com/dragen1860/DARTS-PyTorch.
- Python == 3.7
- PyTorch == 1.0.1
- torchvision == 0.2.2
- pillow == 6.2.1
- numpy
- graphviz
- requests
- tqdm
We recommend downloading PyTorch from here.
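To confirm that the environment matches the versions listed above, a minimal check (not part of the repository) is:

```python
# Minimal environment check against the versions listed above.
import PIL
import numpy
import torch
import torchvision

print('torch', torch.__version__)              # expected 1.0.1
print('torchvision', torchvision.__version__)  # expected 0.2.2
print('pillow', PIL.__version__)               # expected 6.2.1
print('numpy', numpy.__version__)
```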
- CIFAR-10/100: automatically downloaded by torchvision to the `data` folder.
- ImageNet (ILSVRC2012 version): manually downloaded following the instructions here.
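For reference, the automatic CIFAR download goes through torchvision's dataset classes; a minimal sketch (the training scripts already do this for you, and the `data` root below follows the note above):

```python
# Minimal sketch of the automatic CIFAR-10 download into the `data` folder.
from torchvision import datasets

train_set = datasets.CIFAR10(root='data', train=True, download=True)
test_set = datasets.CIFAR10(root='data', train=False, download=True)
print(len(train_set), len(test_set))  # 50000 training and 10000 test images
```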
Test error (%) on CIFAR:

| | CIFAR-10 | CIFAR-100 | Params (M) |
|---|---|---|---|
| DARTS | 2.76 ± 0.09 | 16.69 ± 0.28 | 3.3 |
| Att-DARTS | 2.54 ± 0.10 | 16.54 ± 0.40 | 3.2 |
Test error (%) on ImageNet:

| | top-1 | top-5 | Params (M) |
|---|---|---|---|
| DARTS | 26.7 | 8.7 | 4.7 |
| Att-DARTS | 26.0 | 8.5 | 4.6 |
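For reference, the Params (M) column counts the model's trainable parameters in millions; a generic helper (not part of the repository) that computes such a count for any PyTorch model is sketched below. Whether the reported figures include the auxiliary head is not specified here.

```python
import torch.nn as nn

def count_parameters_m(model: nn.Module) -> float:
    """Return the number of trainable parameters in millions."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad) / 1e6
```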
Our scripts occupy all available GPUs, so please set the environment variable `CUDA_VISIBLE_DEVICES` to the GPUs you want to use.
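For example, the scripts can be restricted to a single GPU by exporting the variable on the shell before launching them; the equivalent from a Python wrapper (the GPU index 0 below is just an example) is:

```python
# Minimal sketch: expose only GPU 0 to PyTorch.
# The variable must be set before CUDA is initialized, i.e. before the training script imports torch.
import os

os.environ['CUDA_VISIBLE_DEVICES'] = '0'
```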
To carry out architecture search using the 2nd-order approximation, run:

```
python train_search.py --unrolled
```
The found cell will be saved in `genotype.json`.
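For a quick look at the result, the saved cell can be opened as ordinary JSON; the exact schema follows the repository's Genotype definition, so treat the structure as an assumption:

```python
# Minimal sketch: inspect the cell saved by train_search.py.
import json

with open('genotype.json') as f:
    genotype = json.load(f)
print(genotype)
```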
Our resulting architecture `Att_DARTS` is written in `genotypes.py`.
Inserting an attention module at other locations is supported through the `--location` flag; the available locations are defined in `AttLocation` in `model_search.py`.
To evaluate our best cells by training from scratch, run:

```
python train_CIFAR10.py --auxiliary --cutout --arch Att_DARTS # CIFAR-10
python train_CIFAR100.py --auxiliary --cutout --arch Att_DARTS # CIFAR-100
python train_ImageNet.py --auxiliary --arch Att_DARTS # ImageNet
```
Customized architectures are supported through the `--arch` flag once they are specified in `genotypes.py` (a sketch of registering one follows the commands below).
You can also designate a search result saved in `.json` through the `--arch_path` flag:
```
python train_CIFAR10.py --auxiliary --cutout --arch_path ${PATH} # CIFAR-10
python train_CIFAR100.py --auxiliary --cutout --arch_path ${PATH} # CIFAR-100
python train_ImageNet.py --auxiliary --arch_path ${PATH} # ImageNet
```
where `${PATH}` should be replaced by the path to the `.json` file.
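As a sketch of how a customized architecture could be registered in `genotypes.py` for use with `--arch`, the fields below follow the original DARTS Genotype convention; Att-DARTS may extend it with attention entries, so mirror the structure of `Att_DARTS` in `genotypes.py` rather than this example:

```python
# Hedged sketch: a hypothetical entry in genotypes.py, selectable via `--arch MY_ARCH`.
# Field names and operations follow the original DARTS convention and are assumptions here.
from collections import namedtuple

Genotype = namedtuple('Genotype', 'normal normal_concat reduce reduce_concat')

MY_ARCH = Genotype(
    normal=[('sep_conv_3x3', 0), ('sep_conv_3x3', 1),
            ('sep_conv_3x3', 0), ('sep_conv_3x3', 1),
            ('skip_connect', 0), ('sep_conv_3x3', 1),
            ('skip_connect', 0), ('dil_conv_3x3', 2)],
    normal_concat=[2, 3, 4, 5],
    reduce=[('max_pool_3x3', 0), ('max_pool_3x3', 1),
            ('skip_connect', 2), ('max_pool_3x3', 1),
            ('max_pool_3x3', 0), ('skip_connect', 2),
            ('skip_connect', 2), ('max_pool_3x3', 1)],
    reduce_concat=[2, 3, 4, 5],
)
```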
The trained model is saved in `trained.pt`.
After training finishes, the test script runs automatically.
You can also test `trained.pt` at any time, as described below.
To test a pretrained model saved as `.pt`, run:
```
python test_CIFAR10.py --auxiliary --model_path ${PATH} --arch Att_DARTS # CIFAR-10
python test_CIFAR100.py --auxiliary --model_path ${PATH} --arch Att_DARTS # CIFAR-100
python test_imagenet.py --auxiliary --model_path ${PATH} --arch Att_DARTS # ImageNet
```
where `${PATH}` should be replaced by the path to the `.pt` file.
You can designate our pretrained models (`cifar10_att.pt`, `cifar100_att.pt`, `imagenet_att.pt`) or the `trained.pt` saved during the architecture evaluation above.
Customized architectures specified in `genotypes.py` are also supported through the `--arch` flag, as are search results saved in `.json` through the `--arch_path` flag.
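To sanity-check a checkpoint before running the test scripts, it can be opened with `torch.load`; whether the file stores a state_dict or a pickled model object is an assumption here (the test scripts handle loading through `--model_path`):

```python
# Minimal sketch: inspect a saved checkpoint such as trained.pt or cifar10_att.pt.
import torch

checkpoint = torch.load('trained.pt', map_location='cpu')
if isinstance(checkpoint, dict):
    print(list(checkpoint.keys())[:10])  # parameter names if it is a state_dict
else:
    print(type(checkpoint))              # otherwise a full model object
```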
You can visualize the found cells in `genotypes.py`.
For example, you can visualize `Att_DARTS` by running:

```
python visualize.py Att_DARTS
```

You can also visualize a cell saved in `.json`:

```
python visualize.py genotype.json
```
This repository includes the following attentions:
- Squeeze-and-Excitation (paper / code (unofficial))
- Gather-Excite (paper / code (unofficial))
- BAM (paper / code)
- CBAM (paper / code)
- A2-Nets (paper / code (unofficial))
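As an illustration of the kind of module listed above, a minimal Squeeze-and-Excitation block is sketched below; the implementation actually used in this repository may differ in details such as the reduction ratio:

```python
import torch.nn as nn

class SEBlock(nn.Module):
    """Minimal Squeeze-and-Excitation block (Hu et al., 2018); details here are illustrative."""

    def __init__(self, channels, reduction=16):
        super(SEBlock, self).__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # squeeze: global average over the spatial dims
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                    # excitation: per-channel gate in (0, 1)
        )

    def forward(self, x):
        b, c, _, _ = x.size()
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                         # reweight the input channels
```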
```
@inproceedings{att-darts2020IJCNN,
  author    = {Nakai, Kohei and Matsubara, Takashi and Uehara, Kuniaki},
  booktitle = {The International Joint Conference on Neural Networks (IJCNN)},
  title     = {{Att-DARTS: Differentiable Neural Architecture Search for Attention}},
  year      = {2020}
}
```