[Fix] Fix commands in README to adapt branch 1.x #400

Merged: 2 commits, Dec 16, 2022
26 changes: 17 additions & 9 deletions configs/nas/mmcls/autoformer/README.md
@@ -21,23 +21,31 @@ the performance on downstream benchmarks and distillation experiments.

![pipeline](/docs/en/imgs/model_zoo/autoformer/pipeline.png)

-## Introduction
+## Get Started

-### Supernet pre-training on ImageNet
+### Step 1: Supernet pre-training on ImageNet

```bash
-python ./tools/train.py \
-configs/nas/mmcls/autoformer/autoformer_supernet_32xb256_in1k.py \
+CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=29500 ./tools/dist_train.sh \
+configs/nas/mmcls/autoformer/autoformer_supernet_32xb256_in1k.py 4 \
--work-dir $WORK_DIR
```

-### Search for subnet on the trained supernet
+### Step 2: Search for subnet on the trained supernet

```bash
-sh tools/train.py \
-configs/nas/mmcls/autoformer/autoformer_search_8xb128_in1k.py \
-$STEP1_CKPT \
---work-dir $WORK_DIR
+CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=29500 ./tools/dist_train.sh \
+configs/nas/mmcls/autoformer/autoformer_search_8xb128_in1k.py 4 \
+--work-dir $WORK_DIR --cfg-options load_from=$STEP1_CKPT
```

+### Step 3: Subnet inference on ImageNet
+
+```bash
+CUDA_VISIBLE_DEVICES=0 PORT=29500 ./tools/dist_test.sh \
+configs/nas/mmcls/autoformer/autoformer_subnet_8xb128_in1k.py \
[Review comment from a Contributor on this line: "Fix the commands for the other steps (step 1, step 2) as well."]
+$STEP2_CKPT 1 --work-dir $WORK_DIR \
+--cfg-options algorithm.mutable_cfg=$STEP2_SUBNET_YAML
+```
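The three steps above reference shell variables (`$WORK_DIR`, `$STEP1_CKPT`, `$STEP2_CKPT`, `$STEP2_SUBNET_YAML`) that are assumed to be set beforehand. A minimal setup sketch with illustrative placeholder paths (these paths are not part of the PR or the repo; adjust them to your environment):

```shell
#!/usr/bin/env bash
# Hypothetical setup for the variables referenced in the commands above.
export WORK_DIR=work_dirs/autoformer                   # where logs and checkpoints are written
export STEP1_CKPT=$WORK_DIR/supernet/latest.pth        # checkpoint produced by step 1
export STEP2_CKPT=$WORK_DIR/search/latest.pth          # checkpoint produced by step 2
export STEP2_SUBNET_YAML=$WORK_DIR/search/subnet.yaml  # subnet config emitted by the search

# Sanity-check that each variable is non-empty before launching a step.
for v in WORK_DIR STEP1_CKPT STEP2_CKPT STEP2_SUBNET_YAML; do
  [ -n "${!v}" ] || { echo "unset: $v" >&2; exit 1; }
done
echo "env ok"  # prints "env ok" when all four variables are set
```

Running a step with one of these variables unset typically fails late (e.g. an empty `load_from=`), so checking up front saves a wasted launch.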

## Results and models
49 changes: 24 additions & 25 deletions configs/nas/mmcls/autoslim/README.md
@@ -11,51 +11,50 @@ Notably, by setting optimized channel numbers, our AutoSlim-MobileNet-v2 at 305M

![pipeline](https://user-images.githubusercontent.com/88702197/187425354-d90e4b36-e033-4dc0-b951-64a536e61b71.png)

-## Introduction
+## Get Started

### Supernet pre-training on ImageNet

-<pre>
-python ./tools/mmcls/train_mmcls.py \
-configs/pruning/autoslim/autoslim_mbv2_supernet_8xb256_in1k.py \
---work-dir <em>your_work_dir</em>
-</pre>
+```bash
+CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=29500 ./tools/dist_train.sh \
+configs/pruning/autoslim/autoslim_mbv2_supernet_8xb256_in1k.py 4 \
+--work-dir $WORK_DIR
+```

### Search for subnet on the trained supernet

-<pre>
-python ./tools/mmcls/search_mmcls.py \
-configs/pruning/autoslim/autoslim_mbv2_search_8xb1024_in1k.py \
-<em>your_pre-training_checkpoint_path</em> \
---work-dir <em>your_work_dir</em>
-</pre>
+```bash
+CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=29500 ./tools/dist_train.sh \
+configs/pruning/autoslim/autoslim_mbv2_search_8xb1024_in1k.py 4 \
+--work-dir $WORK_DIR --cfg-options load_from=$STEP1_CKPT
+```

### Subnet retraining on ImageNet

-<pre>
-python ./tools/mmcls/train_mmcls.py \
-configs/pruning/autoslim/autoslim_mbv2_subnet_8xb256_in1k.py \
---work-dir <em>your_work_dir</em> \
+```bash
+CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=29500 ./tools/dist_train.sh \
+configs/pruning/autoslim/autoslim_mbv2_subnet_8xb256_in1k.py 4 \
+--work-dir $WORK_DIR \
--cfg-options algorithm.channel_cfg=configs/pruning/autoslim/AUTOSLIM_MBV2_530M_OFFICIAL.yaml,configs/pruning/autoslim/AUTOSLIM_MBV2_320M_OFFICIAL.yaml,configs/pruning/autoslim/AUTOSLIM_MBV2_220M_OFFICIAL.yaml
-</pre>
+```

### Split checkpoint

-<pre>
+```bash
python ./tools/model_converters/split_checkpoint.py \
configs/pruning/autoslim/autoslim_mbv2_subnet_8xb256_in1k.py \
-<em>your_retraining_checkpoint_path</em> \
+$RETRAINED_CKPT \
--channel-cfgs configs/pruning/autoslim/AUTOSLIM_MBV2_530M_OFFICIAL.yaml configs/pruning/autoslim/AUTOSLIM_MBV2_320M_OFFICIAL.yaml configs/pruning/autoslim/AUTOSLIM_MBV2_220M_OFFICIAL.yaml
-</pre>
+```

-### Test a subnet
+### Subnet inference

-<pre>
-python ./tools/mmcls/test_mmcls.py \
+```bash
+CUDA_VISIBLE_DEVICES=0 PORT=29500 ./tools/dist_test.sh \
configs/pruning/autoslim/autoslim_mbv2_subnet_8xb256_in1k.py \
-<em>your_splitted_checkpoint_path</em> --metrics accuracy \
+$SEARCHED_CKPT 1 --work-dir $WORK_DIR \
--cfg-options algorithm.channel_cfg=configs/pruning/autoslim/AUTOSLIM_MBV2_530M_OFFICIAL.yaml # or modify the config directly
-</pre>
+```

## Results and models

20 changes: 10 additions & 10 deletions configs/nas/mmcls/bignas/README.md
@@ -8,30 +8,30 @@

Neural architecture search (NAS) has shown promising results discovering models that are both accurate and fast. For NAS, training a one-shot model has become a popular strategy to rank the relative quality of different architectures (child models) using a single set of shared weights. However, while one-shot model weights can effectively rank different network architectures, the absolute accuracies from these shared weights are typically far below those obtained from stand-alone training. To compensate, existing methods assume that the weights must be retrained, finetuned, or otherwise post-processed after the search is completed. These steps significantly increase the compute requirements and complexity of the architecture search and model deployment. In this work, we propose BigNAS, an approach that challenges the conventional wisdom that post-processing of the weights is necessary to get good prediction accuracies. Without extra retraining or post-processing steps, we are able to train a single set of shared weights on ImageNet and use these weights to obtain child models whose sizes range from 200 to 1000 MFLOPs. Our discovered model family, BigNASModels, achieve top1 accuracies ranging from 76.5% to 80.9%, surpassing state-of-the-art models in this range including EfficientNets and Once-for-All networks without extra retraining or post-processing. We present ablative study and analysis to further understand the proposed BigNASModels.

-## Introduction
+## Get Started

### Step 1: Supernet pre-training on ImageNet

```bash
-sh tools/slurm_train.sh $PARTITION $JOB_NAME \
-configs/nas/mmcls/bignas/attentive_mobilenet_supernet_32xb64_in1k.py \
-$WORK_DIR
+CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=29500 ./tools/dist_train.sh \
+configs/nas/mmcls/bignas/attentive_mobilenet_supernet_32xb64_in1k.py 4 \
+--work-dir $WORK_DIR
```

### Step 2: Search for subnet on the trained supernet

```bash
-sh tools/slurm_train.sh $PARTITION $JOB_NAME \
-configs/nas/mmcls/bignas/attentive_mobilenet_search_8xb128_in1k.py \
---checkpoint $STEP1_CKPT --work-dir $WORK_DIR
+CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=29500 ./tools/dist_train.sh \
+configs/nas/mmcls/bignas/attentive_mobilenet_search_8xb128_in1k.py 4 \
+--work-dir $WORK_DIR --cfg-options load_from=$STEP1_CKPT
```

-### Step 3: Subnet test on ImageNet
+### Step 3: Subnet inference on ImageNet

```bash
-sh tools/slurm_test.sh $PARTITION $JOB_NAME \
+CUDA_VISIBLE_DEVICES=0 PORT=29500 ./tools/dist_test.sh \
configs/nas/mmcls/bignas/attentive_mobilenet_subnet_8xb256_in1k.py \
-$STEP2_CKPT --work-dir $WORK_DIR --eval accuracy \
+$STEP2_CKPT 1 --work-dir $WORK_DIR \
--cfg-options algorithm.mutable_cfg=$STEP2_SUBNET_YAML
```

18 changes: 18 additions & 0 deletions configs/nas/mmcls/darts/README.md
@@ -10,6 +10,24 @@ This paper addresses the scalability challenge of architecture search by formula

![pipeline](https://user-images.githubusercontent.com/88702197/187425171-2dfe7fbf-7c2c-4c22-9219-2234aa83e47d.png)

+## Get Started
+
+### Supernet training on CIFAR-10
+
+```bash
+CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=29500 ./tools/dist_train.sh \
+configs/nas/mmcls/darts/darts_supernet_unroll_1xb96_cifar10.py 4 \
+--work-dir $WORK_DIR
+```
+
+### Subnet inference on CIFAR-10
+
+```bash
+CUDA_VISIBLE_DEVICES=0 PORT=29500 ./tools/dist_test.sh \
+configs/nas/mmcls/darts/darts_subnet_1xb96_cifar10_2.0.py \
+$STEP1_CKPT 1 --work-dir $WORK_DIR
+```

## Results and models

### Supernet
18 changes: 18 additions & 0 deletions configs/nas/mmcls/dsnas/README.md
@@ -12,6 +12,24 @@ Based on this observation, DSNAS proposes a task-specific end-to-end differentia

![pipeline](/docs/en/imgs/model_zoo/dsnas/pipeline.jpg)

+## Get Started
+
+### Supernet training on ImageNet
+
+```bash
+CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=29500 ./tools/dist_train.sh \
+configs/nas/mmcls/dsnas/dsnas_supernet_8xb128_in1k.py 4 \
+--work-dir $WORK_DIR
+```
+
+### Subnet inference on ImageNet
+
+```bash
+CUDA_VISIBLE_DEVICES=0 PORT=29500 ./tools/dist_test.sh \
+configs/nas/mmcls/dsnas/dsnas_subnet_8xb128_in1k.py \
+$STEP1_CKPT 1 --work-dir $WORK_DIR
+```

## Results and models

### Supernet
6 changes: 3 additions & 3 deletions configs/nas/mmcls/onceforall/README.md
@@ -8,16 +8,16 @@

We address the challenging problem of efficient inference across many devices and resource constraints, especially on edge devices. Conventional approaches either manually design or use neural architecture search (NAS) to find a specialized neural network and train it from scratch for each case, which is computationally prohibitive (causing CO2 emission as much as 5 cars’ lifetime Strubell et al. (2019)) thus unscalable. In this work, we propose to train a once-for-all (OFA) network that supports diverse architectural settings by decoupling training and search, to reduce the cost. We can quickly get a specialized sub-network by selecting from the OFA network without additional training. To efficiently train OFA networks, we also propose a novel progressive shrinking algorithm, a generalized pruning method that reduces the model size across many more dimensions than pruning (depth, width, kernel size, and resolution). It can obtain a surprisingly large number of sub- networks (> 1019) that can fit different hardware platforms and latency constraints while maintaining the same level of accuracy as training independently. On diverse edge devices, OFA consistently outperforms state-of-the-art (SOTA) NAS methods (up to 4.0% ImageNet top1 accuracy improvement over MobileNetV3, or same accuracy but 1.5× faster than MobileNetV3, 2.6× faster than EfficientNet w.r.t measured latency) while reducing many orders of magnitude GPU hours and CO2 emission. In particular, OFA achieves a new SOTA 80.0% ImageNet top-1 accuracy under the mobile setting (\<600M MACs). OFA is the winning solution for the 3rd Low Power Computer Vision Challenge (LPCVC), DSP classification track and the 4th LPCVC, both classification track and detection track.

-## Introduction
+## Get Started

We provide inference models that are published by the official Once-For-All repo and converted with MMRazor.

### Subnet test on ImageNet

```bash
-sh tools/slurm_test.sh $PARTITION $JOB_NAME \
+CUDA_VISIBLE_DEVICES=0 PORT=29500 ./tools/dist_test.sh \
configs/nas/mmcls/onceforall/ofa_mobilenet_subnet_8xb256_in1k.py \
-$STEP2_CKPT --work-dir $WORK_DIR --eval accuracy
+$OFA_CKPT 1 --work-dir $WORK_DIR
```

## Results and models
35 changes: 21 additions & 14 deletions configs/nas/mmcls/spos/README.md
@@ -11,32 +11,39 @@ Comprehensive experiments verify that our approach is flexible and effective. It

![pipeline](https://user-images.githubusercontent.com/88702197/187424862-c2f3fde1-4a48-4eda-9ff7-c65971b683ba.jpg)

-## Introduction
+## Get Started

-### Supernet pre-training on ImageNet
+### Step 1: Supernet pre-training on ImageNet

```bash
-python ./tools/mmcls/train_mmcls.py \
-configs/nas/spos/spos_supernet_shufflenetv2_8xb128_in1k.py \
---work-dir $WORK_DIR
+CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=29500 ./tools/dist_train.sh \
+configs/nas/spos/spos_supernet_shufflenetv2_8xb128_in1k.py 4 \
+--work-dir $WORK_DIR
```

+### Step 2: Search for subnet on the trained supernet
+
+```bash
+CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=29500 ./tools/dist_train.sh \
+configs/nas/spos/spos_evolution_search_shufflenetv2_8xb2048_in1k.py 4 \
+--work-dir $WORK_DIR --cfg-options load_from=$STEP1_CKPT
+```

-### Search for subnet on the trained supernet
+### Step 3: Subnet retraining on ImageNet

```bash
-python ./tools/mmcls/search_mmcls.py \
-configs/nas/spos/spos_evolution_search_shufflenetv2_8xb2048_in1k.py \
-$STEP1_CKPT \
---work-dir $WORK_DIR
+CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=29500 ./tools/dist_train.sh \
+configs/nas/spos/spos_subnet_shufflenetv2_8xb128_in1k.py 4 \
+--work-dir $WORK_DIR --cfg-options algorithm.mutable_cfg=$STEP2_SUBNET_YAML # or modify the config directly
```

-### Subnet retraining on ImageNet
+### Step 4: Subnet inference on ImageNet

```bash
-python ./tools/mmcls/train_mmcls.py \
+CUDA_VISIBLE_DEVICES=0 PORT=29500 ./tools/dist_test.sh \
configs/nas/spos/spos_subnet_shufflenetv2_8xb128_in1k.py \
---work-dir $WORK_DIR \
---cfg-options algorithm.mutable_cfg=$STEP2_SUBNET_YAML # or modify the config directly
+$SEARCHED_CKPT 1 --work-dir $WORK_DIR \
+--cfg-options algorithm.mutable_cfg=$STEP2_SUBNET_YAML
```

## Results and models
37 changes: 17 additions & 20 deletions configs/nas/mmdet/detnas/README.md
@@ -10,48 +10,45 @@ Object detectors are usually equipped with backbone networks designed for image

![pipeline](https://user-images.githubusercontent.com/88702197/187425296-64baa22a-9422-46cd-bd95-47e3e5707f75.jpg)

-## Introduction
+## Get Started

### Step 1: Supernet pre-training on ImageNet

```bash
-python ./tools/mmcls/train_mmcls.py \
-configs/nas/detnas/detnas_supernet_shufflenetv2_8xb128_in1k.py \
+CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=29500 ./tools/dist_train.sh \
+configs/nas/detnas/detnas_supernet_shufflenetv2_8xb128_in1k.py 4 \
--work-dir $WORK_DIR
```

### Step 2: Supernet fine-tuning on COCO

```bash
-python ./tools/mmdet/train_mmdet.py \
-configs/nas/detnas/detnas_supernet_frcnn_shufflenetv2_fpn_1x_coco.py \
---work-dir $WORK_DIR \
---cfg-options load_from=$STEP1_CKPT
+CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=29500 ./tools/dist_train.sh \
+configs/nas/detnas/detnas_supernet_frcnn_shufflenetv2_fpn_1x_coco.py 4 \
+--work-dir $WORK_DIR --cfg-options load_from=$STEP1_CKPT
```

### Step 3: Search for subnet on the trained supernet

-```
-python ./tools/mmdet/search_mmdet.py \
-configs/nas/detnas/detnas_evolution_search_frcnn_shufflenetv2_fpn_coco.py \
-$STEP2_CKPT \
---work-dir $WORK_DIR
+```bash
+CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=29500 ./tools/dist_train.sh \
+configs/nas/detnas/detnas_evolution_search_frcnn_shufflenetv2_fpn_coco.py 4 \
+--work-dir $WORK_DIR --cfg-options load_from=$STEP2_CKPT
```

### Step 4: Subnet retraining on ImageNet

-```
-python ./tools/mmcls/train_mmcls.py \
-configs/nas/detnas/detnas_subnet_shufflenetv2_8xb128_in1k.py \
---work-dir $WORK_DIR \
---cfg-options algorithm.mutable_cfg=$STEP3_SUBNET_YAML # or modify the config directly
+```bash
+CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=29500 ./tools/dist_train.sh \
+configs/nas/detnas/detnas_subnet_shufflenetv2_8xb128_in1k.py 4 \
+--work-dir $WORK_DIR --cfg-options algorithm.mutable_cfg=$STEP3_SUBNET_YAML # or modify the config directly
```

### Step 5: Subnet fine-tuning on COCO

-```
-python ./tools/mmdet/train_mmdet.py \
-configs/nas/detnas/detnas_subnet_frcnn_shufflenetv2_fpn_1x_coco.py \
+```bash
+CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=29500 ./tools/dist_train.sh \
+configs/nas/detnas/detnas_subnet_frcnn_shufflenetv2_fpn_1x_coco.py 4 \
--work-dir $WORK_DIR \
--cfg-options algorithm.mutable_cfg=$STEP3_SUBNET_YAML load_from=$STEP4_CKPT # or modify the config directly
```
12 changes: 6 additions & 6 deletions configs/pruning/mmcls/dcff/README.md
@@ -43,7 +43,7 @@ The mainstream approach for filter pruning is usually either to force a hard-cod
}
```

-## Getting Started
+## Get Started

### Generate channel_config file

@@ -64,9 +64,9 @@ Then set layers' pruning rates `target_pruning_ratio` by `resnet_cls.json`.
##### ImageNet

```bash
-sh tools/slurm_train.sh $PARTITION $JOB_NAME \
-configs/pruning/mmcls/dcff/dcff_resnet50_8xb32_in1k.py \
-$WORK_DIR
+CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=29500 ./tools/dist_train.sh \
+configs/pruning/mmcls/dcff/dcff_resnet50_8xb32_in1k.py 4 \
+--work-dir $WORK_DIR
```

### Test DCFF
@@ -76,7 +76,7 @@ sh tools/slurm_train.sh $PARTITION $JOB_NAME \
##### ImageNet

```bash
-sh tools/slurm_test.sh $PARTITION $JOB_NAME \
+CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=29500 ./tools/dist_test.sh \
configs/pruning/mmcls/dcff/dcff_compact_resnet50_8xb32_in1k.py \
-$WORK_DIR
+$CKPT 1 --work-dir $WORK_DIR
```
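
Across all of these hunks, the pattern is the same: slurm or bare-python launchers are replaced with the `./tools/dist_train.sh CONFIG GPUS [ARGS]` / `./tools/dist_test.sh CONFIG CKPT GPUS [ARGS]` wrappers, with `CUDA_VISIBLE_DEVICES` selecting GPUs and `PORT` selecting the rendezvous port. As a rough sketch of what such a wrapper does (an assumed, simplified pattern, not the repo's actual script):

```shell
# Hypothetical sketch of a dist_train.sh-style wrapper: it takes the config
# path and GPU count as positional arguments and forwards the rest, building
# a torch.distributed launch command that uses PORT (default 29500).
dist_train() {
  local config=$1 gpus=$2
  shift 2
  local port=${PORT:-29500}   # PORT env var selects the rendezvous port
  echo "python -m torch.distributed.launch --nproc_per_node=$gpus --master_port=$port ./tools/train.py $config --launcher pytorch $*"
}

# Example: a 4-GPU launch in the same shape as the README commands above.
# This sketch only prints the command it would run.
dist_train configs/nas/mmcls/spos/spos_supernet_shufflenetv2_8xb128_in1k.py 4 --work-dir work_dirs/demo
```

This is why the PR appends a GPU count (`4`, or `1` for single-GPU testing) after each config path: the wrapper consumes it as `--nproc_per_node`.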