mindspore-lab · CaitinZhao · Nov 22, 2024 · Sep 3, 2024 · Oct 14, 2024 · Nov 12, 2024
diff --git a/README.md b/README.md
@@ -29,13 +29,13 @@ MindCV is an open-source toolbox for computer vision research and development ba
 
 The following are the `mindcv` versions and the `mindspore` versions each one supports.
 
-| mindcv | mindspore  |
-|:------:|:----------:|
-|  main  |   master   |
-| v0.4.0 |   2.3.0    |
-| 0.3.0  |   2.2.10   |
-|  0.2   |    2.0     |
-|  0.1   |    1.8     |
+| mindcv |  mindspore  |
+| :----: | :---------: |
+|  main  |   master    |
+| v0.4.0 | 2.3.0/2.3.1 |
+| 0.3.0  |   2.2.10    |
+|  0.2   |     2.0     |
+|  0.1   |     1.8     |
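
The version table above can be read as a simple lookup. The following is a minimal sketch, not part of MindCV: the dictionary is transcribed from the table, while the helper name `is_supported` and the component-wise prefix match (so that `2.0.1` counts as `2.0`) are assumptions made for illustration.

```python
# Transcribed from the compatibility table; keys are mindcv versions as written there.
SUPPORTED_MINDSPORE = {
    "main": ("master",),
    "v0.4.0": ("2.3.0", "2.3.1"),
    "0.3.0": ("2.2.10",),
    "0.2": ("2.0",),
    "0.1": ("1.8",),
}


def is_supported(mindcv_version: str, mindspore_version: str) -> bool:
    """Hypothetical helper: True if the pair appears in the table.

    Assumption: a listed version such as "2.0" matches any release whose
    leading components are equal ("2.0.1"), compared component by component
    so that "2.3.1" does not match "2.3.10".
    """
    installed = mindspore_version.split(".")
    for listed in SUPPORTED_MINDSPORE.get(mindcv_version, ()):
        parts = listed.split(".")
        if installed[: len(parts)] == parts:
            return True
    return False


print(is_supported("v0.4.0", "2.3.1"))
print(is_supported("0.3.0", "2.3.1"))
```

In practice the second argument could come from `mindspore.__version__`; the sketch takes a plain string so it runs without MindSpore installed.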
 
 
 ### Major Features

diff --git a/benchmark_results.md b/benchmark_results.md
diff --git a/configs/README.md b/configs/README.md
@@ -31,24 +31,24 @@ Please follow the outline structure and **table format** shown in [densenet/READ
 
 #### Table Format
 
-<div align="center">
 
-| model       | top-1 (%) | top-5 (%) | params (M) | batch size | cards | ms/step | jit_level | recipe                                                                                              | download                                                                                                  |
-| ----------- | --------- | --------- | ---------- | ---------- | ----- | ------- | --------- | --------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------- |
-| densenet121 | 75.67     | 92.77     | 8.06       | 32         | 8     | 47,34   | O2        | [yaml](https://github.com/mindspore-lab/mindcv/blob/main/configs/densenet/densenet_121_ascend.yaml) | [weights](https://download-mindspore.osinfra.cn/toolkits/mindcv/densenet/densenet121-bf4ab27f-910v2.ckpt) |
 
-</div>
+| model name  | params(M) | cards | batch size | resolution | jit level | graph compile | ms/step | img/s   | acc@top1 | acc@top5 | recipe                                                                                              | weight                                                                                                    |
+| ----------- | --------- | ----- | ---------- | ---------- | --------- | ------------- | ------- | ------- | -------- | -------- | --------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------- |
+| densenet121 | 8.06      | 8     | 32         | 224x224    | O2        | 300s          | 47.34   | 5446.81 | 75.67    | 92.77    | [yaml](https://github.com/mindspore-lab/mindcv/blob/main/configs/densenet/densenet_121_ascend.yaml) | [weights](https://download-mindspore.osinfra.cn/toolkits/mindcv/densenet/densenet121-bf4ab27f-910v2.ckpt) |
+
+
 
 Illustration:
-- Model: model name in lower case with _ seperator.
-- Top-1 and Top-5: Accuracy reported on the validatoin set of ImageNet-1K. Keep 2 digits after the decimal point.
-- Params (M): # of model parameters in millions (10^6). Keep **2 digits** after the decimal point
-- Batch Size: Training batch size
-- Cards: # of cards
-- Ms/step: Time used on training per step in ms
-- Jit_level: Jit level of mindspore context, which contains 3 levels: O0/O1/O2
-- Recipe: Training recipe/configuration linked to a yaml config file.
-- Download: url of the pretrained model weights
+- model name: model name in lower case with `_` separator.
+- acc@top1 and acc@top5: Accuracy reported on the validation set of ImageNet-1K. Keep 2 digits after the decimal point.
+- params(M): # of model parameters in millions (10^6). Keep **2 digits** after the decimal point.
+- batch size: Training batch size
+- cards: # of cards
+- ms/step: Time used on training per step in ms
+- jit level: Jit level of mindspore context, which contains 3 levels: O0/O1/O2
+- recipe: Training recipe/configuration linked to a yaml config file.
+- weight: url of the pretrained model weights
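
The `ms/step` and `img/s` columns appear to be related through the global batch size. The following is a minimal sketch under the assumption that throughput is `cards × batch size` divided by the step time; this formula is not stated in the README, and the function name `images_per_second` is hypothetical.

```python
def images_per_second(cards: int, batch_size: int, ms_per_step: float) -> float:
    """Estimate throughput as global batch size divided by step time in seconds.

    Assumption: every step processes `cards * batch_size` images.
    """
    if ms_per_step <= 0:
        raise ValueError("ms_per_step must be positive")
    global_batch = cards * batch_size
    return global_batch / (ms_per_step / 1000.0)


# densenet121 row: 8 cards, batch size 32, step time read as 47.34 ms.
print(round(images_per_second(8, 32, 47.34), 2))
# Same row with the step time rounded to a whole millisecond.
print(round(images_per_second(8, 32, 47.0), 2))
```

With the step time rounded to 47 ms the estimate agrees with the 5446.81 img/s listed for densenet121, which suggests (but does not confirm) that the table values were derived from a rounded step time.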
 
 ### Model Checkpoint Format
  The checkpoint (i.e., model weight) name should follow this format:  **{model_name}_{specification}-{sha256sum}.ckpt**, e.g., `poolformer_s12-5be5c4e4.ckpt`.
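
The naming rule above can be checked mechanically. The following is a minimal sketch, not a MindCV utility: the helper names are hypothetical, and the 8-hex-digit hash taken from the start of the SHA-256 digest is inferred from the example name `poolformer_s12-5be5c4e4.ckpt` rather than stated in the README.

```python
import hashlib
import re

# Pattern for "{model_name}_{specification}-{hash}.ckpt".
# Assumption: the hash is 8 lowercase hex digits, as in the README example.
# Names carrying an extra suffix (e.g. "-910v2") are not covered by this sketch.
CKPT_NAME = re.compile(r"^(?P<name>[A-Za-z0-9_]+)-(?P<sha>[0-9a-f]{8})\.ckpt$")


def sha256_prefix(data: bytes, length: int = 8) -> str:
    """Return the first `length` hex digits of the SHA-256 digest of `data`."""
    return hashlib.sha256(data).hexdigest()[:length]


def checkpoint_name(model_spec: str, data: bytes) -> str:
    """Build a file name from a model/specification string and the file contents."""
    return f"{model_spec}-{sha256_prefix(data)}.ckpt"


def is_valid_name(filename: str) -> bool:
    """Check whether a file name follows the assumed pattern."""
    return CKPT_NAME.match(filename) is not None


print(is_valid_name("poolformer_s12-5be5c4e4.ckpt"))
print(checkpoint_name("demo_s1", b"example checkpoint bytes"))
```

For a real checkpoint, the bytes would be read from the `.ckpt` file (for large files, hashing in chunks with `hashlib.sha256().update(...)` avoids loading everything into memory).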

diff --git a/configs/bit/README.md b/configs/bit/README.md
@@ -2,6 +2,7 @@
 
 > [Big Transfer (BiT): General Visual Representation Learning](https://arxiv.org/abs/1912.11370)
 
+
 ## Introduction
 
 Transfer of pre-trained representations improves sample efficiency and simplifies hyperparameter tuning when training deep neural networks for vision.
@@ -12,30 +13,10 @@ is required. 3) Long pre-training time: Pretraining on a larger dataset requires
 BiT uses GroupNorm combined with Weight Standardisation instead of BatchNorm, since BatchNorm performs worse when the number of images on each accelerator is
 too low. 5) With BiT fine-tuning, good performance can be achieved even if there are only a few examples of each type on natural images.[[1, 2](#References)]
 
-
-## Results
-
-Our reproduced model performance on ImageNet-1K is reported as follows.
-
-- ascend 910* with graph mode
-
-*coming soon*
-
-- ascend 910 with graph mode
-
-
-<div align="center">
-
-
-| model        | top-1 (%) | top-5 (%) | params(M) | batch size | cards | ms/step | jit_level | recipe                                                                                         | download                                                                                |
-| ------------ | --------- | --------- | --------- | ---------- | ----- |---------| --------- | ---------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------- |
-| bit_resnet50 | 76.81     | 93.17     | 25.55     | 32         | 8     | 74.52   | O2        | [yaml](https://github.com/mindspore-lab/mindcv/blob/main/configs/bit/bit_resnet50_ascend.yaml) | [weights](https://download.mindspore.cn/toolkits/mindcv/bit/BiT_resnet50-1e4795a4.ckpt) |
-
-
-</div>
-
-#### Notes
-- Top-1 and Top-5: Accuracy reported on the validation set of ImageNet-1K.
+## Requirements
+| mindspore | ascend driver |  firmware   | cann toolkit/kernel |
+| :-------: | :-----------: | :---------: | :-----------------: |
+|   2.3.1   |   24.1.RC2    | 7.3.0.1.231 |    8.0.RC2.beta1    |
 
 ## Quick Start
 
@@ -82,6 +63,26 @@ To validate the accuracy of the trained model, you can use `validate.py` and par
 python validate.py -c configs/bit/bit_resnet50_ascend.yaml --data_dir /path/to/imagenet --ckpt_path /path/to/ckpt
 ```
 
+## Performance
+
+Our reproduced model performance on ImageNet-1K is reported as follows.
+
+Experiments were run on Ascend 910* with MindSpore 2.3.1 in graph mode.
+
+*coming soon*
+
+Experiments were run on Ascend 910 with MindSpore 2.3.1 in graph mode.
+
+
+| model name   | params(M) | cards | batch size | resolution | jit level | graph compile | ms/step | img/s   | acc@top1 | acc@top5 | recipe                                                                                         | weight                                                                                  |
+| ------------ | --------- | ----- | ---------- | ---------- | --------- | ------------- | ------- | ------- | -------- | -------- | ---------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------- |
+| bit_resnet50 | 25.55     | 8     | 32         | 224x224    | O2        | 146s          | 74.52   | 3413.33 | 76.81    | 93.17    | [yaml](https://github.com/mindspore-lab/mindcv/blob/main/configs/bit/bit_resnet50_ascend.yaml) | [weights](https://download.mindspore.cn/toolkits/mindcv/bit/BiT_resnet50-1e4795a4.ckpt) |
+
+
+
+### Notes
+- acc@top1 and acc@top5: Accuracy reported on the validation set of ImageNet-1K.
+
 ## References
 
 <!--- Guideline: Citation format should follow GB/T 7714. -->

diff --git a/configs/cmt/README.md b/configs/cmt/README.md
@@ -2,37 +2,20 @@
 
 > [CMT: Convolutional Neural Networks Meet Vision Transformers](https://arxiv.org/abs/2107.06263)
 
+
 ## Introduction
 
 CMT makes full use of the advantages of CNNs and transformers so that the model can capture long-range
 dependencies and extract local information. In addition, to reduce computation cost, this method uses lightweight MHSA (multi-head self-attention)
 together with depthwise and pointwise convolutions, as in MobileNet. By combining these parts, CMT achieves SOTA performance
 on the ImageNet-1K dataset.
 
-
-## Results
-
-Our reproduced model performance on ImageNet-1K is reported as follows.
-
-- ascend 910* with graph mode
-
-*coming soon*
-
-- ascend 910 with graph mode
-
-<div align="center">
-
-
-| model     | top-1 (%) | top-5 (%) | params(M) | batch size | cards | ms/step | jit_level | recipe                                                                                      | download                                                                             |
-| --------- | --------- | --------- | --------- | ---------- | ----- |---------| --------- | ------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------ |
-| cmt_small | 83.24     | 96.41     | 26.09     | 128        | 8     | 500.64  | O2        | [yaml](https://github.com/mindspore-lab/mindcv/blob/main/configs/cmt/cmt_small_ascend.yaml) | [weights](https://download.mindspore.cn/toolkits/mindcv/cmt/cmt_small-6858ee22.ckpt) |
+## Requirements
+| mindspore | ascend driver |  firmware   | cann toolkit/kernel |
+| :-------: | :-----------: | :---------: | :-----------------: |
+|   2.3.1   |   24.1.RC2    | 7.3.0.1.231 |    8.0.RC2.beta1    |
 
 
-</div>
-
-#### Notes
-- Top-1 and Top-5: Accuracy reported on the validation set of ImageNet-1K.
-
 ## Quick Start
 
 ### Preparation
@@ -78,6 +61,23 @@ To validate the accuracy of the trained model, you can use `validate.py` and par
 python validate.py -c configs/cmt/cmt_small_ascend.yaml --data_dir /path/to/imagenet --ckpt_path /path/to/ckpt
 ```
 
+## Performance
+
+Our reproduced model performance on ImageNet-1K is reported as follows.
+
+Experiments were run on Ascend 910* with MindSpore 2.3.1 in graph mode.
+
+*coming soon*
+
+Experiments were run on Ascend 910 with MindSpore 2.3.1 in graph mode.
+
+| model name | params(M) | cards | batch size | resolution | jit level | graph compile | ms/step | img/s   | acc@top1 | acc@top5 | recipe                                                                                      | weight                                                                               |
+| ---------- | --------- | ----- | ---------- | ---------- | --------- | ------------- | ------- | ------- | -------- | -------- | ------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------ |
+| cmt_small  | 26.09     | 8     | 128        | 224x224    | O2        | 1268s         | 500.64  | 2048.01 | 83.24    | 96.41    | [yaml](https://github.com/mindspore-lab/mindcv/blob/main/configs/cmt/cmt_small_ascend.yaml) | [weights](https://download.mindspore.cn/toolkits/mindcv/cmt/cmt_small-6858ee22.ckpt) |
+
+### Notes
+- acc@top1 and acc@top5: Accuracy reported on the validation set of ImageNet-1K.
+
 ## References
 
 <!--- Guideline: Citation format should follow GB/T 7714. -->

diff --git a/configs/coat/README.md b/configs/coat/README.md
@@ -6,28 +6,11 @@
 
 Co-Scale Conv-Attentional Image Transformer (CoaT) is a Transformer-based image classifier equipped with co-scale and conv-attentional mechanisms. First, the co-scale mechanism maintains the integrity of Transformers' encoder branches at individual scales, while allowing representations learned at different scales to effectively communicate with each other. Second, the conv-attentional mechanism is designed by realizing a relative position embedding formulation in the factorized attention module with an efficient convolution-like implementation. CoaT empowers image Transformers with enriched multi-scale and contextual modeling capabilities.
 
-## Results
+## Requirements
+| mindspore | ascend driver |  firmware   | cann toolkit/kernel |
+| :-------: | :-----------: | :---------: | :-----------------: |
+|   2.3.1   |   24.1.RC2    | 7.3.0.1.231 |    8.0.RC2.beta1    |
 
-Our reproduced model performance on ImageNet-1K is reported as follows.
-
-- ascend 910* with graph mode
-
-*coming soon*
-
-
-- ascend 910 with graph mode
-
-<div align="center">
-
-
-| model     | top-1 (%) | top-5 (%) | params (M) | batch size | cards | ms/step | jit_level | recipe                                                                                       | Weight                                                                                |
-| --------- | --------- | --------- | ---------- | ---------- | ----- |---------| --------- | -------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------- |
-| coat_tiny | 79.67     | 94.88     | 5.50       | 32         | 8     | 254.95  | O2        | [yaml](https://github.com/mindspore-lab/mindcv/blob/main/configs/coat/coat_tiny_ascend.yaml) | [weights](https://download.mindspore.cn/toolkits/mindcv/coat/coat_tiny-071cb792.ckpt) |
-
-</div>
-
-#### Notes
-- Top-1 and Top-5: Accuracy reported on the validation set of ImageNet-1K.
 
 
 ## Quick Start
@@ -74,6 +57,30 @@ To validate the accuracy of the trained model, you can use `validate.py` and par
 python validate.py -c configs/coat/coat_lite_tiny_ascend.yaml --data_dir /path/to/imagenet --ckpt_path /path/to/ckpt
 ```
 
+## Performance
+
+Our reproduced model performance on ImageNet-1K is reported as follows.
+
+Experiments were run on Ascend 910* with MindSpore 2.3.1 in graph mode.
+
+*coming soon*
+
+
+Experiments were run on Ascend 910 with MindSpore 2.3.1 in graph mode.
+
+
+
+
+| model name | params(M) | cards | batch size | resolution | jit level | graph compile | ms/step | img/s   | acc@top1 | acc@top5 | recipe                                                                                       | weight                                                                                |
+| ---------- | --------- | ----- | ---------- | ---------- | --------- | ------------- | ------- | ------- | -------- | -------- | -------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------- |
+| coat_tiny  | 5.50      | 8     | 32         | 224x224    | O2        | 543s          | 254.95  | 1003.92 | 79.67    | 94.88    | [yaml](https://github.com/mindspore-lab/mindcv/blob/main/configs/coat/coat_tiny_ascend.yaml) | [weights](https://download.mindspore.cn/toolkits/mindcv/coat/coat_tiny-071cb792.ckpt) |
+
+
+
+### Notes
+- acc@top1 and acc@top5: Accuracy reported on the validation set of ImageNet-1K.
+
+
 ## References
 
 [1] Han D, Yun S, Heo B, et al. Rethinking channel dimensions for efficient model design[C]//Proceedings of the IEEE/CVF conference on Computer Vision and Pattern Recognition. 2021: 732-741.
diff --git a/configs/convit/README.md b/configs/convit/README.md
@@ -1,6 +1,7 @@
 # ConViT
 > [ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases](https://arxiv.org/abs/2103.10697)
 
+
 ## Introduction
 
 ConViT combines the strengths of convolutional architectures and Vision Transformers (ViTs).
@@ -19,36 +20,12 @@ while offering a much improved sample efficiency.[[1](#references)]
   <em>Figure 1. Architecture of ConViT [<a href="#references">1</a>] </em>
 </p>
 
-
-## Results
-
-Our reproduced model performance on ImageNet-1K is reported as follows.
-
-- ascend 910* with graph mode
-
-
-<div align="center">
-
-
-| model       | top-1 (%) | top-5 (%) | params (M) | batch size | cards | ms/step | jit_level | recipe                                                                                           | download                                                                                                |
-| ----------- | --------- | --------- | ---------- | ---------- | ----- | ------- | --------- | ------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------- |
-| convit_tiny | 73.79     | 91.70     | 5.71       | 256        | 8     | 226.51  | O2        | [yaml](https://github.com/mindspore-lab/mindcv/blob/main/configs/convit/convit_tiny_ascend.yaml) | [weights](https://download-mindspore.osinfra.cn/toolkits/mindcv/convit/convit_tiny-1961717e-910v2.ckpt) |
-
-</div>
-
-- ascend 910 with graph mode
-
-<div align="center">
+## Requirements
+| mindspore | ascend driver |  firmware   | cann toolkit/kernel |
+| :-------: | :-----------: | :---------: | :-----------------: |
+|   2.3.1   |   24.1.RC2    | 7.3.0.1.231 |    8.0.RC2.beta1    |
 
 
-| model       | top-1 (%) | top-5 (%) | params (M) | batch size | cards | ms/step | jit_level | recipe                                                                                           | download                                                                                  |
-| ----------- | --------- | --------- | ---------- | ---------- | ----- | ------- | --------- | ------------------------------------------------------------------------------------------------ | ----------------------------------------------------------------------------------------- |
-| convit_tiny | 73.66     | 91.72     | 5.71       | 256        | 8     | 231.62  | O2        | [yaml](https://github.com/mindspore-lab/mindcv/blob/main/configs/convit/convit_tiny_ascend.yaml) | [weights](https://download.mindspore.cn/toolkits/mindcv/convit/convit_tiny-e31023f2.ckpt) |
-
-</div>
-
-#### Notes
-- Top-1 and Top-5: Accuracy reported on the validation set of ImageNet-1K.
 
 ## Quick Start
 
@@ -93,6 +70,26 @@ To validate the accuracy of the trained model, you can use `validate.py` and par
 python validate.py -c configs/convit/convit_tiny_ascend.yaml --data_dir /path/to/imagenet --ckpt_path /path/to/ckpt
 ```
 
+## Performance
+
+Our reproduced model performance on ImageNet-1K is reported as follows.
+
+Experiments were run on Ascend 910* with MindSpore 2.3.1 in graph mode.
+
+| model name  | params(M) | cards | batch size | resolution | jit level | graph compile | ms/step | img/s   | acc@top1 | acc@top5 | recipe                                                                                           | weight                                                                                                  |
+| ----------- | --------- | ----- | ---------- | ---------- | --------- | ------------- | ------- | ------- | -------- | -------- | ------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------- |
+| convit_tiny | 5.71      | 8     | 256        | 224x224    | O2        | 153s          | 226.51  | 9022.03 | 73.79    | 91.70    | [yaml](https://github.com/mindspore-lab/mindcv/blob/main/configs/convit/convit_tiny_ascend.yaml) | [weights](https://download-mindspore.osinfra.cn/toolkits/mindcv/convit/convit_tiny-1961717e-910v2.ckpt) |
+
+Experiments were run on Ascend 910 with MindSpore 2.3.1 in graph mode.
+
+| model name  | params(M) | cards | batch size | resolution | jit level | graph compile | ms/step | img/s   | acc@top1 | acc@top5 | recipe                                                                                           | weight                                                                                    |
+| ----------- | --------- | ----- | ---------- | ---------- | --------- | ------------- | ------- | ------- | -------- | -------- | ------------------------------------------------------------------------------------------------ | ----------------------------------------------------------------------------------------- |
+| convit_tiny | 5.71      | 8     | 256        | 224x224    | O2        | 133s          | 231.62  | 8827.59 | 73.66    | 91.72    | [yaml](https://github.com/mindspore-lab/mindcv/blob/main/configs/convit/convit_tiny_ascend.yaml) | [weights](https://download.mindspore.cn/toolkits/mindcv/convit/convit_tiny-e31023f2.ckpt) |
+
+
+### Notes
+- acc@top1 and acc@top5: Accuracy reported on the validation set of ImageNet-1K.
+
 ## References
 
 <!--- Guideline: Citation format should follow GB/T 7714. -->