EfficientNet and ResNeXt101_wsl series

Overview

EfficientNet is a lightweight network family designed with neural architecture search (NAS) and released by Google in 2019. At the time of its release, EfficientNetB7 set a new best classification accuracy on ImageNet-1k. The paper points out that traditional methods for improving a neural network mainly scale one of three dimensions: the width of the network, the depth of the network, or the resolution of the input image. Through experiments, the authors found that balancing all three dimensions is essential for improving both accuracy and efficiency, and they derived a method for scaling the three dimensions together. Using this scaling method, they built seven networks, B1 through B7, on top of the EfficientNetB0 baseline. At comparable FLOPS and parameter counts, these networks reached state-of-the-art accuracy.
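
The joint scaling of depth, width, and resolution described above can be sketched as follows. This is a minimal illustration, not the PaddleClas implementation. The base coefficients (1.2, 1.1, 1.15) are the values reported in the EfficientNet paper rather than anything stated in this document, and the function name `compound_scale` is hypothetical.

```python
# Sketch of scaling depth, width, and resolution together.
# ASSUMPTION: the bases below are the coefficients reported in the
# EfficientNet paper; they do not appear in this document.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # depth, width, resolution bases


def compound_scale(phi: float) -> dict:
    """Return the multipliers for a single scaling coefficient phi."""
    depth = ALPHA ** phi
    width = BETA ** phi
    resolution = GAMMA ** phi
    # Convolution FLOPs grow roughly with depth * width^2 * resolution^2.
    flops = depth * width ** 2 * resolution ** 2
    return {"depth": depth, "width": width,
            "resolution": resolution, "flops": flops}


if __name__ == "__main__":
    for phi in range(4):
        s = compound_scale(phi)
        print(f"phi={phi}: depth x{s['depth']:.2f}, width x{s['width']:.2f}, "
              f"resolution x{s['resolution']:.2f}, FLOPs x{s['flops']:.2f}")
```

With these bases, each unit increase of `phi` multiplies the estimated FLOPs by about 1.92, that is, roughly doubles them. The released B1-B7 configurations may round their multipliers differently, so this sketch is not expected to reproduce the exact FLOPS figures in the tables.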

ResNeXt is an improved version of ResNet proposed by Facebook in 2016. In 2019, Facebook researchers used weakly supervised learning to study the accuracy limit of this network series on ImageNet. To distinguish these models from the earlier ResNeXt networks, their names carry the suffix WSL, an abbreviation of weakly-supervised-learning. To obtain stronger feature extraction capability, the researchers further enlarged the network width; the largest model, ResNeXt101_32x48d_wsl, has 800 million parameters. It was trained on 940 million weakly labeled images and then fine-tuned on ImageNet-1k. The resulting top-1 accuracy on ImageNet-1k is 85.4%, which at the time of writing is the highest accuracy reported on ImageNet-1k at a resolution of 224x224. Fix-ResNeXt uses a larger image resolution together with a dedicated Fix strategy that addresses the inconsistency between image preprocessing in training and in testing, which gives ResNeXt101_32x48d_wsl a higher accuracy. Because it uses the Fix strategy, the model is named Fix-ResNeXt101_32x48d_wsl.

The FLOPS, parameters, and inference time on the T4 GPU of this series of models are shown in the figure below.

PaddleClas currently provides 14 open-source pretrained models across these two series. As the figure above shows, the EfficientNet series reaches comparable accuracy with far fewer parameters and, for most models, fewer FLOPS. The ResNeXt101_wsl series is trained on more data and achieves higher final accuracy. EfficientNetB0_small removes the SE block from EfficientNetB0, which gives it faster inference.

Accuracy, FLOPS and Parameters

Models	Top1	Top5	Reference top1	Reference top5	FLOPS (G)	Parameters (M)
ResNeXt101_32x8d_wsl	0.826	0.967	0.822	0.964	29.140	78.440
ResNeXt101_32x16d_wsl	0.842	0.973	0.842	0.972	57.550	152.660
ResNeXt101_32x32d_wsl	0.850	0.976	0.851	0.975	115.170	303.110
ResNeXt101_32x48d_wsl	0.854	0.977	0.854	0.976	173.580	456.200
Fix_ResNeXt101_32x48d_wsl	0.863	0.980	0.864	0.980	354.230	456.200
EfficientNetB0	0.774	0.933	0.773	0.935	0.720	5.100
EfficientNetB1	0.792	0.944	0.792	0.945	1.270	7.520
EfficientNetB2	0.799	0.947	0.803	0.950	1.850	8.810
EfficientNetB3	0.812	0.954	0.817	0.956	3.430	11.840
EfficientNetB4	0.829	0.962	0.830	0.963	8.290	18.760
EfficientNetB5	0.836	0.967	0.837	0.967	19.510	29.610
EfficientNetB6	0.840	0.969	0.842	0.968	36.270	42.000
EfficientNetB7	0.843	0.969	0.844	0.971	72.350	64.920
EfficientNetB0_small	0.758	0.926	-	-	0.720	4.650
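
One way to read the table is to ask which model meets a target Top-1 accuracy at the lowest compute cost. The sketch below does this for a few rows copied from the table above. The helper name `cheapest_at_least` is hypothetical and only illustrates the comparison.

```python
# (model, Top1, FLOPS in G, parameters in M), copied from the table above.
ROWS = [
    ("ResNeXt101_32x8d_wsl", 0.826, 29.140, 78.440),
    ("EfficientNetB4", 0.829, 8.290, 18.760),
    ("ResNeXt101_32x16d_wsl", 0.842, 57.550, 152.660),
    ("EfficientNetB7", 0.843, 72.350, 64.920),
]


def cheapest_at_least(rows, min_top1):
    """Return the row with the fewest FLOPS whose Top1 >= min_top1, or None."""
    candidates = [r for r in rows if r[1] >= min_top1]
    return min(candidates, key=lambda r: r[2]) if candidates else None


if __name__ == "__main__":
    for target in (0.82, 0.84):
        best = cheapest_at_least(ROWS, target)
        print(f"Top1 >= {target}: {best[0]} at {best[2]} GFLOPS")
```

Among these four rows, EfficientNetB4 is the cheapest model above 0.82 Top-1, while above 0.84 ResNeXt101_32x16d_wsl needs fewer FLOPS than EfficientNetB7 but more than twice the parameters.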

Inference speed based on V100 GPU

Models	Crop Size	Resize Short Size	FP32 Batch Size=1 (ms)
ResNeXt101_32x8d_wsl	224	256	19.127
ResNeXt101_32x16d_wsl	224	256	23.629
ResNeXt101_32x32d_wsl	224	256	40.214
ResNeXt101_32x48d_wsl	224	256	59.714
Fix_ResNeXt101_32x48d_wsl	320	320	82.431
EfficientNetB0	224	256	2.449
EfficientNetB1	240	272	3.547
EfficientNetB2	260	292	3.908
EfficientNetB3	300	332	5.145
EfficientNetB4	380	412	7.609
EfficientNetB5	456	488	12.078
EfficientNetB6	528	560	18.381
EfficientNetB7	600	632	27.817
EfficientNetB0_small	224	256	1.692
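
The Crop Size and Resize Short Size columns can be read as evaluation preprocessing settings. The sketch below assumes the common convention: the image is first resized so that its shorter side equals Resize Short Size, then a centered square of Crop Size is cut out. This document does not spell out that convention, and both function names are hypothetical.

```python
def resize_short_dims(width, height, resize_short):
    """Scale (width, height) so that the shorter side equals resize_short."""
    scale = resize_short / min(width, height)
    return round(width * scale), round(height * scale)


def center_crop_box(width, height, crop_size):
    """Return (left, top, right, bottom) of a centered crop_size square."""
    left = (width - crop_size) // 2
    top = (height - crop_size) // 2
    return left, top, left + crop_size, top + crop_size


if __name__ == "__main__":
    # A 640x480 input with the EfficientNetB0 settings (resize 256, crop 224).
    w, h = resize_short_dims(640, 480, 256)
    print((w, h))                        # (341, 256)
    print(center_crop_box(w, h, 224))    # (58, 16, 282, 240)
```

For Fix_ResNeXt101_32x48d_wsl both values are 320, so under this reading the crop spans the full shorter side of the resized image.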

Inference speed based on T4 GPU

Models	Crop Size	Resize Short Size	FP16 Batch Size=1 (ms)	FP16 Batch Size=4 (ms)	FP16 Batch Size=8 (ms)	FP32 Batch Size=1 (ms)	FP32 Batch Size=4 (ms)	FP32 Batch Size=8 (ms)
ResNeXt101_32x8d_wsl	224	256	18.19374	21.93529	34.67802	18.52528	34.25319	67.2283
ResNeXt101_32x16d_wsl	224	256	18.52609	36.8288	62.79947	25.60395	71.88384	137.62327
ResNeXt101_32x32d_wsl	224	256	33.51391	70.09682	125.81884	54.87396	160.04337	316.17718
ResNeXt101_32x48d_wsl	224	256	50.97681	137.60926	190.82628	99.01698256	315.91261	551.83695
Fix_ResNeXt101_32x48d_wsl	320	320	78.62869	191.76039	317.15436	160.0838242	595.99296	1151.47384
EfficientNetB0	224	256	3.40122	5.95851	9.10801	3.442	6.11476	9.3304
EfficientNetB1	240	272	5.25172	9.10233	14.11319	5.3322	9.41795	14.60388
EfficientNetB2	260	292	5.91052	10.5898	17.38106	6.29351	10.95702	17.75308
EfficientNetB3	300	332	7.69582	16.02548	27.4447	7.67749	16.53288	28.5939
EfficientNetB4	380	412	11.55585	29.44261	53.97363	12.15894	30.94567	57.38511
EfficientNetB5	456	488	19.63083	56.52299	-	20.48571	61.60252	-
EfficientNetB6	528	560	30.05911	-	-	32.62402	-	-
EfficientNetB7	600	632	47.86087	-	-	53.93823	-	-
EfficientNetB0_small	224	256	2.39166	4.36748	6.96002	2.3076	4.71886	7.21888
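
The latencies above can be converted into throughput and FP16 speedup, as in the sketch below. It assumes each value is the time for one whole batch in milliseconds, which the column headers suggest but do not state outright. The function names are hypothetical.

```python
# (model, batch size, FP16 ms, FP32 ms), copied from the T4 table above.
ROWS = [
    ("ResNeXt101_32x8d_wsl", 8, 34.67802, 67.2283),
    ("EfficientNetB0", 8, 9.10801, 9.3304),
    ("EfficientNetB4", 8, 53.97363, 57.38511),
]


def throughput(batch_size, latency_ms):
    """Images per second, ASSUMING latency_ms covers the whole batch."""
    return batch_size * 1000.0 / latency_ms


def fp16_speedup(fp16_ms, fp32_ms):
    """How many times faster FP16 is than FP32 at the same batch size."""
    return fp32_ms / fp16_ms


if __name__ == "__main__":
    for name, bs, fp16_ms, fp32_ms in ROWS:
        print(f"{name}: {throughput(bs, fp16_ms):.0f} img/s in FP16, "
              f"speedup over FP32 x{fp16_speedup(fp16_ms, fp32_ms):.2f}")
```

On these rows, FP16 nearly halves the latency of ResNeXt101_32x8d_wsl at batch size 8, while the two EfficientNet models gain less than 10%.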
