
Support auto_fp16 using torch.cuda.amp when PyTorch >= 1.6.0 #951

Merged
merged 4 commits on Apr 27, 2021

Conversation

ycxioooong (Contributor)
This PR enables PyTorch's official implementation of automatic mixed-precision training (torch.cuda.amp).
It replaces the original pull request, which was closed by accident.
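At a high level, native AMP wraps the forward pass in autocast and scales the loss with GradScaler before backward. A minimal sketch under PyTorch >= 1.6 (the model, optimizer, and data below are illustrative placeholders, not mmcv's actual hook code):

```python
import torch

# Illustrative model/optimizer; mmcv wires this into its runner hooks instead.
model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
# GradScaler scales the loss to avoid fp16 gradient underflow;
# enabled=False makes it a no-op on CPU-only machines.
scaler = torch.cuda.amp.GradScaler(enabled=torch.cuda.is_available())

inputs = torch.randn(8, 4)
targets = torch.randn(8, 2)

optimizer.zero_grad()
# autocast runs the forward pass in mixed precision on CUDA;
# with enabled=False it is a transparent no-op.
with torch.cuda.amp.autocast(enabled=torch.cuda.is_available()):
    loss = torch.nn.functional.mse_loss(model(inputs), targets)
scaler.scale(loss).backward()
scaler.step(optimizer)  # unscales gradients, then steps the optimizer
scaler.update()
```

The key design point is that autocast only affects the forward pass; gradients and optimizer state stay in fp32, which is what removes the need for the hand-rolled master-weight copies used on older PyTorch.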

@ycxioooong ycxioooong requested review from hellock and ZwwWayne April 15, 2021 13:50
codecov bot commented Apr 15, 2021

Codecov Report

Merging #951 (3bbca69) into master (54ece10) will decrease coverage by 0.66%.
The diff coverage is 14.91%.


@@            Coverage Diff             @@
##           master     #951      +/-   ##
==========================================
- Coverage   65.64%   64.97%   -0.67%     
==========================================
  Files         149      151       +2     
  Lines        9455     9674     +219     
  Branches     1722     1755      +33     
==========================================
+ Hits         6207     6286      +79     
- Misses       2928     3062     +134     
- Partials      320      326       +6     
| Flag | Coverage Δ |
|---|---|
| unittests | 64.97% <14.91%> (-0.67%) ⬇️ |

Flags with carried forward coverage won't be shown.

| Impacted Files | Coverage Δ |
|---|---|
| mmcv/runner/hooks/optimizer.py | 17.74% <10.20%> (-6.01%) ⬇️ |
| mmcv/runner/fp16_utils.py | 59.73% <43.75%> (-1.59%) ⬇️ |
| mmcv/runner/base_module.py | 78.78% <0.00%> (-0.63%) ⬇️ |
| mmcv/utils/registry.py | 98.31% <0.00%> (-0.02%) ⬇️ |
| mmcv/ops/__init__.py | 100.00% <0.00%> (ø) |
| mmcv/ops/box_iou_rotated.py | 100.00% <0.00%> (ø) |
| mmcv/cnn/bricks/transformer.py | 0.00% <0.00%> (ø) |
| mmcv/ops/fused_bias_leakyrelu.py | 30.90% <0.00%> (ø) |
| mmcv/ops/roi_align_rotated.py | 75.38% <0.00%> (ø) |
| mmcv/ops/multi_scale_deform_attn.py | 76.66% <0.00%> (ø) |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 54ece10...3bbca69.

@ycxioooong ycxioooong requested a review from zhouzaida April 16, 2021 11:03
Resolved (outdated) review threads on mmcv/runner/hooks/optimizer.py and mmcv/runner/fp16_utils.py.
ZwwWayne (Collaborator) commented Apr 22, 2021

Benchmark of MMDetection3D with PyTorch 1.8:

| Backbone | Lr schd | FP32 Mem (GB) | FP16 Mem (GB) | FP32 mAP | FP32 NDS | FP16 mAP | FP16 NDS |
|---|---|---|---|---|---|---|---|
| FPN | 2x | 16.4 | 8.37 | 40.0 | 53.3 | 39.27 | 52.95 |
| RegNet-400MF-FPN | 2x | 17.3 | 8.37 | 44.8 | 56.4 | 45.22 | 56.81 |

Benchmark of MMDetection3D with PyTorch 1.5:

| Backbone | Lr schd | FP32 Mem (GB) | FP16 Mem (GB) | FP32 mAP | FP32 NDS | FP16 mAP | FP16 NDS |
|---|---|---|---|---|---|---|---|
| FPN | 2x | 16.4 | 8.40 | 40.0 | 53.3 | 39.06 | 52.71 |
| RegNet-400MF-FPN | 2x | 17.3 | 8.41 | 44.8 | 56.4 | 44.38 | 56.59 |

Benchmark of MMDetection with PyTorch 1.8:

| Architecture | Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | FP16 box AP | FP16 mask AP |
|---|---|---|---|---|---|---|---|---|---|
| Faster R-CNN | R-50 | pytorch | 1x | 2.8 | - | 37.4 | - | 37.3 | - |
| Mask R-CNN | R-50 | pytorch | 1x | 3.6 | - | 38.2 | 34.7 | 38.2 | 34.6 |
| RetinaNet | R-50 | pytorch | 1x | 2.7 | - | 36.7 | - | 36.5 | - |

Benchmark of MMDetection with PyTorch 1.5:

| Architecture | Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | FP16 box AP | FP16 mask AP |
|---|---|---|---|---|---|---|---|---|---|
| Faster R-CNN | R-50 | pytorch | 1x | 3.1 | - | 37.4 | - | 37.3 | - |
| Mask R-CNN | R-50 | pytorch | 1x | 3.1 | - | 38.2 | 34.7 | 38.3 | 34.7 |
| RetinaNet | R-50 | pytorch | 1x | 2.76 | - | 36.7 | - | 36.2 | - |

@ZwwWayne ZwwWayne self-assigned this Apr 24, 2021
xvjiarui (Collaborator) left a comment:

Verified in MMSeg

```python
"""Copy updated params from fp32 weight copy to fp16 model."""
for fp16_param, fp32_param in zip(fp16_net.parameters(), fp32_weights):
    fp16_param.data.copy_(fp32_param.data)
```
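This copy is one half of the fp32 master-weight round trip used on the pre-1.6 path: the optimizer steps on fp32 master copies, and the updated values are cast back into the fp16 model. A minimal sketch of the full pattern (function and variable names here are illustrative, not mmcv's actual API):

```python
import torch

def copy_params_to_fp16(fp16_net, fp32_weights):
    """Copy updated params from the fp32 master copy back to the fp16 model."""
    for fp16_param, fp32_param in zip(fp16_net.parameters(), fp32_weights):
        # copy_ casts fp32 -> fp16 automatically
        fp16_param.data.copy_(fp32_param.data)

# Illustrative fp16 model and its fp32 master weights.
net = torch.nn.Linear(2, 2).half()
master = [p.detach().clone().float() for p in net.parameters()]

# An optimizer would update `master` in fp32; simulate one update here,
# then sync the fp16 model from the masters.
for p in master:
    p.data.add_(0.5)
copy_params_to_fp16(net, master)
```

Keeping the authoritative weights in fp32 avoids the update being lost to fp16 rounding when the learning rate times the gradient is smaller than fp16 resolution.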
```python
if TORCH_VERSION != 'parrots' and TORCH_VERSION >= '1.6.0':
```

A collaborator commented on the line above:

Last comment: the indentation is a little annoying. Can we create two hooks first, namely PT16Fp16OptimizerHook and PT15Fp16OptimizerHook, and then assign one of them to Fp16OptimizerHook according to the PyTorch version?
