
[Hybrid Performance] Move the cast op of AMP which cast fp32 param to fp16 param to the optimizer #34965

Merged

Conversation

Contributor

@wangxicoding wangxicoding commented Aug 17, 2021

PR types

Performance optimization

PR changes

Others

Describe

Move the cast op of AMP, which casts the fp32 param to an fp16 param, into the optimizer.
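In effect, the fp32-to-fp16 cast no longer runs in the forward pass of every micro-batch; instead the fp16 copy of each parameter is refreshed once per optimizer step, right after the fp32 update. A minimal, framework-free sketch of the cast-count difference (the function and numbers are illustrative only, not the actual pass implementation):

# Illustrative sketch: how many fp32->fp16 casts run per optimizer step.
accumulate_steps = 4  # micro-batches per optimizer step, as in the config below

def casts_per_optimizer_step(optimize_cast, num_fp16_params):
    if optimize_cast:
        # cast moved into the optimizer: each parameter is cast once,
        # right after its fp32 master copy is updated
        return num_fp16_params
    # default AMP behavior: each fp16 parameter is cast from fp32
    # in the forward pass of every micro-batch
    return num_fp16_params * accumulate_steps

print(casts_per_optimizer_step(False, 100))  # 400 casts per step
print(casts_per_optimizer_step(True, 100))   # 100 casts per step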
Usage: set optimize_cast to True in sharding_configs:

import paddle.distributed.fleet as fleet

strategy = fleet.DistributedStrategy()
strategy.sharding = True
strategy.sharding_configs = {
    "sharding_degree": 1,
    "mp_degree": 1,
    "pp_degree": 2,
    "dp_degree": 2,
    "optimize_cast: True,
}
strategy.pipeline = True
strategy.pipeline_configs = {
    "schedule_mode": "1F1B",
    "micro_batch_size": 2,
    "accumulate_steps": 4,
}
strategy.amp = True
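For context, a sketch of how this strategy would typically be wired into a fleet optimizer in static-graph mode; the helper function name, the Adam optimizer, and the loss variable are placeholders, and the model program is assumed to be built elsewhere:

import paddle
import paddle.distributed.fleet as fleet

paddle.enable_static()
fleet.init(is_collective=True)

def build_distributed_optimizer(strategy, loss):
    # strategy: the DistributedStrategy configured above
    # loss: the loss variable of the model's main program
    optimizer = paddle.optimizer.Adam(learning_rate=1e-4)
    optimizer = fleet.distributed_optimizer(optimizer, strategy)
    optimizer.minimize(loss)
    return optimizer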

Test

Tested on 1 node * 8 cards of 32GB V100, with the Ernie 3.0 model.

Model config, Ernie 3.0 base:

config                        value
hidden size                   768
num attention heads           12
num hidden layers             15
num sharing layers            12
branch hidden size            256
branch num attention heads    4

batch size configs:

micro bsz    global bsz
2            256

Performance:

hybrid_config    optimize_cast    throughput (tokens/s)    improve
8pp              false            35849                    -
8pp              true             37747                    +5.29%
2mp+2pp+2dp      false            41450                    -
2mp+2pp+2dp      true             44543                    +7.46%

@paddle-bot-old

Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@wangxicoding wangxicoding changed the title [hybrid] remove fp32 param cast [Hybrid Performance] Move the cast op of AMP which cast fp32 param to fp16 param to optimizer Aug 17, 2021
@wangxicoding wangxicoding changed the title [Hybrid Performance] Move the cast op of AMP which cast fp32 param to fp16 param to optimizer [Hybrid Performance] Move the cast op of AMP which cast fp32 param to fp16 param to the optimizer Aug 17, 2021

@sandyhouse sandyhouse left a comment


LGTM

Contributor

@gongweibao gongweibao left a comment


Do we need to verify precision alignment and provide the performance-improvement numbers on NPU?

Contributor

@JZ-LIANG JZ-LIANG left a comment


LGTM

Contributor

@gongweibao gongweibao left a comment


LGTM

@gongweibao gongweibao merged commit a9673b4 into PaddlePaddle:develop Aug 18, 2021
@wangxicoding wangxicoding deleted the hybird_remove_fp32param_cast branch August 18, 2021 11:31
FeixLiu pushed a commit to FeixLiu/Paddle that referenced this pull request Aug 31, 2021
wangxicoding added a commit that referenced this pull request Aug 31, 2021
…fp32 param to fp16 param to the optimizer (#34965) (#35296)

Co-authored-by: WangXi <wangxi16@baidu.com>
FeixLiu added a commit to FeixLiu/Paddle that referenced this pull request Sep 2, 2021
…ch cast fp32 param to fp16 param to the optimizer (PaddlePaddle#34965) (PaddlePaddle#35296)"

This reverts commit 6fb58ae.
FeixLiu added a commit to FeixLiu/Paddle that referenced this pull request Sep 3, 2021
PaddlePaddle#35116) (PaddlePaddle#35301)"

This reverts commit 2931df5.

Revert "[cherry-pick][hybrid performance] optim npu coalesce set constant (PaddlePaddle#35105) (PaddlePaddle#35302)"

This reverts commit 12260bd.

Revert "[cherry-pick][hybrid performance] optim the grad fuse for pipeline mode by sorting the grad by dtype (PaddlePaddle#35070) (PaddlePaddle#35300)"

This reverts commit e69cc21.

Revert "[cherry-pick][hybrid performance] Grad fuse for gradient merge under pipeline mode (PaddlePaddle#35004) (PaddlePaddle#35299)"

This reverts commit e931cd1.

Revert "Add flags to control whether to check Nan value of hccl_allreduce_sum. (PaddlePaddle#35093) (PaddlePaddle#35298)"

This reverts commit d4948bc.

Revert "[hybrid] Fix row parallel linear bias (PaddlePaddle#35186) (PaddlePaddle#35297)"

This reverts commit b36fb03.

Revert "[hybrid][npu] fix npu clear float status in pipeline (PaddlePaddle#35165) (PaddlePaddle#35295)"

This reverts commit 167685e.

Revert "[hybrid npu] fix npu found_finite in hybrid (PaddlePaddle#35134) (PaddlePaddle#35291)"

This reverts commit e64105f.

Revert "[cherry-pick][Hybrid Performance] Move the cast op of AMP which cast fp32 param to fp16 param to the optimizer (PaddlePaddle#34965) (PaddlePaddle#35296)"

This reverts commit 6fb58ae.

Revert "[cherry-pick] NPU use squared_l2_norm in GradientClipByGlobalNorm (PaddlePaddle#34836) (PaddlePaddle#35289)"

This reverts commit 38c27d5.