
[HybridParallel]Fix bug of p2p for partial_send/recv #34615

Merged: 2 commits into PaddlePaddle:develop on Aug 5, 2021

Conversation

@ForFishes (Member) commented Aug 4, 2021

PR types

Bug fixes

PR changes

Others

Describe

[HybridParallel]Fix bug of p2p for partial_send/recv.
This fixes the following two issues (a sketch of the resulting fallback logic follows the list):
1. Allow tensors whose size is not divisible by mp_degree to be passed between devices.
2. Pass the stop_gradient attribute between devices.
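A minimal sketch of the send-side fallback described in fix 1, assuming a hypothetical helper name (`p2p_send`) and emulating partial_send by slicing, since the real partial_send is a framework-internal op:

```python
import paddle
import paddle.distributed as dist

def p2p_send(tensor, dst, mp_degree, mp_rank):
    """Send only this rank's 1/mp_degree slice when the tensor splits
    evenly across the model-parallel group; otherwise send it whole."""
    numel = 1
    for d in tensor.shape:
        numel *= d
    if numel % mp_degree == 0:
        # Emulate partial_send: transfer only the local slice.
        slice_len = numel // mp_degree
        flat = paddle.reshape(tensor, [-1])
        part = paddle.slice(flat, axes=[0],
                            starts=[mp_rank * slice_len],
                            ends=[(mp_rank + 1) * slice_len])
        dist.send(part, dst=dst)
    else:
        # Not evenly divisible by mp_degree: fall back to the native
        # send of the whole tensor (fix 1 in this PR).
        dist.send(tensor, dst=dst)
```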

@wangxicoding (Contributor) left a comment:

LGTM

On the diff (send path):

        'ring_id', ring_id, 'peer', dst, 'num',
        nranks, 'id', rank_id)
else:
    return paddle.distributed.send(
@wangxicoding (Contributor) commented:

You could set nranks=1 and rank_id=0 and just keep using partial_send.

@ForFishes (Member, Author) replied:

Yes, both would work. But since the condition is checked anyway, I think calling the native send is preferable. 😊
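For reference, under the reviewer's suggestion the partial path degenerates to a full send, since a single slice covers the whole tensor (reusing the hypothetical `p2p_send` sketch above):

```python
# Degenerate case: with mp_degree=1 and mp_rank=0 the slice is the
# whole tensor, so the partial path behaves like a plain dist.send.
x = paddle.ones([5, 3])  # 15 elements, trivially divisible by 1
p2p_send(x, dst=1, mp_degree=1, mp_rank=0)
```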

On the diff (recv path):

        'id', rank_id, 'dtype', tensor.dtype, 'out_shape',
        tensor.shape)
else:
    paddle.distributed.recv(
@wangxicoding (Contributor) commented:

Same as above.
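A mirrored receive-side sketch (again a hypothetical helper, not the real partial_recv op): the shape and dtype must be known up front to allocate the output buffer, which is why `dtype` and `out_shape` appear in the diff context above.

```python
def p2p_recv(shape, dtype, src, mp_degree):
    """Receive this rank's 1/mp_degree slice when the tensor splits
    evenly; otherwise receive the whole tensor with the native API."""
    numel = 1
    for d in shape:
        numel *= d
    if numel % mp_degree == 0:
        # Emulate partial_recv: receive only the local slice.
        buf = paddle.zeros([numel // mp_degree], dtype=dtype)
        dist.recv(buf, src=src)
        return buf
    # Not evenly divisible: fall back to the native recv (fix 1).
    buf = paddle.zeros(shape, dtype=dtype)
    dist.recv(buf, src=src)
    return buf
```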

@ForFishes ForFishes merged commit 4cc3d9a into PaddlePaddle:develop Aug 5, 2021
@ForFishes ForFishes deleted the fix_pp_send_recv branch August 5, 2021 02:49