fix new_group for hang problem
ForFishes committed Jun 3, 2021
1 parent 901796f commit e4fa0a6
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion python/paddle/distributed/collective.py
@@ -262,10 +262,12 @@ def new_group(ranks=None, backend=None):
                                      place).init_with_ring_id(ring_id)
         else:
             assert False, ("no cuda device found")
+    else:
+        return gp

     # TODO(shenliang03): This is a temporary solution to solve the problem of
     # hang caused by cross-creation of new_group
-    tmp = paddle.to_tensor([0])
+    tmp = fill_constant([0], dtype="int32", value="1")
     paddle.distributed.all_reduce(tmp, use_calc_stream=True)
     paddle.distributed.wait(tmp)
     return gp
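
The TODO above relies on a blocking collective acting as a synchronization point after group creation. The following is only a rough, standard-library sketch of that general idea, not Paddle's implementation: threads stand in for ranks, and `SimulatedAllReduce`, `create_group_then_sync`, and `run` are hypothetical names invented for illustration. It assumes every rank takes part in the collective and does not model NCCL rings or the early `return gp` branch.

```python
import threading


class SimulatedAllReduce:
    """Toy, single-use sum all-reduce across threads (hypothetical helper)."""

    def __init__(self, world_size):
        self.world_size = world_size
        self._lock = threading.Lock()
        self._total = 0
        # The timeout makes a missing participant raise instead of hanging forever.
        self._barrier = threading.Barrier(world_size, timeout=5)

    def all_reduce(self, value):
        # Each rank adds its contribution under a lock.
        with self._lock:
            self._total += value
        # No rank proceeds until every rank has contributed, so the call
        # doubles as a synchronization point.
        self._barrier.wait()
        return self._total


def create_group_then_sync(rank, comm, results):
    # Stand-in for the real group setup work (assumption for illustration).
    group = {"ranks": list(range(comm.world_size)), "local_rank": rank}
    # Dummy collective: the value is irrelevant, only the blocking matters.
    results[rank] = comm.all_reduce(1)
    return group


def run(world_size=4):
    comm = SimulatedAllReduce(world_size)
    results = [None] * world_size
    threads = [
        threading.Thread(target=create_group_then_sync, args=(r, comm, results))
        for r in range(world_size)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling `run(4)` returns one reduced value per simulated rank; no rank leaves `create_group_then_sync` before all of them have reached the collective.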

1 comment on commit e4fa0a6

@paddle-bot-old

Congratulations! Your pull request passed all required CI. You can ask reviewer(s) to approve and merge. 🎉
