Fix: gather_all_tensors across GPUs in DDP #3319
Conversation
Codecov Report
```
@@           Coverage Diff           @@
##           master   #3319   +/-   ##
=======================================
  Coverage      90%      90%
=======================================
  Files          90       90
  Lines        8158     8158
=======================================
  Hits         7362     7362
  Misses        796      796
```
LGTM
```diff
@@ -301,7 +301,7 @@ def gather_all_tensors_if_available(result: Union[torch.Tensor],

     world_size = torch.distributed.get_world_size(group)

-    gathered_result = world_size * [torch.zeros_like(result)]
+    gathered_result = [torch.zeros_like(result) for _ in range(world_size)]
```
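The fix matters because multiplying a list replicates references to the same tensor object, so every slot in `gathered_result` aliased one storage and `all_gather` would overwrite each entry with the last-written value. A minimal sketch below demonstrates the aliasing on plain CPU tensors (no distributed setup needed; the variable names are illustrative):

```python
import torch

result = torch.tensor([1.0, 2.0])
world_size = 4

# Buggy: list multiplication repeats references to ONE tensor,
# so every entry shares the same underlying storage.
buggy = world_size * [torch.zeros_like(result)]
buggy[0].fill_(42.0)
print(buggy[1])  # tensor([42., 42.]) -- all entries changed together

# Fixed: the list comprehension allocates a fresh tensor per rank.
fixed = [torch.zeros_like(result) for _ in range(world_size)]
fixed[0].fill_(42.0)
print(fixed[1])  # tensor([0., 0.]) -- entries are independent
```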
Could you add a test to tests/metrics/test_converters.py that actually tests that this function does what it is expected to do?
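A rough sketch of what such a test could look like is below. The import path and helper names are assumptions for illustration (the real test belongs in tests/metrics/test_converters.py and may be structured differently); it uses the gloo backend so it runs on CPU:

```python
import os

import torch
import torch.distributed as dist
import torch.multiprocessing as mp

# Assumed import path for illustration only; the actual module
# location in the repo may differ.
from pytorch_lightning.metrics.converters import gather_all_tensors_if_available


def _test_worker(rank, world_size):
    os.environ["MASTER_ADDR"] = "127.0.0.1"
    os.environ["MASTER_PORT"] = "29501"
    dist.init_process_group("gloo", rank=rank, world_size=world_size)

    # Each rank contributes a distinct tensor, so the old aliasing bug
    # (every gathered entry sharing one storage) would be caught here.
    tensor = torch.ones(2) * rank
    gathered = gather_all_tensors_if_available(tensor)

    assert len(gathered) == world_size
    for i, t in enumerate(gathered):
        assert torch.allclose(t, torch.ones(2) * i)

    dist.destroy_process_group()


def test_gather_all_tensors():
    world_size = 2
    mp.spawn(_test_worker, args=(world_size,), nprocs=world_size)
```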
Yeah, thanks for the advice, I will add a test for this PR.
waiting for a test to be added :]
Hello @ShomyLiu! Thanks for updating this PR.
Comment last updated at 2020-09-03 09:34:04 UTC
@SkafteNicki Hi, I found that the commit history got disordered when I went to update a test case in my PR.
@Borda can you help fix the commit history (you seem to be good at git :])?
@ShomyLiu can you enable push access for maintainers? I tried to push to your branch and it was rejected.
@justusschock Hi, I have invited you to my branch, so you can push yourself.
force-pushed from 7e0596d to ba8afb6
I rebased on master, so this should be fine now, I think :)
@justusschock Yeah, thanks for your effort. This is my first PR to Lightning, so I wasn't familiar with the whole merge process before.
Just waiting for the tests to pass, and then we can get this merged :]
👍
LGTM =)
What does this PR do?
This PR fixes a bug in the gather_all_tensors function where all entries of the gathered list shared the same underlying storage across GPUs.
Fixes #3253
Before submitting
PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.
Did you have fun?
Make sure you had fun coding 🙃