1533 Fix distributed data parallel issue in ClassificationSaver #1535
Conversation
I mentioned some small things only. According to the code coverage output, the tests aren't being run under multi-GPU with PyTorch 1.7, so `string_list_all_gather` isn't being tested, but it looks OK.
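For context, below is a minimal sketch of the byte-encode-and-pad technique such a helper can use under `torch.distributed`. This is an illustration only, not MONAI's exact implementation; it assumes an already-initialized NCCL process group, CUDA tensors, and delimiter-free strings on every rank.

```python
import torch
import torch.distributed as dist


def string_list_all_gather(strings, delimiter="\t"):
    """Sketch: gather a list of strings from every rank onto every rank."""
    world_size = dist.get_world_size()
    if world_size <= 1:
        return strings

    device = torch.device("cuda", torch.cuda.current_device())

    # Encode the local list as a 1-D uint8 tensor of UTF-8 bytes.
    joined = delimiter.join(strings)
    data = torch.tensor(list(joined.encode("utf-8")), dtype=torch.uint8, device=device)

    # Share lengths first: all_gather requires same-shaped tensors on every
    # rank, so each rank must pad its buffer to the global maximum length.
    length = torch.tensor([data.numel()], dtype=torch.long, device=device)
    lengths = [torch.zeros_like(length) for _ in range(world_size)]
    dist.all_gather(lengths, length)
    max_len = int(torch.stack(lengths).max())

    padded = torch.zeros(max_len, dtype=torch.uint8, device=device)
    padded[: data.numel()] = data

    # Gather the padded byte buffers, then strip the padding and decode.
    gathered = [torch.zeros_like(padded) for _ in range(world_size)]
    dist.all_gather(gathered, padded)

    result = []
    for buf, n in zip(gathered, lengths):
        text = bytes(buf[: int(n)].tolist()).decode("utf-8")
        result.extend(text.split(delimiter))
    return result
```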
@ericspod the code coverage is inaccurate in our current setting, as the multiprocess executions are not tracked properly. I'll create an issue.
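For reference, coverage.py does have built-in support for measuring code run in child processes; one possible configuration is sketched below. Whether this integrates cleanly with MONAI's multi-GPU test launcher is an assumption, not something confirmed in this thread.

```ini
# .coveragerc — a hedged sketch: write one data file per process so code
# executed in spawned workers (e.g. via torch.multiprocessing) is measured.
[run]
parallel = True
concurrency = multiprocessing
```

After the test run, the per-process data files are merged with `coverage combine` before `coverage report`.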
Fixes #1533.
Description
This PR fixes the file-saving issue of ClassificationSaver in distributed data parallel mode; a usage sketch follows below.
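As an illustration of the scenario being fixed, here is a hedged usage sketch. The handler name and the `output_dir`/`filename`/`overwrite`/`batch_transform`/`output_transform` arguments follow MONAI's ClassificationSaver, but the meta-dict key, output key, and paths are placeholders.

```python
import torch.distributed as dist
from monai.handlers import ClassificationSaver

# Each DDP process initializes the group once, e.g.:
# dist.init_process_group(backend="nccl", init_method="env://")

# In distributed mode the handler must gather predictions and their
# filenames from all ranks so that a single rank writes one complete CSV,
# instead of every rank overwriting the same file with a partial result.
saver = ClassificationSaver(
    output_dir="./output",                                   # placeholder path
    filename="predictions.csv",
    overwrite=True,
    batch_transform=lambda batch: batch["image_meta_dict"],  # assumed meta key
    output_transform=lambda output: output["pred"],          # assumed output key
)
# saver.attach(evaluator)  # evaluator is an ignite Engine (not shown here)
```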
Status
Ready
Types of changes
./runtests.sh --codeformat --coverage
../runtests.sh --quick
.make html
command in thedocs/
folder.