multi-gpu ddp calls validation and testing loops too many times #1161
Comments
Latest pull - 1 hour ago, no longer shows this behavior. Closing.
Sorry - this issue still exists in some configurations. My proposed fix is not the total picture. Still investigating - will provide a reproducible example.
Testing underway. Will make a PR tomorrow.
Don't want to clutter up the PR world if no one is interested in this. Let me know...
That sounds like a good contribution to me... mind sending a PR?
Will do, on both the PR and the hash ref.
When using ddp with multiple GPUs, the validation and test loops are each called with the entire validation dataset on every GPU.
Expected behavior is that the dataset is divided appropriately across the GPUs.
I am using current master (cloned Mar 14), Ubuntu 19.10, CUDA 10.1, Python 3.7.5, PyTorch 1.4, in a venv environment.
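
For reference, this is the splitting I would expect - a minimal single-process sketch where `num_replicas` and `rank` are passed to `DistributedSampler` explicitly rather than taken from the process group:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler

# Toy 8-sample dataset; with 2 processes, each rank should see a
# disjoint half of the data, not all 8 samples.
dataset = TensorDataset(torch.arange(8))

for rank in range(2):
    sampler = DistributedSampler(dataset, num_replicas=2, rank=rank, shuffle=False)
    loader = DataLoader(dataset, sampler=sampler, batch_size=4)
    for (batch,) in loader:
        print(f"rank {rank}: {batch.tolist()}")
# rank 0: [0, 2, 4, 6]
# rank 1: [1, 3, 5, 7]
```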
The problem appears to be in `auto_add_sampler()` in data_loading.py: it does not create a `DistributedSampler` for the validation or test datasets.
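
Until that's fixed, a workaround that should give the expected sharding is to attach the sampler by hand in the LightningModule. A minimal sketch - `MyValDataset` and the batch size are placeholders, and the `is_initialized()` guard keeps the loader usable outside ddp:

```python
import torch.distributed as dist
import pytorch_lightning as pl
from torch.utils.data import DataLoader
from torch.utils.data.distributed import DistributedSampler


class MyModel(pl.LightningModule):
    # ... training_step, validation_step, etc. ...

    def val_dataloader(self):
        dataset = MyValDataset()  # placeholder for the real dataset
        sampler = None
        if dist.is_available() and dist.is_initialized():
            # Shard the validation set across processes ourselves,
            # since auto_add_sampler() currently skips val/test loaders.
            sampler = DistributedSampler(dataset, shuffle=False)
        return DataLoader(dataset, batch_size=32, sampler=sampler)
```

An equivalent override of `test_dataloader` covers the test loop.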