Adagrad not working with GPU and DDP #6824
Labels: bug, distributed, help wanted, priority: 0
🐛 Bug
Adagrad doesn't work with GPUs or DDP because the optimizer is created before the model is moved to CUDA, so the optimizer's state stays on the CPU. I believe this issue was already addressed in an earlier version: #554
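The mechanism described above can be sketched in plain PyTorch, outside of Lightning. This is a minimal illustration, not the code from the linked notebook: the `nn.Linear` model and the `state_devices` helper are hypothetical stand-ins, and it assumes that `torch.optim.Adagrad` allocates its `sum` accumulator in its constructor, as it does in the PyTorch versions I am aware of.

```python
import torch
from torch import nn


def state_devices(optimizer):
    """Return the device types that hold the optimizer's tensor state (hypothetical helper)."""
    return {
        value.device.type
        for param_state in optimizer.state.values()
        for value in param_state.values()
        if torch.is_tensor(value)
    }


model = nn.Linear(4, 2)  # stand-in model; parameters start on the CPU
optimizer = torch.optim.Adagrad(model.parameters(), lr=0.1)

# Adagrad fills in per-parameter state at construction time, so the state
# already exists here and lives on the device the parameters were on.
devices_at_creation = state_devices(optimizer)
print("optimizer state at creation:", devices_at_creation)

if torch.cuda.is_available():
    model.cuda()  # moves the parameters in place; the optimizer state is not moved
    param_devices = {p.device.type for p in model.parameters()}
    print("params:", param_devices, "optimizer state:", state_devices(optimizer))
```

On a machine with a GPU, the last line should show the parameters on `cuda` while at least part of the optimizer state is still on `cpu`. Creating the optimizer after the model has been moved avoids the mismatch in this sketch, because the state is then allocated on the GPU from the start.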
How to reproduce using the BoringModel
https://colab.research.google.com/drive/1HfyL5htoOkPETggTLwYNfh94HrNc6TOS?usp=sharing
The error occurs when using Adagrad on a single GPU as well as on multiple GPUs.
Stack trace
Environment