State maintenance in DP #565
Comments
@S-aiueo32 yeah, this is a limitation of PyTorch. I've been looking at how to maintain state when using DP but there seems to be no clear way... @pietern I think we talked about this a few months ago. Any suggestions on how to maintain state when using DP? |
DP replicates the source module for every call to forward. If you want to maintain state, you can't do this and rather should replicate once and then broadcast parameters and buffers from the source module before each forward call. |
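A minimal, self-contained illustration of that behaviour (the module and attribute names are made up for the example; it only exercises DP when more than one GPU is visible):

```python
import torch
import torch.nn as nn


class Toy(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 4)
        self.cached = None  # state we would like to keep between forward calls

    def forward(self, x):
        out = self.fc(x)
        self.cached = out  # written to the *replica*, not to the source module
        return out


if torch.cuda.device_count() > 1:
    model = nn.DataParallel(Toy().cuda())
    model(torch.randn(8, 4).cuda())
    # The replicas are discarded after forward, so the attribute set inside
    # forward never reaches the wrapped source module:
    print(model.module.cached)  # -> None
```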
@williamFalcon @pietern |
actually, it should be avoidable given the explanation above. we just need to make the appropriate changes to the dp subclass |
This should be a companion class to |
Just wanted to check if there was any update/advice on this type of issue? I've got a similar situation with a GAN producing images in the first optimizer iteration then using them to update the discriminator in the second. It works well on a single GPU, but when distributing I run into the same issue. I initially thought adding the property as a buffer would maintain it, but it seems to be flushed when using DP in the same way. Is the only solution to run the generator in the discriminator's optimizer iteration? |
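For reference, a hedged sketch of that workaround, i.e. regenerating the fakes inside the discriminator's branch instead of caching them from the generator's branch. Every name here (self.generator, self.discriminator, self.adversarial_loss, self.latent_dim) is an assumption for illustration, not taken from this thread:

```python
import torch


def training_step(self, batch, batch_idx, optimizer_idx):
    imgs, _ = batch
    valid = torch.ones(imgs.size(0), 1, device=imgs.device)
    fake_lbl = torch.zeros(imgs.size(0), 1, device=imgs.device)
    z = torch.randn(imgs.size(0), self.latent_dim, device=imgs.device)

    if optimizer_idx == 0:
        # Generator update: nothing is cached on self for later steps.
        g_loss = self.adversarial_loss(self.discriminator(self.generator(z)), valid)
        return {'loss': g_loss}

    if optimizer_idx == 1:
        # Discriminator update: regenerate the fake images here instead of
        # reading a cached attribute that DP would have discarded along with
        # the replica it was stored on.
        fake = self.generator(z).detach()
        real_loss = self.adversarial_loss(self.discriminator(imgs), valid)
        fake_loss = self.adversarial_loss(self.discriminator(fake), fake_lbl)
        return {'loss': 0.5 * (real_loss + fake_loss)}
```

The cost is an extra generator forward pass per step, but each optimizer branch is then self-contained, which is what DP's per-call replication requires.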
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
This issue has been automatically marked as stale because it hasn't had any recent activity. This issue will be closed in 7 days if no further activity occurs. Thank you for your contributions, Pytorch Lightning Team! |
In many image generation tasks with GANs, the generator and discriminator are trained on the same generated image within a single iteration.
In PyTorch Lightning, the procedure is written like below:
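A minimal sketch of the pattern being described, using the older Lightning multi-optimizer hook; only self.foo_out is taken from the issue text, while the generator/discriminator calls and loss helpers are placeholder assumptions:

```python
# Sketch only: self.generator / self.discriminator / the loss helpers are
# placeholders; self.foo_out is the attribute named in the issue.
def training_step(self, batch, batch_nb, optimizer_i):
    if optimizer_i == 0:
        # Generator step: generate images and cache them on the module.
        self.foo_out = self.generator(batch)
        return {'loss': self.generator_loss(self.foo_out)}

    if optimizer_i == 1:
        # Discriminator step: reuse the images cached above. Under DP the
        # attribute was written to a throw-away replica, so it is gone here.
        return {'loss': self.discriminator_loss(self.foo_out.detach(), batch)}
```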
It works well on a single GPU; however, self.foo_out has been flushed in the optimizer_i == 1 branch when DP is set. I think this is undesired behavior. Any help or fix?