-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[1/2] Collaborative Strategy #12842
[1/2] Collaborative Strategy #12842
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is an awesome addition. can't wait for the full docs!
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Huge thanks @awaelchli @justusschock for going through and reviewing, really appreciate it ❤️ Two things I need to address:
|
IMO we shouldn't do that. It has one single device per process, but in total allows training on many devices. That way it behaves like an elastic multi-node DDP if the processes are spawned externally.
I think, I'd still mark it as experimental. This gives us more freedom to change stuff (like the per-node DDP thingy you mentioned). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just some small comments 😃
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🔥
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
still reviewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
awesome 🚀
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gatekeeper/pass: @SeanNaren
What does this PR do?
Related #12647
This PR represents the code portion of the strategy. The next PR will be the documentation.
Does your PR introduce any breaking changes? If yes, please list them.
None
Before submitting
PR review
Anyone in the community is welcome to review the PR.
Before you start reviewing, make sure you have read the review guidelines. In short, see the following bullet-list:
Did you have fun?
Make sure you had fun coding 🙃
cc @Borda @akihironitta @justusschock