Feature request: FSDP for TPUs #422
Comments
Once the next release of PyTorch XLA is out, we'll start taking a look at this.
Hey @muellerzr, is there ongoing work for adding XLA support to FSDP? We, on the AWS SageMaker training compiler side, have started looking into XLA-FSDP and might be able to contribute to adding such support to accelerate.
@Vatshank not yet! It's the next thing on my list to get to after TPU pod support, so would love the help if you guys can! 🙏
Okay cool @muellerzr! Although our focus is on GPUs, I am sure there will be significant overlap in the code for adding support for either device type. What do you think would be a good way to discuss some of these implementation details? If you guys have a shared Slack group for development, for instance. Also happy to continue to bug you on GitHub, if that's preferred :)
@Vatshank this GitHub issue should be fine!
@AlexWertheim With your recent PR, can we call this request done?
Yeah, I think so. For reference, the PR in question can be seen here. @muellerzr can say better than I can whether this fulfills all requirements where accelerate is concerned. |
A recent contribution to the pytorch_xla repo allows using FSDP in PyTorch XLA for sharding Module parameters across data-parallel workers. pytorch/xla#3431
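For context, here is a minimal sketch of what the XLA FSDP wrapper from that PR looks like from user code. The class and module path follow the torch_xla implementation; the toy model, shapes, and hyperparameters are placeholders, so check the pytorch_xla docs for your release.

```python
import torch
import torch.nn as nn
import torch_xla.core.xla_model as xm
from torch_xla.distributed.fsdp import XlaFullyShardedDataParallel as FSDP

device = xm.xla_device()

# Toy model stands in for a real network; the FSDP wrapper shards its
# parameters across the data-parallel XLA workers.
model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10)).to(device)
model = FSDP(model)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# One training step on dummy data.
inputs = torch.randn(8, 512, device=device)
targets = torch.randint(0, 10, (8,), device=device)

loss = nn.functional.cross_entropy(model(inputs), targets)
loss.backward()
optimizer.step()   # per the torch_xla FSDP notes, call step() directly rather than
                   # xm.optimizer_step(), since FSDP already reduces sharded gradients
xm.mark_step()     # materialize the lazy XLA graph
```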
Some motivation behind this: it may be possible to perform inference with OPT 30B on Google Colab without needing a Pro subscription, which I think many people will appreciate.
What will be needed to add it to accelerate?
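To make the question concrete, one hypothetical shape the integration could take from the user's side is sketched below, reusing accelerate's existing FullyShardedDataParallelPlugin and prepare() flow. Routing this through the XLA wrapper on TPUs is exactly the open work this issue asks about, so none of it is settled API.

```python
# Hypothetical sketch only: assumes accelerate's existing FSDP plugin could be
# taught to wrap models with XlaFullyShardedDataParallel when running on TPUs.
import torch
import torch.nn as nn
from accelerate import Accelerator, FullyShardedDataParallelPlugin

fsdp_plugin = FullyShardedDataParallelPlugin()      # existing plugin; XLA routing is assumed
accelerator = Accelerator(fsdp_plugin=fsdp_plugin)

model = nn.Linear(512, 10)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# prepare() is where accelerate would have to apply the XLA FSDP wrapper
# instead of (or in addition to) its current device placement logic.
model, optimizer = accelerator.prepare(model, optimizer)

inputs = torch.randn(8, 512, device=accelerator.device)
targets = torch.randint(0, 10, (8,), device=accelerator.device)

loss = nn.functional.cross_entropy(model(inputs), targets)
accelerator.backward(loss)
optimizer.step()
optimizer.zero_grad()
```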