FP8 mixed precision via NVIDIA's Transformer Engine #17172
Comments
There is one more thing. The user needs to replace their layers with the custom ones from the library. What's the plan here? Will the plugin implement the
Yes, we'll need to implement a replacement mechanism. The plugin can have a flag to disable it if necessary. This also means that we'll have it in Fabric first, as these APIs do not exist in the Trainer yet.
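The replacement mechanism discussed above could look roughly like the sketch below. This is illustrative only: `convert_module` and the `replace_layers` flag are hypothetical names, not the actual Lightning or Transformer Engine API, and plain classes stand in for `torch.nn` / `transformer_engine.pytorch` modules so the recursive-swap pattern is visible without a GPU.

```python
class Module:
    """Stand-in for torch.nn.Module: submodules are plain attributes."""

    def named_children(self):
        return [(k, v) for k, v in vars(self).items() if isinstance(v, Module)]


class Linear(Module):
    """Stand-in for torch.nn.Linear."""

    def __init__(self, in_features, out_features):
        self.in_features = in_features
        self.out_features = out_features


class TELinear(Linear):
    """Stand-in for transformer_engine.pytorch.Linear (FP8-capable)."""


def convert_module(module, replace_layers=True):
    """Recursively swap Linear submodules for their TE counterparts.

    `replace_layers=False` mirrors the flag proposed above, for users
    who have already replaced the layers themselves.
    """
    if not replace_layers:
        return module
    for name, child in module.named_children():
        if type(child) is Linear:
            setattr(module, name, TELinear(child.in_features, child.out_features))
        else:
            convert_module(child, replace_layers)
    return module
```

The key design point is that the swap preserves each layer's constructor arguments, and the opt-out flag leaves user-converted models untouched.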
Actually
Any update on support for this?
@nanand2 Our access to H100s is very limited, so we haven't merged this yet. However, the branch https://github.com/Lightning-AI/lightning/tree/carmocca/transformer-engine should be usable if you want to play with it right now.
Great, thanks!
Description & Motivation
Support https://docs.nvidia.com/deeplearning/transformer-engine/user-guide/index.html
Pitch
Write a precision plugin using the library above that is enabled via:
precision="transformer-engine"
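A minimal sketch of how such a precision string could dispatch to a plugin instance. All names here (`resolve_precision`, `TransformerEnginePrecision`, `_PRECISION_PLUGINS`) are hypothetical, not Lightning's real internals; the real plugin would also wrap forward passes in Transformer Engine's FP8 autocast context and drive the layer replacement discussed earlier in the thread.

```python
class TransformerEnginePrecision:
    """Hypothetical plugin selected by precision="transformer-engine"."""

    precision = "transformer-engine"


_PRECISION_PLUGINS = {
    "transformer-engine": TransformerEnginePrecision,
    # existing entries such as "16-mixed", "bf16-mixed", ... would live here
}


def resolve_precision(precision):
    """Map a user-facing precision string to a plugin instance."""
    try:
        return _PRECISION_PLUGINS[precision]()
    except KeyError:
        raise ValueError(f"Unsupported precision: {precision!r}") from None
```

Registering the plugin under a string keeps the user-facing change to a single argument, which is the point of the pitch.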
Alternatives
Don't implement this until it's vendored by PyTorch, if that ever happens.
Additional context
No response
cc @Borda @carmocca @justusschock @awaelchli