Layout conversion error on H100 #4418
Hello,

My modified flash attention kernel gives the following error when I run it on an H100 GPU, even though the kernel works fine on an A100 and an RTX 3060:
I've reduced the kernel to the following minimal example, which crashes on the H100 with the same error but runs successfully on my RTX 3060:
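(The reporter's actual minimal example didn't survive in this scrape. As a rough, hypothetical sketch of the pattern involved: a Triton kernel whose `tl.dot` result is fed through an op that requires a different layout, such as `tl.trans`, is the kind of code that exercises the layout-conversion pass implicated here. The kernel below and its shapes are assumptions for illustration, not the original reproducer.)

```python
import torch
import triton
import triton.language as tl

# Hypothetical reproducer (not the reporter's original code): the result
# of tl.dot carries an MMA layout, and transposing it forces Triton to
# insert a layout conversion -- the step that reportedly fails on Hopper.
@triton.jit
def dot_then_trans_kernel(q_ptr, k_ptr, out_ptr, BLOCK: tl.constexpr):
    offs = tl.arange(0, BLOCK)
    idx = offs[:, None] * BLOCK + offs[None, :]
    q = tl.load(q_ptr + idx)
    k = tl.load(k_ptr + idx)
    s = tl.dot(q, k)            # fp16 x fp16 -> fp32 accumulator, MMA layout
    s = tl.trans(s)             # layout conversion happens here
    tl.store(out_ptr + idx, s)

BLOCK = 64
q = torch.randn(BLOCK, BLOCK, device="cuda", dtype=torch.float16)
k = torch.randn(BLOCK, BLOCK, device="cuda", dtype=torch.float16)
out = torch.empty(BLOCK, BLOCK, device="cuda", dtype=torch.float32)
dot_then_trans_kernel[(1,)](q, k, out, BLOCK=BLOCK)
```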
The H100 system is a LambdaLabs Ubuntu instance, with these software versions installed using conda:
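(The version list itself is missing from this scrape. For anyone trying to reproduce the issue, a snippet along these lines captures the relevant toolchain information; which fields to report is my suggestion, not part of the original post.)

```python
import sys
import torch
import triton

# Report the versions relevant to a Triton codegen bug report.
print("python :", sys.version.split()[0])
print("torch  :", torch.__version__)
print("cuda   :", torch.version.cuda)
print("triton :", triton.__version__)
print("gpu    :", torch.cuda.get_device_name(0))
```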
Thanks for the help!
Comments

Apparently this is a known issue with the Hopper architecture; see #2627.

In case anyone else has a similar problem, I was able to successfully work around the issue by removing all …