Can flashattention run on Jetson AGX Orin with compute capability of 8.7? #449
NVIDIA's docs say the only difference between compute capabilities 8.7 and 8.6 is the size of shared memory: the unified data cache and shared memory have a total size of 192 KB for devices of compute capability 8.7 and 128 KB for devices of compute capability 8.6.
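The size comparison from the quoted docs can be sketched as simple arithmetic (the two totals are taken from the quote above; the fit check is an illustrative simplification, since the usable per-block shared-memory carveout is smaller than the per-SM total):

```python
# Unified data cache + shared memory per SM, in KB, from the NVIDIA docs
# quoted above. (Illustrative: the usable per-block limit is smaller.)
UNIFIED_TOTAL_KB = {"8.6": 128, "8.7": 192}

def fits(cc: str, required_kb: int) -> bool:
    # Coarse check: does a kernel needing `required_kb` of shared memory
    # fit under this compute capability's per-SM total?
    return required_kb <= UNIFIED_TOTAL_KB[cc]

# Any shared-memory budget that fits on 8.6 also fits on 8.7.
assert all(fits("8.7", kb) for kb in range(UNIFIED_TOTAL_KB["8.6"] + 1))
print(fits("8.7", 160), fits("8.6", 160))  # → True False
```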
From the docs it seems like the code should just run, since 8.7 has more shared memory than 8.6. I don't know what the issue is, and I don't have the hardware to test or debug. You can also try running with the nvcr PyTorch 23.07 container so we're sure it's not the environment that's the issue.
Thanks for your reply! The nvcr PyTorch 23.07 container doesn't seem to support NVIDIA SoCs. I found that NVIDIA provides a PyTorch container specifically for SoCs, but flash-attention doesn't work there either; the error output is the same as in issue #451. Maybe I will look into it in the future, so I'm closing this issue for now. Thanks again for your reply!
You can try compiling in that container with
I'm running into a similar problem using this container from NVIDIA on a Jetson AGX Orin 64GB (compute capability 8.7) to install FlashAttention v2.1.1. Tried CUDA 11.4.
Got it working after following this issue.
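For reference, a source build targeting the Orin's sm_87 usually looks something like the sketch below. These environment variables are the standard PyTorch-extension build knobs, not steps confirmed in this thread; some flash-attn versions hard-code their `-gencode` list in `setup.py`, in which case sm_87 must be added there by hand:

```shell
# Sketch of a Jetson (sm_87) source build — assumptions, not confirmed
# steps. Check setup.py if the arch list is hard-coded in your version.
export TORCH_CUDA_ARCH_LIST="8.7"   # target Orin's compute capability
export MAX_JOBS=4                   # limit parallel nvcc jobs (RAM-bound)
pip install flash-attn --no-build-isolation
```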
It seems the Jetson AGX Orin's compute capability is not supported by FlashAttention. Its compute capability is 8.7 and its GPU architecture is Ampere.
Can I just modify something to make it work? Thanks a lot in advance!
When I just test `flash_attn_func` from FlashAttention like this
The compilation result is:
The environment is:
CUDA: 11.4
PyTorch: 2.0.0
FlashAttention: 2.0.6
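For anyone reproducing this, a minimal smoke test of `flash_attn_func` looks roughly like the sketch below. The shapes and sizes are illustrative assumptions, not the reporter's exact code; flash-attn requires fp16/bf16 tensors on a CUDA device:

```python
import torch

def smoke_test():
    # flash-attn only runs on a CUDA device with fp16/bf16 inputs,
    # so bail out gracefully on machines without a GPU.
    if not torch.cuda.is_available():
        return None
    from flash_attn import flash_attn_func
    # q, k, v are (batch, seqlen, nheads, headdim); sizes are illustrative.
    q = torch.randn(1, 128, 8, 64, dtype=torch.float16, device="cuda")
    k = torch.randn_like(q)
    v = torch.randn_like(q)
    out = flash_attn_func(q, k, v, causal=True)
    return tuple(out.shape)  # output has the same shape as q

print(smoke_test())
```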