
Xformers binaries do not include arch code for H100s (SM90) #784

Closed
Skylion007 opened this issue Jul 10, 2023 · 2 comments

Comments

@Skylion007 (Contributor)

🐛 Bug


To Reproduce

Steps to reproduce the behavior:

  1. Run xformers on an H100. H100 support is listed in the release notes, albeit as experimental.
  2. xformers complains that it is not built for SM90. The issue appears to be in the packaging scripts, whose arch list omits 9.0:
    export TORCH_CUDA_ARCH_LIST="3.7;5.0;5.2;6.0;6.1+PTX;7.0;7.5+PTX;8.0;8.6+PTX"
  3. The current workaround is to install xformers from source, which is slow and painful. It would be great if the binaries could be updated to include the H100 arch in TORCH_CUDA_ARCH_LIST.
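To illustrate why the binary fails on H100s, here is a small sketch that checks whether a `TORCH_CUDA_ARCH_LIST`-style string contains native code for a given compute capability. The helper `covers_capability` is hypothetical (not part of xformers or PyTorch); note that `+PTX` entries also embed PTX, so newer cards can still JIT from them, just without SM90-tuned kernels.

```python
# Hypothetical helper: does an arch list string (the TORCH_CUDA_ARCH_LIST
# format used by the packaging script quoted above) include native SASS
# for a given compute capability?

def covers_capability(arch_list: str, major: int, minor: int) -> bool:
    """Return True if `arch_list` builds native code for (major, minor)."""
    wanted = f"{major}.{minor}"
    entries = [e.strip() for e in arch_list.split(";") if e.strip()]
    # "8.6+PTX" builds SASS for 8.6 plus PTX; strip the suffix for the check.
    return any(e.removesuffix("+PTX") == wanted for e in entries)

# The arch list from the packaging script quoted in this report:
RELEASE_ARCHS = "3.7;5.0;5.2;6.0;6.1+PTX;7.0;7.5+PTX;8.0;8.6+PTX"

print(covers_capability(RELEASE_ARCHS, 8, 0))  # True  -- A100 (SM80) is built
print(covers_capability(RELEASE_ARCHS, 9, 0))  # False -- H100 (SM90) is missing
```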

Expected behavior

Binaries include SM90 arch as well.

@danthe3rd (Contributor)

Hi,
Yes, we hope to support that in the next release. Two things are blocking us so far:

  • We will need CUDA 12 for best performance on SM90. This has implications for the build process, since CUDA 12 binaries may not be compatible with CUDA 11.x PyTorch.
  • We are limited in the space we can use on PyPI, and adding an architecture requires more:
    Project Limit Request: xformers - 30 GB pypi/support#2907

@bottler closed this as completed on Jun 25, 2024
@Skylion007 (Contributor, Author)

> we will need to support cuda 12 for best performance on Sm90 - this has implications on the build process, because cuda 12 binaries won't be compatible with cuda 11.x pytorch maybe..?

SM90 works fine on CUDA 11.8.
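Until the binaries ship with SM90, the workaround mentioned above is a source build. A minimal sketch of that recipe, assuming CUDA 11.8+ and a matching PyTorch are installed (the git URL follows the xformers README; the exact arch list to trim is a judgment call):

```shell
# Build xformers from source with SM90 (H100) kernels included.
# "9.0" is the addition; dropping the older archs keeps the build time down.
export TORCH_CUDA_ARCH_LIST="8.0;8.6;9.0"
pip install -v -U git+https://github.com/facebookresearch/xformers.git@main#egg=xformers
```

This is the slow path the report complains about: compiling the CUDA kernels locally can take a long time, which is why prebuilt SM90 wheels would be preferable.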

bertmaher pushed a commit to bertmaher/xformers that referenced this issue Dec 20, 2024
* Re-enable block-sparse tests on CI

* Bump tolerance for newer triton

* Fix test

This test had been broken for a while, but since it wasn't active the breakage wasn't caught.