@celsowm celsowm commented Jun 24, 2025

This change adds the compiler flags and CMake configuration needed to support the NVIDIA Blackwell SM120 architecture.

  • Modified deep_gemm/jit/compiler.py to include sm_120 and compute_120 flags for NVCC and NVRTC.
  • Updated CMakeLists.txt to add the new architecture flags for the build process.
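To illustrate what these flags look like, here is a minimal sketch of deriving NVCC and NVRTC architecture options from a compute capability. The helper name `arch_flags` and its return layout are hypothetical and are not taken from `deep_gemm/jit/compiler.py`; only the flag spellings (`-gencode=arch=compute_XX,code=sm_XX` for NVCC, `--gpu-architecture=sm_XX` for NVRTC) are standard CUDA options.

```python
def arch_flags(cc_major: int, cc_minor: int) -> dict:
    """Build illustrative NVCC / NVRTC architecture flags for one compute capability.

    Hypothetical helper for illustration only; not DeepGEMM's actual API.
    """
    arch = f"{cc_major}{cc_minor}"  # e.g. (12, 0) -> "120"
    return {
        # NVCC: virtual architecture (compute_XX) plus real architecture (sm_XX)
        "nvcc": [f"-gencode=arch=compute_{arch},code=sm_{arch}"],
        # NVRTC: target architecture for runtime compilation
        "nvrtc": [f"--gpu-architecture=sm_{arch}"],
    }


if __name__ == "__main__":
    flags = arch_flags(12, 0)  # SM120
    print(flags["nvcc"])
    print(flags["nvrtc"])
```

Whether the installed CUDA toolkit accepts these options for SM120 depends on its version, so a real build would likely need to check the toolkit before adding them.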

Further testing on Blackwell hardware is required to validate MMA instruction compatibility and overall performance.

@LyricZhao
Collaborator

Thanks! But these compilation flags are already included in #112. Even if you have SM120 GEMM implementations, we would merge them only after #112.

@LyricZhao LyricZhao closed this Jul 2, 2025