PR #33794: [GPU] Support int4 in cuDNN GEMM fusions.

Imported from GitHub PR #33794

📝 Summary of Changes
Support int4 in cuDNN GEMM fusions.

🎯 Justification
Accelerates some int4 GEMM fusions (under the flag xla_gpu_cudnn_gemm_fusion_level).
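
For readers who want to try the flag named above, here is a minimal sketch of one way to set it through the `XLA_FLAGS` environment variable before XLA is initialized. The helper `add_xla_flag` is hypothetical and not part of XLA, and the level value `1` is a placeholder chosen for illustration; the PR does not say which levels enable the int4 path.

```python
import os


def add_xla_flag(existing: str, name: str, value: int) -> str:
    """Return an XLA_FLAGS string with --name=value set.

    Hypothetical helper: drops any earlier setting of the same flag and
    appends the new one, leaving unrelated flags untouched.
    """
    kept = [f for f in existing.split() if not f.startswith(f"--{name}=")]
    kept.append(f"--{name}={value}")
    return " ".join(kept)


# The level value 1 is an assumption for illustration only.
# Set this before importing any library that initializes XLA.
os.environ["XLA_FLAGS"] = add_xla_flag(
    os.environ.get("XLA_FLAGS", ""),
    "xla_gpu_cudnn_gemm_fusion_level",
    1,
)
```

Building the string with a helper rather than overwriting `XLA_FLAGS` keeps any flags that are already set in the environment.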

🚀 Kind of Contribution
⚡️ Performance Improvement

📊 Benchmark (for Performance Improvements)

> Please measure and include speedups for one of the public HLOs in
`compiler/xla/tools/benchmarks/hlo/`.

Not applicable: the public benchmark HLOs do not use int4.

🧪 Unit Tests:
yes

🧪 Execution Tests:
yes

Copybara import of the project:

--
e1b8dc7 by Ilia Sergachev <isergachev@nvidia.com>:

[GPU] Support int4 in cuDNN GEMM fusions.

Merging this change closes #33794

FUTURE_COPYBARA_INTEGRATE_REVIEW=#33794 from openxla:cudnn_gemm_int4 e1b8dc7

COPYBARA_INTEGRATE_REVIEW=#33794 from openxla:cudnn_gemm_int4 e1b8dc7
PiperOrigin-RevId: 831264661
copybara-service bot merged commit 09464f6 into main on Nov 12, 2025.
copybara-service bot deleted the test_830894321 branch on November 12, 2025, 09:08.