Commit b231d21
PR #33794: [GPU] Support int4 in cuDNN GEMM fusions.
Imported from GitHub PR #33794
📝 Summary of Changes
Support int4 in cuDNN GEMM fusions.
🎯 Justification
Accelerates some int4 GEMM fusions (under the flag xla_gpu_cudnn_gemm_fusion_level).
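As a minimal sketch of opting in to the flag named above, it can be passed through the `XLA_FLAGS` environment variable before XLA initializes. The helper name `with_cudnn_gemm_fusion_level` is hypothetical, and the level value `1` is an assumption: this commit message does not state which level admits int4 fusions.

```python
import os

# Flag name taken from the commit message; the value used below is an assumption.
FLAG = "--xla_gpu_cudnn_gemm_fusion_level"


def with_cudnn_gemm_fusion_level(level: int, existing: str = "") -> str:
    """Return an XLA_FLAGS string with the cuDNN GEMM fusion level set.

    Hypothetical helper for illustration; not part of XLA.
    """
    # Drop any earlier setting of the same flag so only one value remains.
    kept = [f for f in existing.split() if not f.startswith(FLAG + "=")]
    kept.append(f"{FLAG}={level}")
    return " ".join(kept)


# Set this before XLA is initialized (for example, before importing jax).
os.environ["XLA_FLAGS"] = with_cudnn_gemm_fusion_level(
    1, os.environ.get("XLA_FLAGS", "")
)
```

Other flags already present in `XLA_FLAGS` are preserved; only a previous setting of this one flag is replaced.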
🚀 Kind of Contribution
⚡️ Performance Improvement
📊 Benchmark (for Performance Improvements)
> Please measure and include speedups for one of the public HLOs in `compiler/xla/tools/benchmarks/hlo/`.
Not applicable: the public benchmark HLOs do not use int4.
🧪 Unit Tests:
yes
🧪 Execution Tests:
yes
Copybara import of the project:
--
e1b8dc7 by Ilia Sergachev <isergachev@nvidia.com>:
[GPU] Support int4 in cuDNN GEMM fusions.
Merging this change closes #33794
FUTURE_COPYBARA_INTEGRATE_REVIEW=#33794 from openxla:cudnn_gemm_int4 e1b8dc7
PiperOrigin-RevId: 8308943211
parent 9da6117, commit b231d21
File tree
4 files changed, +36 -7 lines changed
- xla
- backends/gpu/codegen
- hlo/translate/hlo_to_mhlo
- service/gpu/transforms