Commit 8554cb0
[Enhancement] Add a MXFP4 grouped GEMM example for FusedMoE (#811)
* [Enhancement] Enhance dequantization examples and utilities
- Added a new example for grouped matrix multiplication with experts in `example_dequant_groupgemm_bf16_mxfp4_hopper.py`.
- Improved dequantization logic in existing examples by replacing nested loops with vectorized operations for better performance.
- Updated `torch_convert_bit_twiddling` function in `utils.py` to utilize parallel processing, enhancing efficiency and clarity in the conversion process.
Co-authored-by: Zhengju Tang <97930865+tzj-fxz@users.noreply.github.com>
* fix typos in docstrings
* remove redundant code
* [Format] Unreproducible debug with T.print
* [BugFix] Correct dtype in ref dequantize; larger data distribution
* [Format]
* [Refactor] Clean up and optimize example_dequant_groupgemm_bf16_mxfp4_hopper.py and utils.py
- Removed unnecessary cache disabling and manual seed setting in the example.
- Simplified nested loops into parallelized operations for better readability and performance.
- Updated the assertion function in utils.py to print detailed error messages.
- Adjusted tensor sizes in examples
* [Refactor] Update import path in example_dequant_gemm_fine_grained.py
- Changed the import statement for `_tir_packed_to_unsigned_convert` from `bitblas.quantization` to `tilelang.quantize` to reflect the new module structure.
* lint
* rename and add test
* lint
* [Feature] Enhance autotuning and configuration generation in example_dequant_groupedgemm_bf16_mxfp4_hopper.py
- Added a new function `get_configs()` to generate hyperparameter configurations for tuning.
- Updated the `matmul` function to utilize autotuning with the new configurations.
- Improve kernel performance via vectorization and threadblock swizzle.
- Enhanced the main function to support the new autotuning inputs and updated parameters for better performance.
* lint
* fix typo
* fix typo and lint
* make ci format check happy
* fix ci
---------
Co-authored-by: Zhengju Tang <97930865+tzj-fxz@users.noreply.github.com>
Co-authored-by: tzj-fxz <tzjfxz@gmail.com>1 parent e4a346f commit 8554cb0
File tree
7 files changed
+603
-47
lines changed- examples/dequantize_gemm
- tilelang
- language
- quantize
7 files changed
+603
-47
lines changedLines changed: 4 additions & 12 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
389 | 389 | | |
390 | 390 | | |
391 | 391 | | |
392 | | - | |
393 | | - | |
394 | | - | |
| 392 | + | |
395 | 393 | | |
396 | 394 | | |
397 | 395 | | |
| |||
414 | 412 | | |
415 | 413 | | |
416 | 414 | | |
417 | | - | |
418 | | - | |
419 | | - | |
| 415 | + | |
420 | 416 | | |
421 | 417 | | |
422 | 418 | | |
| |||
440 | 436 | | |
441 | 437 | | |
442 | 438 | | |
443 | | - | |
444 | | - | |
445 | | - | |
| 439 | + | |
446 | 440 | | |
447 | 441 | | |
448 | 442 | | |
| |||
470 | 464 | | |
471 | 465 | | |
472 | 466 | | |
473 | | - | |
474 | | - | |
475 | | - | |
| 467 | + | |
476 | 468 | | |
477 | 469 | | |
478 | 470 | | |
| |||
Lines changed: 1 addition & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
23 | 23 | | |
24 | 24 | | |
25 | 25 | | |
26 | | - | |
| 26 | + | |
27 | 27 | | |
28 | 28 | | |
29 | 29 | | |
| |||
Lines changed: 511 additions & 0 deletions
Large diffs are not rendered by default.
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
4 | 4 | | |
5 | 5 | | |
6 | 6 | | |
| 7 | + | |
7 | 8 | | |
8 | 9 | | |
9 | 10 | | |
| |||
31 | 32 | | |
32 | 33 | | |
33 | 34 | | |
| 35 | + | |
| 36 | + | |
| 37 | + | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
34 | 42 | | |
35 | 43 | | |
36 | 44 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
3 | 3 | | |
4 | 4 | | |
5 | 5 | | |
6 | | - | |
7 | | - | |
8 | 6 | | |
9 | 7 | | |
10 | 8 | | |
| |||
16 | 14 | | |
17 | 15 | | |
18 | 16 | | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
19 | 20 | | |
20 | | - | |
21 | | - | |
22 | | - | |
23 | | - | |
24 | | - | |
25 | | - | |
26 | | - | |
27 | | - | |
28 | | - | |
29 | | - | |
30 | | - | |
31 | | - | |
32 | | - | |
33 | | - | |
34 | | - | |
35 | | - | |
36 | | - | |
37 | | - | |
38 | | - | |
39 | | - | |
40 | | - | |
41 | | - | |
42 | | - | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
43 | 25 | | |
44 | | - | |
45 | | - | |
46 | | - | |
47 | | - | |
48 | | - | |
49 | | - | |
50 | | - | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
| 31 | + | |
| 32 | + | |
| 33 | + | |
| 34 | + | |
| 35 | + | |
| 36 | + | |
| 37 | + | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
51 | 57 | | |
52 | 58 | | |
53 | 59 | | |
| |||
106 | 112 | | |
107 | 113 | | |
108 | 114 | | |
| 115 | + | |
| 116 | + | |
| 117 | + | |
| 118 | + | |
| 119 | + | |
| 120 | + | |
| 121 | + | |
| 122 | + | |
| 123 | + | |
| 124 | + | |
| 125 | + | |
| 126 | + | |
| 127 | + | |
| 128 | + | |
| 129 | + | |
| 130 | + | |
| 131 | + | |
| 132 | + | |
| 133 | + | |
| 134 | + | |
| 135 | + | |
| 136 | + | |
| 137 | + | |
| 138 | + | |
| 139 | + | |
| 140 | + | |
| 141 | + | |
| 142 | + | |
| 143 | + | |
| 144 | + | |
| 145 | + | |
| 146 | + | |
| 147 | + | |
| 148 | + | |
| 149 | + | |
| 150 | + | |
| 151 | + | |
| 152 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
331 | 331 | | |
332 | 332 | | |
333 | 333 | | |
334 | | - | |
| 334 | + | |
335 | 335 | | |
336 | 336 | | |
337 | 337 | | |
338 | 338 | | |
339 | 339 | | |
340 | | - | |
| 340 | + | |
341 | 341 | | |
342 | 342 | | |
343 | 343 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
5 | 5 | | |
6 | 6 | | |
7 | 7 | | |
| 8 | + | |
8 | 9 | | |
9 | 10 | | |
10 | 11 | | |
| |||
0 commit comments