[CUTLASS] Add blockwise scale gemm/bmm kernels #17789

MasterJH5574 · 2025-03-29T15:27:49Z

This PR introduces blockwise scale matmul and batch matmul CUTLASS kernels, adapted from SGLang (http://github.com/sgl-project/sglang), vLLM (https://github.com/vllm-project/vllm) and https://github.com/soundOfDestiny/cutlass. We add unit tests for gemm and bmm. This PR also restores some CUTLASS gemm tests that were removed during the Relay phase-out.
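For readers unfamiliar with the term, here is a rough NumPy sketch of what a blockwise scale gemm computes. It is not the kernel code from this PR. The function name `blockwise_scale_gemm_ref`, the scale layouts (one scale per row per K-block for the activations, one scale per tile for the weights), and the default 128x128 block size are assumptions made for illustration only.

```python
import numpy as np


def blockwise_scale_gemm_ref(a, a_scale, b, b_scale, block=(128, 128)):
    """Hypothetical reference: out = dequant(a) @ dequant(b).T

    Assumed layouts (not taken from the PR):
      a:       (M, K) quantized activations
      a_scale: (M, ceil(K / bk)), one scale per row per K-block
      b:       (N, K) quantized weights
      b_scale: (ceil(N / bn), ceil(K / bk)), one scale per (bn x bk) tile
    """
    bn, bk = block
    m, k = a.shape
    n, _ = b.shape
    # Expand every scale so it covers its whole block, then trim to shape.
    a_full = np.repeat(a_scale, bk, axis=1)[:, :k]
    b_full = np.repeat(np.repeat(b_scale, bn, axis=0), bk, axis=1)[:n, :k]
    # Dequantize in float32 and run an ordinary matmul.
    a_deq = a.astype(np.float32) * a_full
    b_deq = b.astype(np.float32) * b_full
    return a_deq @ b_deq.T


# Tiny usage example with 2x2 blocks so the shapes are easy to follow.
a = np.ones((2, 4), dtype=np.float32)
a_scale = np.array([[1.0, 2.0], [3.0, 4.0]], dtype=np.float32)
b = np.ones((2, 4), dtype=np.float32)
b_scale = np.array([[0.5, 0.25]], dtype=np.float32)
out = blockwise_scale_gemm_ref(a, a_scale, b, b_scale, block=(2, 2))
```

Under the same assumptions, a batch matmul would apply this computation independently to each batch entry. A fused kernel would typically apply the scales while accumulating partial products rather than materializing the dequantized tensors as this sketch does.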

MasterJH5574 force-pushed the tvm-dev/2025-03-29-cutlass-blockwise-gemm-bmm branch from fa7eeb8 to b25a365 Compare March 29, 2025 17:43

MasterJH5574 force-pushed the tvm-dev/2025-03-29-cutlass-blockwise-gemm-bmm branch from b25a365 to 5219852 Compare March 29, 2025 18:10

yongwww approved these changes Mar 31, 2025

yongwww merged commit b0ccfb3 into apache:main Mar 31, 2025
15 checks passed

ysh329 mentioned this pull request Apr 19, 2025

[Release] v0.20.0 Release Candidate Notes #17860

Closed
