[BYOC] Add GEMM kernel from FasterTransformer as submodule #15046

Merged: 2 commits, Jun 10, 2023

masahi (Member) commented Jun 6, 2023

I extracted the GEMM kernel with an fp16 A operand and an int8/int4 B operand from FasterTransformer (see NVIDIA/cutlass#911) to make it easier to build and integrate into TVM. The extracted and cleaned-up code lives in a repo under tlc-pack, and this PR adds that repo as a submodule.

A follow-up PR will update the CUTLASS BYOC to support offloading to this kernel. It is going to be useful for weight-quantized LLM inference.
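
For readers unfamiliar with the "fp A, int B" naming, here is a minimal reference sketch of what a weight-only quantized GEMM computes. It is not the FasterTransformer/CUTLASS kernel and says nothing about how that kernel is organized on the GPU. The function names, the symmetric per-column int8 scheme, and the use of Python floats in place of fp16 are all assumptions made for illustration.

```python
# Reference sketch only: activations A stay in floating point, weights B are
# stored as small integers plus one floating-point scale per output column.
# All names here are hypothetical and not taken from the submodule.

def quantize_weights_int8(weights):
    """Symmetric per-column int8 quantization of a K x N float matrix."""
    k, n = len(weights), len(weights[0])
    scales = []
    for j in range(n):
        max_abs = max(abs(weights[i][j]) for i in range(k))
        # Avoid a zero scale for an all-zero column.
        scales.append(max_abs / 127.0 if max_abs > 0 else 1.0)
    quantized = [
        [max(-127, min(127, round(weights[i][j] / scales[j]))) for j in range(n)]
        for i in range(k)
    ]
    return quantized, scales


def gemm_fp_a_int_b(a, quantized, scales):
    """Compute A @ dequantize(B) for an M x K matrix A and K x N integer B."""
    m, k, n = len(a), len(quantized), len(quantized[0])
    out = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            acc = sum(a[i][p] * quantized[p][j] for p in range(k))
            # One scale per column, so it can be applied after accumulation.
            out[i][j] = acc * scales[j]
    return out
```

With `a = [[1.0, 2.0]]` and weights `[[0.5, -1.0], [0.25, 2.0]]`, the result is close to the unquantized product `[[1.0, 3.0]]`, with a small rounding error from the int8 storage.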

Please review the license stuff etc @tqchen @junrushao @vinx13 @sunggg

tvm-bot (Collaborator) commented Jun 6, 2023

Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

Generated by tvm-bot

masahi (Member, Author) commented Jun 7, 2023

Hmm, the CUTLASS revision that https://github.com/tlc-pack/cutlass_fpA_intB_gemm pulls in as its own submodule is apparently not fetched by the CI. Does anyone know how to tell the CI to run git submodule update --init --recursive?
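
One way to confirm this kind of symptom is to check which declared submodule directories are missing or empty in a checkout. The sketch below is a hypothetical diagnostic helper, not part of TVM or its CI. It assumes that each repository lists its submodules in a `.gitmodules` file with `path = ...` entries, and that a submodule which was never initialized shows up as an absent or empty directory. It only reports the problem; the fix is still to run the recursive submodule update in the CI checkout step.

```python
import os
import re


def declared_submodule_paths(repo_dir):
    """Return the `path = ...` entries from repo_dir/.gitmodules (empty if none)."""
    gitmodules = os.path.join(repo_dir, ".gitmodules")
    if not os.path.isfile(gitmodules):
        return []
    with open(gitmodules, encoding="utf-8") as f:
        return re.findall(r"^\s*path\s*=\s*(.+?)\s*$", f.read(), flags=re.MULTILINE)


def missing_submodules(repo_dir):
    """Recursively list declared submodule directories that are absent or empty."""
    missing = []
    for rel_path in declared_submodule_paths(repo_dir):
        full_path = os.path.join(repo_dir, rel_path)
        if not os.path.isdir(full_path) or not os.listdir(full_path):
            missing.append(full_path)
        else:
            # A populated submodule may declare further submodules of its own.
            missing.extend(missing_submodules(full_path))
    return missing
```

Calling `missing_submodules(".")` from the top of a checkout returns an empty list when every nested submodule has been fetched. A non-recursive update would leave the nested CUTLASS directory in the returned list.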

yzh119 (Member) left a comment

LGTM

@yzh119 yzh119 merged commit d8e5812 into apache:main Jun 10, 2023
junrushao pushed a commit to junrushao/tvm that referenced this pull request Jun 22, 2023