[DataType] BF16 Support #17670
Conversation
Hi @joshua-j-hong, thank you so much for the great contribution! Could you please add a PR description so that the CI tests can be triggered?
@tvm-bot rerun
Failed to re-run CI in https://github.com/apache/tvm/actions/runs/13505622336
@tvm-bot rerun
Force-pushed from e2ee869 to 5491f0c
MasterJH5574 left a comment:
LGTM, thanks for rehabilitating BF16!
@joshua-j-hong Just want to check on these two points. Have we confirmed that they are in good shape now? We are good to merge once these are confirmed.
I've added the changes for quantization in MLC LLM (will put up the corresponding PR soon) and am currently testing with the original model, but am running into some local build issues. Hoping to resolve these tomorrow to close out this ticket!
Allows BF16 as a model data type in group quantization and adds quantization settings for BF16. The corresponding PR in TVM is apache/tvm#17670.
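As a rough illustration only, here is a hypothetical shape such a BF16 preset could take, modeled on the existing `q4f16_1` group-quantization entry in `mlc_llm.quantization`. The `GroupQuantize` field names and values below are assumptions for the sketch, not the contents of the actual PR:

```python
from mlc_llm.quantization import GroupQuantize

# Hypothetical bf16 group-quantization preset, mirroring the fp16 presets but
# keeping model weights/activations in bfloat16. Field names and defaults are
# assumed from the existing q4f16 entries and may not match the real PR.
q4bf16_1 = GroupQuantize(
    name="q4bf16_1",
    kind="group-quant",
    group_size=32,            # 4-bit weights quantized in groups of 32
    quantize_dtype="int4",    # packed into uint32 storage words
    storage_dtype="uint32",
    model_dtype="bfloat16",   # the new part: bfloat16 instead of float16
    linear_weight_layout="NK",
)
```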
Adds general BF16 support to TVM:

- Addresses missing legalization in `comm_reducer`
- Adds BF16 legalization pass skipping when the target supports BF16 natively (a minimal sketch of this path follows below)
- Unit tests for the `comm_reducer` changes as well as for legalization skipping
- Modifications to TVM datatypes to allow `T.bfloat16` in the test file
- Fixes for BFloat-related CUDA codegen

The related PR in MLC-LLM adds BF16 support with quantization: mlc-ai/mlc-llm#3158

Tested with the original problematic model, Gemma 2 27B, with both added quantization configurations `q4bf16_0` and `q4bf16_1`. While compilation succeeds and the first few rounds of prompting perform as expected, generation quality degrades for long contexts. The same behavior is not observed on Gemma 2 9B, quantized or unquantized.

Co-authored-by: Joshua Hong <jjhong@andrew.cmu.edu>
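A minimal sketch of the path this PR exercises, assuming a recent TVM main build: a bf16 reduction block (which lowers through `comm_reducer`) written in TVMScript and run through BF16 compute legalization. The pass name `tvm.tir.transform.BF16ComputeLegalize` reflects my reading of the current TIR transform API, and `T.bfloat16(...)` relies on the datatype change described above.

```python
import tvm
from tvm.script import tir as T


@T.prim_func
def row_sum(A: T.Buffer((16, 16), "bfloat16"), B: T.Buffer((16,), "bfloat16")):
    # Reduction blocks like this lower to a comm_reducer, which previously
    # lacked bf16 legalization.
    for i, k in T.grid(16, 16):
        with T.block("row_sum"):
            vi, vk = T.axis.remap("SR", [i, k])
            with T.init():
                B[vi] = T.bfloat16(0)
            B[vi] = B[vi] + A[vi, vk]


mod = tvm.IRModule({"main": row_sum})
# On targets without native bf16 support, compute is promoted to fp32 while
# storage stays bf16; on targets that do support bf16, the pass can be skipped.
legalized = tvm.tir.transform.BF16ComputeLegalize()(mod)
legalized.show()
```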