[Quantization] Pool model support bitsandbytes #18087
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add 🚀
Is there any existing model we can use to test this?
We can use
Can you add this test to the CI?
I didn't add it because of concerns about CI pressure. If you think related tests should be added, I'll add them ASAP.
The quantization model test is conditional, so it should be fine to add it.
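To illustrate what "conditional" means here, the following is a minimal sketch of a test that is skipped when bitsandbytes is unavailable, so it adds no load on machines without the package. It uses the standard-library `unittest` rather than the project's actual test setup, and the names `bitsandbytes_available` and `PoolingModelBnbTest` are hypothetical, chosen only for this illustration.

```python
import importlib.util
import unittest


def bitsandbytes_available() -> bool:
    """Return True if the bitsandbytes package can be found (hypothetical helper)."""
    return importlib.util.find_spec("bitsandbytes") is not None


class PoolingModelBnbTest(unittest.TestCase):
    # The test is reported as skipped, not failed, when the
    # quantization backend is missing from the environment.
    @unittest.skipUnless(bitsandbytes_available(), "bitsandbytes is not installed")
    def test_quantized_pooling_model(self):
        # Placeholder body: a real test would load a quantized pooling model
        # and compare its outputs against a reference implementation.
        import bitsandbytes

        self.assertIsNotNone(bitsandbytes)
```

A pytest-based suite would typically express the same condition with a skip marker; the guard logic is the same either way.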
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
- hf_model_kwargs = {"load_in_4bit": True}
+ hf_model_kwargs = dict(quantization_config=BitsAndBytesConfig(
+     load_in_4bit=True))
This change avoids the warning below:
The `load_in_4bit` and `load_in_8bit` arguments are deprecated and will be removed in the future versions. Please, pass a `BitsAndBytesConfig` object in `quantization_config` argument instead
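As a rough sketch of the migration this warning asks for, the helper below moves the deprecated top-level flags into a single `quantization_config` object. The function name `migrate_quant_kwargs` and its `config_cls` parameter are hypothetical and not part of this PR; `config_cls` defaults to `transformers.BitsAndBytesConfig` and is injectable only so the logic can be exercised without transformers installed.

```python
from typing import Any, Callable, Optional

# Flags that the warning above reports as deprecated when passed directly.
DEPRECATED_FLAGS = ("load_in_4bit", "load_in_8bit")


def migrate_quant_kwargs(
    hf_model_kwargs: dict[str, Any],
    config_cls: Optional[Callable[..., Any]] = None,
) -> dict[str, Any]:
    """Hypothetical helper: wrap deprecated flags in a quantization_config."""
    flags = {k: hf_model_kwargs[k] for k in DEPRECATED_FLAGS if k in hf_model_kwargs}
    if not flags:
        # Nothing to migrate; return a copy so the input is never mutated.
        return dict(hf_model_kwargs)
    if config_cls is None:
        # Imported lazily so the helper itself does not require transformers.
        from transformers import BitsAndBytesConfig

        config_cls = BitsAndBytesConfig
    migrated = {k: v for k, v in hf_model_kwargs.items() if k not in DEPRECATED_FLAGS}
    # Note: this overwrites any quantization_config already present.
    migrated["quantization_config"] = config_cls(**flags)
    return migrated
```

Calling `migrate_quant_kwargs({"load_in_4bit": True})` would yield the same shape of kwargs as the new line in the diff, assuming transformers and bitsandbytes are installed.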
I have added the test @DarkLight1337
Thanks
It looks like we need to force merge.
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Signed-off-by: Yuqi Zhang <yuqizhang@google.com>