[SupportsQuant] Bert, Blip, Blip2, Bloom #15573
Btw, to avoid regressions, do we have an easy way to test whether the model can load quantized weights successfully? Perhaps using dummy weights? cc @mgoin
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com> Signed-off-by: xinyuxiao <xinyuxiao2024@gmail.com>
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com> Signed-off-by: Louis Ulmer <ulmerlouis@gmail.com>
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com> Signed-off-by: Mu Huai <tianbowen.tbw@antgroup.com>
Purpose
- Add the `SupportsQuant` mixin to models in order to uniformly support quantization across all models
- Ensure `packed_modules_mapping` is correctly updated across nested models
- Ensure ignored modules are correctly updated according to `hf_to_vllm_mapper` and across nested models

Related Issues
Changes
- Add `SupportsQuant` and the `packed_modules_mapping` attribute to bert, blip, blip2, and bloom
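
To illustrate the idea, here is a minimal, self-contained sketch of how a mixin could propagate a class-level `packed_modules_mapping` into a shared quantization config, including across a nested model. It does not import vLLM, and it is not the actual `SupportsQuant` implementation: `QuantConfigSketch`, `SupportsQuantSketch`, `TextEncoderSketch`, `VisionLanguageSketch`, and the mapping entries are all hypothetical names chosen for this example.

```python
from typing import ClassVar, Dict, List, Optional


class QuantConfigSketch:
    """Hypothetical stand-in for a quantization config object."""

    def __init__(self) -> None:
        # Maps a fused (packed) module name to the checkpoint modules it packs.
        self.packed_modules_mapping: Dict[str, List[str]] = {}


class SupportsQuantSketch:
    """Hypothetical mixin (assumption, not vLLM's code): copies the
    class-level mapping into the quantization config at construction."""

    packed_modules_mapping: ClassVar[Dict[str, List[str]]] = {}

    def __init__(self, quant_config: Optional[QuantConfigSketch] = None) -> None:
        self.quant_config = quant_config
        if quant_config is not None:
            # Merge rather than overwrite, so nested models accumulate entries.
            quant_config.packed_modules_mapping.update(self.packed_modules_mapping)


class TextEncoderSketch(SupportsQuantSketch):
    # Illustrative entry: separate attention projections fused into one module.
    packed_modules_mapping = {"qkv_proj": ["query", "key", "value"]}


class VisionLanguageSketch(SupportsQuantSketch):
    # Illustrative entry for the outer model.
    packed_modules_mapping = {"gate_up_proj": ["gate_proj", "up_proj"]}

    def __init__(self, quant_config: Optional[QuantConfigSketch] = None) -> None:
        super().__init__(quant_config)
        # The nested model receives the same config, so its mapping is merged too.
        self.text_encoder = TextEncoderSketch(quant_config)


config = QuantConfigSketch()
model = VisionLanguageSketch(config)
merged = config.packed_modules_mapping
```

After construction, `merged` holds the entries of both the outer and the nested model. Merging with `update` instead of assigning is the design choice that keeps an inner model from discarding what an outer model already registered.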