[Usage]: Can I use vllm.LLM(quantization="bitsandbytes"...) now that bitsandbytes is supported in v0.5.0? #5480
Comments
But actually I didn't see it supported in v0.5.0.
It seems that vLLM's bitsandbytes support currently only covers Llama models.
ping @chenqianfzh
Does it support Llama 3?
I am not sure. I am trying this with Mixtral, and it does not seem to work.
Currently, Mixtral does not support bitsandbytes, but Llama 3 should work.
When I try to load …
When loading Llama3-8B-Instruct I got garbage output: #5569
When you use arguments for the engine like the below:
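(The original snippet was not captured; here is a minimal sketch of what such engine arguments looked like around v0.5.0, assuming a Llama-family model. The model name is a placeholder, and the pairing of `quantization="bitsandbytes"` with `load_format="bitsandbytes"` reflects how this feature was invoked at the time, not a verified recipe.)

```python
from vllm import LLM, SamplingParams

# Minimal sketch, not the commenter's exact snippet: around v0.5.0,
# in-flight bitsandbytes quantization required setting both the
# quantization and load_format arguments. The model is a placeholder.
llm = LLM(
    model="huggyllama/llama-7b",
    quantization="bitsandbytes",
    load_format="bitsandbytes",
)

params = SamplingParams(temperature=0.8, max_tokens=32)
outputs = llm.generate(["The capital of France is"], params)
print(outputs[0].outputs[0].text)
```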
This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!
This issue has been automatically closed due to inactivity. Please feel free to reopen if you feel it is still relevant. Thank you!
Your current environment
How would you like to use vllm
I want to run inference of a Mixtral QLoRA model with bitsandbytes. I don't know how to integrate it with vLLM.
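A hedged sketch of how a QLoRA adapter might be combined with bitsandbytes loading in the v0.5.x API: the `qlora_adapter_name_or_path` argument is an assumption about that version's engine arguments, both the model and adapter names are placeholders, and per the comments above only Llama-family models were supported at the time, so a Mixtral checkpoint would likely be rejected.

```python
from vllm import LLM

# Hedged sketch, not a confirmed recipe: load a bitsandbytes-quantized
# base model and attach a QLoRA adapter. qlora_adapter_name_or_path is
# assumed to be the relevant v0.5.x argument; model and adapter names
# are placeholders. Per the thread above, Mixtral was not supported,
# so a Llama base model is used here instead.
llm = LLM(
    model="huggyllama/llama-7b",
    quantization="bitsandbytes",
    load_format="bitsandbytes",
    qlora_adapter_name_or_path="timdettmers/qlora-flan-7b",
)
```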