
unsloth with vllm in 8/4 bits #253

Closed
quancore opened this issue Mar 16, 2024 · 21 comments
Labels: currently fixing, URGENT BUG

Comments

@quancore

I have trained a QLoRA model with Unsloth and I want to serve it with vLLM, but I have not found a way to serve the model in 8/4 bits.

@danielhanchen
Contributor

@quancore I'm not sure if vLLM allows serving in 4 or 8 bits!
16-bit yes, but unsure on 4 or 8.

@quancore
Author

@danielhanchen I think it is: vllm-project/vllm#1155

@patleeman

> @danielhanchen I think it is: vllm-project/vllm#1155

Looks like they only support AWQ quantization, not bitsandbytes.

@danielhanchen
Contributor

@patleeman Oh yes, AWQ is great - I'm assuming you want to quantize it to AWQ?

@quancore
Author

@patleeman @danielhanchen Well yes, maybe we should support AWQ so we can use QLoRA models with vLLM?

@marcelodiaz558

Hello there. I am also interested in using an 8/4-bit model trained with Unsloth with vLLM. Currently, it works fine in 16-bit but requires too much VRAM. Is there a way to quantize a model trained with Unsloth using AWQ or GPTQ?
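
For what it's worth, one route that should work today is to merge the LoRA into the base model in 16-bit, then quantize with the autoawq package; vLLM can load the result with quantization="awq". A minimal sketch, assuming a merged checkpoint in merged_model (paths and config values are placeholders):

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "merged_model"      # fine-tuned model merged to 16-bit (placeholder)
quant_path = "merged_model-awq"  # output directory (placeholder)
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Run AWQ calibration and quantize the weights to 4-bit
model.quantize(tokenizer, quant_config=quant_config)

# Save in a layout vLLM can load with quantization="awq"
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```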

Karry11 mentioned this issue May 15, 2024
@danielhanchen
Contributor

Whoops, this missed me - yep, having an option to convert it to AWQ is interesting.

@Louis2B2G

> Whoops, this missed me - yep, having an option to convert it to AWQ is interesting.

That would be amazing - is this a feature you are planning on adding in the near future?

@danielhanchen
Contributor

danielhanchen commented Jun 6, 2024

Yep for a future release!

@amir-in-a-cynch

I'm down to volunteer to work on this, if you're accepting community contributions. (I have to do this for my day job anyway, so it might be nice to contribute to the library.)

@Serega6678

@amir-in-a-cynch do you plan to do it?

@amir-in-a-cynch

> @amir-in-a-cynch do you plan to do it?

I'll take a stab at it tomorrow and Wednesday. Not sure if it'll end up being a clean integration into this library's API (since it adds a dependency), but in the worst case we should be able to put together an example notebook for the docs on how to do it.

@Serega6678

@amir-in-a-cynch Great, keep me posted.
I don't mind giving you a helping hand if you get stuck at some point.

@danielhanchen
Contributor

I think vLLM exporting to 8 bits is through AWQ - you can also enable float8 support (if your GPU supports it).
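
As a rough sketch of the float8 path (assuming a float8-capable GPU such as Hopper/Ada; the model path is a placeholder):

```python
from vllm import LLM

# Dynamic fp8 quantization at load time; requires float8-capable hardware
llm = LLM(model="merged_model", quantization="fp8")
print(llm.generate(["Hello!"])[0].outputs[0].text)
```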

@BBiering

BBiering commented Sep 9, 2024

@amir-in-a-cynch @danielhanchen Is there any update on this feature? Would be great to be able to use Unsloth quantized models with vLLM.

@danielhanchen
Contributor

Actually, I think vLLM added 4-bit quants - I need to check it out - I'll make a script for this!

danielhanchen added the currently fixing and URGENT BUG labels Sep 10, 2024
@frei-x

frei-x commented Sep 25, 2024

unsloth AttributeError: Model Qwen2ForCausalLM does not support BitsAndBytes quantization yet.

@danielhanchen
Contributor

@frei-x Oh, it should work now hopefully! Please update Unsloth! Sorry for the delay as well!

pip uninstall unsloth -y
pip install --upgrade --no-cache-dir "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
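
If it helps, a hedged sketch of exporting the fine-tuned model in a form vLLM can serve (the save_method values follow Unsloth's saving API; paths are placeholders):

```python
from unsloth import FastLanguageModel

# Load the trained QLoRA checkpoint (placeholder path)
model, tokenizer = FastLanguageModel.from_pretrained("lora_model")

# Merge the adapter into the base weights and save in 16-bit for vLLM,
# or in 4-bit bitsandbytes for vLLM's bitsandbytes loader
model.save_pretrained_merged("merged_model", tokenizer, save_method="merged_16bit")
model.save_pretrained_merged("merged_model_4bit", tokenizer, save_method="merged_4bit_forced")
```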

@nandagopal1992

@danielhanchen does this mean the latest version has support for vLLM with 4/8 bits?

Btw amazing work here :)

@danielhanchen
Contributor

@nandagopal1992 I'm pretty certain vLLM can load 4-bit bitsandbytes modules now.
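
A minimal sketch of loading such a checkpoint (the model path is a placeholder; the quantization and load_format arguments follow vLLM's bitsandbytes support):

```python
from vllm import LLM, SamplingParams

# Load a 4-bit bitsandbytes checkpoint directly
llm = LLM(
    model="merged_model_4bit",
    quantization="bitsandbytes",
    load_format="bitsandbytes",
)

outputs = llm.generate(["Hello!"], SamplingParams(max_tokens=32))
print(outputs[0].outputs[0].text)
```

The equivalent server flags should be --quantization bitsandbytes --load-format bitsandbytes.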

@shimmyshimmer
Collaborator

Now supported! :) Let us know if you still have any issues
