
[Feature]: LoRA support for Pixtral #8802

Open
1 task done
Tracked by #4194
spring-anth opened this issue Sep 25, 2024 · 12 comments

Comments

@spring-anth

🚀 The feature, motivation and pitch

I have finetuned the linear layers of Pixtral on my own dataset and would like to host the LoRA adapters, as is possible for Mistral. It would be great if this were supported in the future.

Related issue: #8685, as the base model I used for finetuning is the HF version.
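For reference, here is roughly the usage I have in mind, sketched against vLLM's existing LoRA API. The adapter path, adapter name, and rank are placeholders, and whether PixtralForConditionalGeneration accepts enable_lora is exactly what this issue asks for:

```python
# Hypothetical sketch: hosting a Pixtral LoRA adapter the same way it already
# works for Mistral. All paths and names below are placeholders.
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

llm = LLM(
    model="mistralai/Pixtral-12B-2409",
    tokenizer_mode="mistral",
    enable_lora=True,      # the capability requested in this issue
    max_lora_rank=16,      # assumed rank of the finetuned adapter
)

outputs = llm.generate(
    ["Describe the style of the training data."],
    SamplingParams(max_tokens=64),
    lora_request=LoRARequest("pixtral-lora", 1, "/path/to/lora_adapter"),
)
print(outputs[0].outputs[0].text)
```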

Alternatives

No response

Additional context

No response

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
@DarkLight1337
Member

LoRA support for VLMs in general is still WIP. cc @jeejeelee

@jeejeelee
Contributor

> LoRA support for VLMs in general is still WIP. cc @jeejeelee

Thanks for the ping, I should be able to complete the temporary solution for LoRA support in VL models this week.

@jeejeelee
Contributor

@spring-anth I have completed the integration of Pixtral support for LoRA, see: https://github.com/jeejeelee/vllm/tree/pixtral-support-lora. Could you please verify this locally? I don't have enough resources to train the LoRA model myself.

@spring-anth
Author

@jeejeelee Thank you! I checked out your branch and set it as my current vLLM implementation via

```
git clone https://github.com/jeejeelee/vllm.git
cd vllm
python python_only_dev.py
```

Unfortunately I get this ValueError:

```
[rank0]:     self.model = get_model(model_config=self.model_config,
[rank0]:                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/docker/.local/lib/python3.11/site-packages/vllm/model_executor/model_loader/__init__.py", line 19, in get_model
[rank0]:     return loader.load_model(model_config=model_config,
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/docker/.local/lib/python3.11/site-packages/vllm/model_executor/model_loader/loader.py", line 399, in load_model
[rank0]:     model = _initialize_model(model_config, self.load_config,
[rank0]:             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/docker/.local/lib/python3.11/site-packages/vllm/model_executor/model_loader/loader.py", line 176, in _initialize_model
[rank0]:     return build_model(
[rank0]:            ^^^^^^^^^^^^
[rank0]:   File "/home/docker/.local/lib/python3.11/site-packages/vllm/model_executor/model_loader/loader.py", line 157, in build_model
[rank0]:     extra_kwargs = _get_model_initialization_kwargs(model_class, lora_config,
[rank0]:                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/docker/.local/lib/python3.11/site-packages/vllm/model_executor/model_loader/loader.py", line 134, in _get_model_initialization_kwargs
[rank0]:     raise ValueError(
[rank0]: ValueError: Model PixtralForConditionalGeneration does not support LoRA, but LoRA is enabled. Support for this model may be added in the future. If this is important to you, please open an issue on github.
```

@jeejeelee
Contributor

@spring-anth Hi, which branch are you using? Is it pixtral-support-lora?

@spring-anth
Author

@jeejeelee You were right, I was on the wrong branch, silly mistake. Unfortunately I currently can't test whether your change works, as I trained Pixtral with the transformers-compatible version. Therefore I can only use the LoRA weights for Pixtral once the transformers version of Pixtral is supported in vLLM (which is work in progress). My current workaround is merging the weights and transforming the model back to the vLLM-compatible version.
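For anyone hitting the same limitation, this is roughly what the merge step of that workaround looks like with PEFT. The adapter and output paths are placeholders, the base model class is an assumption based on the HF-format checkpoint, and converting the merged model back to the Mistral-format layout is a separate step not shown here:

```python
# Minimal sketch of the workaround: fold the PEFT LoRA adapter into the
# HF-format Pixtral base model, then save the merged weights.
import torch
from peft import PeftModel
from transformers import LlavaForConditionalGeneration

base = LlavaForConditionalGeneration.from_pretrained(
    "mistral-community/pixtral-12b", torch_dtype=torch.bfloat16
)
model = PeftModel.from_pretrained(base, "/path/to/lora_adapter")   # placeholder path
merged = model.merge_and_unload()   # applies the LoRA deltas to the base weights
merged.save_pretrained("/path/to/pixtral-merged")                  # placeholder path
```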

@jeejeelee
Contributor

@spring-anth Currently, vLLM only supports LoRA adapters trained with PEFT.

@spring-anth
Author

@jeejeelee Yes, that's not what I meant. I did train with PEFT, but the training is based on the HF Transformers version of Pixtral (https://huggingface.co/mistral-community/pixtral-12b), which uses a different structure than the vLLM-supported version (https://huggingface.co/mistralai/Pixtral-12B-2409).

@tensimixt

tensimixt commented Oct 22, 2024

@jeejeelee does this work with #5036? If so, I can test this week whether inference with mistralai/Pixtral-12b-2409 with a LoRA adapter works. Or should I use another PR to test vLLM inference of this?

Or I can just use this branch: https://github.com/jeejeelee/vllm/tree/pixtral-support-lora

Will this work with python -m vllm.entrypoints.openai.api_server where the model is set to Pixtral and the LoRA modules to the Pixtral LoRA adapter?

Thank you!

@jeejeelee
Contributor

> @jeejeelee does this work with #5036? If so, I can test this week whether inference with mistralai/Pixtral-12b-2409 with a LoRA adapter works. Or should I use another PR to test vLLM inference of this?
>
> Or I can just use this branch: https://github.com/jeejeelee/vllm/tree/pixtral-support-lora
>
> Will this work with python -m vllm.entrypoints.openai.api_server where the model is set to Pixtral and the LoRA modules to the Pixtral LoRA adapter?
>
> Thank you!

I think it should work, see: https://docs.vllm.ai/en/latest/models/lora.html#serving-lora-adapters
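Following that doc, the flow would roughly be to register the adapter when launching the server and then address it by name from an OpenAI-compatible client. A hedged sketch (the adapter name and paths are placeholders, and Pixtral itself still needs the pixtral-support-lora branch above):

```python
# Hypothetical client-side sketch for a LoRA adapter served by vLLM's
# OpenAI-compatible server, assuming it was started roughly like:
#   python -m vllm.entrypoints.openai.api_server \
#       --model mistralai/Pixtral-12B-2409 --tokenizer-mode mistral \
#       --enable-lora --lora-modules pixtral-lora=/path/to/lora_adapter
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="pixtral-lora",  # the name the adapter was registered under via --lora-modules
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is shown in this image?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/example.png"}},
        ],
    }],
    max_tokens=64,
)
print(response.choices[0].message.content)
```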

@tensimixt

tensimixt commented Oct 24, 2024

@jeejeelee When doing git checkout pixtral-support-lora and then pip install -e ., does it build correctly for you, or does it crash or take a very long time to build? When doing git checkout pr-5036, building vLLM takes only 10-15 minutes, but the new vLLM build is taking very long: it has been 45 minutes and it is still building. Is there a way to make the build faster? Thank you!

@jeejeelee
Contributor

It also takes me a long time, unless I compile on high-performance CPU servers.
