[Misc]: LoRA request with Multi GPU does not provide correct responses with num_scheduler_steps config #12487
Closed
See: #11161
jeejeelee added a commit that referenced this issue on Feb 1, 2025
Isotr0py pushed a commit to Isotr0py/vllm that referenced this issue on Feb 2, 2025:
…llm-project#11161) FIX issue vllm-project#9688 vllm-project#11086 vllm-project#12487 --------- Signed-off-by: Jee Jee Li <pandaleefree@gmail.com> Co-authored-by: weilong.yu <weilong.yu@shopee.com> Co-authored-by: Jee Jee Li <pandaleefree@gmail.com> Signed-off-by: Isotr0py <2037008807@qq.com>
youngkent pushed a commit to youngkent/vllm that referenced this issue on Feb 3, 2025
srikanthsrnvs pushed a commit to srikanthsrnvs/vllm that referenced this issue on Feb 3, 2025
sahelib25 pushed a commit to krai/vllm that referenced this issue on Feb 3, 2025
fxmarty-amd pushed a commit to fxmarty-amd/vllm that referenced this issue on Feb 7, 2025
NickLucche pushed a commit to NickLucche/vllm that referenced this issue on Feb 7, 2025
Anything you want to discuss about vllm.
Hello All,
We are encountering a strange issue with our LoRA adapter when running in a multi-GPU setup.
Context:
Base model: Mistral Nemo 12B (https://huggingface.co/nvidia/Mistral-NeMo-12B-Instruct)
Adapter Rank: 8
Vllm Model.json
Multi-lora.json
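Since the contents of our model.json and multi-lora.json are not reproduced here, the following is a minimal Python sketch of an equivalent offline setup. The adapter name, adapter path, and tensor_parallel_size value are assumptions for illustration, not values taken from our configuration files.

```python
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# Minimal sketch of the multi-GPU LoRA setup described above.
# Adapter name/path and tensor_parallel_size are assumptions.
llm = LLM(
    model="nvidia/Mistral-NeMo-12B-Instruct",
    enable_lora=True,
    max_lora_rank=8,         # matches the adapter rank above
    tensor_parallel_size=2,  # assumed multi-GPU value
)

lora = LoRARequest("my_adapter", 1, "/path/to/adapter")  # hypothetical adapter
outputs = llm.generate(
    ["Hello"],
    SamplingParams(max_tokens=64),
    lora_request=lora,
)
print(outputs[0].outputs[0].text)
```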
The adapter returns correct responses in the multi-GPU setup as long as num_scheduler_steps is not set. However, as soon as we add the num_scheduler_steps configuration to model.json, the adapter no longer returns correct responses, even though everything else remains the same. The change is roughly the one sketched below.
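A sketch of the only change, assuming the same illustrative setup as above; the value 8 is an assumption, not the value from our model.json.

```python
from vllm import LLM

# Same illustrative setup as above; the only difference is multi-step scheduling.
llm = LLM(
    model="nvidia/Mistral-NeMo-12B-Instruct",
    enable_lora=True,
    max_lora_rank=8,
    tensor_parallel_size=2,   # assumed multi-GPU value
    num_scheduler_steps=8,    # assumed value; with this set, LoRA responses become incorrect
)
```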
We are looking at the response from the LoRA-targeted request here, not the response from the base model.
Has anyone faced a similar issue? Are there any settings or configurations needed to enable multi-GPU LoRA requests?
Thanks,
Rohit