
support load qwen2-72b-instruct lora #5498

Closed

Conversation

@NiuBlibing (Contributor) commented Jun 13, 2024

Like #4007, this adds support for loading Qwen2-72B-Instruct's LoRA adapter with tensor-parallel sizes 1, 2, 4, and 8.

Ref #3793

@NiuBlibing changed the title from "Add 3696 bgmv-kernel to support qwen2-72b-instruct lora" to "Add 3696 bgmv-kernel to support qwen2-72b-instruct lora with tp 8" on Jun 13, 2024
@NiuBlibing changed the title from "Add 3696 bgmv-kernel to support qwen2-72b-instruct lora with tp 8" to "Add 3696 bgmv-kernel to support qwen2-72b-instruct lora" on Jun 13, 2024
@NiuBlibing changed the title from "Add 3696 bgmv-kernel to support qwen2-72b-instruct lora" to "support load qwen2-72b-instruct lora" on Jun 13, 2024
@NiuBlibing marked this pull request as draft on Jun 13, 2024 10:33
@NiuBlibing closed this on Jun 13, 2024
@NiuBlibing reopened this on Jun 13, 2024
@NiuBlibing closed this on Jun 13, 2024
@NiuBlibing reopened this on Jun 14, 2024
@NiuBlibing closed this on Jun 14, 2024
@NiuBlibing reopened this on Jun 14, 2024
@NiuBlibing closed this on Jun 14, 2024
@NiuBlibing (Contributor, Author) commented:

Currently, the Punica kernel cannot support Qwen2-72B-Instruct because 3696 is not divisible by 64. Hopefully #5036 or #5356 will work.
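For reference, a minimal sketch of the size check being described (the 29568 intermediate size is assumed from Qwen2-72B-Instruct's config; the divisor 64 is the alignment this comment says the Punica BGMV kernels require):

# Hypothetical illustration, not vLLM code: check which tensor-parallel
# shards of the assumed intermediate_size satisfy the 64-alignment
# constraint mentioned above.
intermediate_size = 29568  # assumed value for Qwen2-72B-Instruct
for tp_size in (1, 2, 4, 8):
    shard = intermediate_size // tp_size
    print(f"tp={tp_size}: shard={shard}, divisible by 64: {shard % 64 == 0}")
# tp=8 yields 3696, and 3696 % 64 == 48, which is why the kernel rejects it.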

@jeejeelee (Collaborator) commented Jun 14, 2024

Could you provide your running script?

I can test Qwen2-72B-Instruct+LoRA on my local device using #5036.

@NiuBlibing (Contributor, Author) commented:

> Could you provide your running script?
>
> I can test Qwen2-72B-Instruct+LoRA on my local device using #5356.

I just start it with the vLLM CLI:

python -m vllm.entrypoints.openai.api_server --served-model-name Qwen2-72B-Chat-test --model ./Qwen/Qwen2-72B-Instruct/ --gpu-memory-utilization 0.9 --tensor-parallel-size 8 --enable-lora --lora-dtype bfloat16 --lora-modules test=/path/to/lora/
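For reference, a minimal sketch of querying the served LoRA adapter by the name registered via --lora-modules (here "test"), assuming the openai Python client and the server's default port 8000; the prompt is illustrative:

# Sketch: send a completion request to the OpenAI-compatible server started
# above, selecting the LoRA adapter by its registered name "test".
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
resp = client.completions.create(
    model="test",                 # the name given in --lora-modules
    prompt="Hello, who are you?",  # illustrative prompt
    max_tokens=32,
)
print(resp.choices[0].text)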

@jeejeelee (Collaborator) commented:

> > Could you provide your running script?
> >
> > I can test Qwen2-72B-Instruct+LoRA on my local device using #5356.
>
> I just start it with the vLLM CLI:
>
> python -m vllm.entrypoints.openai.api_server --served-model-name Qwen2-72B-Chat-test --model ./Qwen/Qwen2-72B-Instruct/ --gpu-memory-utilization 0.9 --tensor-parallel-size 8 --enable-lora --lora-dtype bfloat16 --lora-modules test=/path/to/lora/

Sorry, actually #5036 was used for the testing.

I have completed the test; #5036 can resolve this issue.

However, there are still some other issues with #5036 that need to be resolved. I will address them ASAP.
