
[Bug]: When using lora and setting num-scheduler-steps simultaneously, the output does not meet expectations. #11086

Closed
1 task done
luoling1993 opened this issue Dec 11, 2024 · 2 comments
Labels
bug Something isn't working

Comments


luoling1993 commented Dec 11, 2024

Your current environment

The output of `python collect_env.py` was not provided.

Model Input Dumps

No response

🐛 Describe the bug

vLLM version: 0.6.4.post1
I have trained a LoRA adapter based on Qwen2.5-7B-Instruct, and I start the vLLM OpenAI-compatible server via pm2 with the following configuration:

apps:
  - name: "vllm"
    script: "/home/lucas/envs/nlp-vllm/bin/python"
    args:
      - "-m"
      - "vllm.entrypoints.openai.api_server"
      - "--port=18101"  # 端口设置

      # # Meta-Llama-3.1-8B-Instruct
      # - "--served-model-name=Meta-Llama-3.1-8B-Instruct"
      # - "--model=/data/llms/Meta-Llama-3.1-8B-Instruct"
      # - "--tokenizer=/data/llms/Meta-Llama-3.1-8B-Instruct"

      # qwen
      - "--served-model-name=Qwen2.5-7B-Instruct"
      - "--model=/data/llms/Qwen2.5-7B-Instruct"
      - "--tokenizer=/data/llms/Qwen2.5-7B-Instruct"

      # - "--max-model-len=8192"  # 最大模型长度
      - "--max-model-len=4096"
      - "--gpu-memory-utilization=0.9"  # GPU内存利用率

      # speedup
      # - "--enable-chunked-prefill"  # NOTE: LoRA is not supported with chunked prefill yet
      - "--enable-prefix-caching"
      # - "--num-scheduler-steps=8" # NOTE: LoRA, will always use base model(BUG)

      - "--enable-lora"
      - "--max-lora-rank=64"
      - "--lora-modules"
      # - '{"name": "nl2filter", "path": "/home/lucas/workspace/github_project/LLaMA-Factory/saves/Meta-Llama-3.1-8B-Instruct/lora/nl2filter-all", "base_model_name": "Meta-Llama-3.1-8B-Instruct"}'
      - '{"name": "nl2filter", "path": "/home/lucas/workspace/github_project/LLaMA-Factory/saves/Qwen2.5-7B-Instruct/lora/nl2filter-all", "base_model_name": "Qwen2.5-7B-Instruct"}'
   
    env:
      CUDA_VISIBLE_DEVICES: "0"

    log_date_format: "YYYY-MM-DD HH:mm:ss"
    error_file: "/home/lucas/workspace/pm2_logs/error.log"
    out_file: "/home/lucas/workspace/pm2_logs/out.log"

When calling the API I set model_name=nl2filter. Everything works as expected when --num-scheduler-steps is not set. With --num-scheduler-steps=8, however, the service still starts normally and requests return, but the responses come from the base model (plain Qwen2.5-7B-Instruct, without the LoRA adapter applied) rather than from the LoRA model. There are no errors or warnings.
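
For reference, the request is made through the OpenAI-compatible endpoint roughly like this (a minimal sketch assuming the `openai` Python client and the port from the config above; the prompt and `api_key` value are placeholders):

```python
# Minimal sketch of the call against the pm2-managed server above.
# The api_key is a dummy value, since the vLLM server does not check it by default.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:18101/v1", api_key="EMPTY")

# Request the LoRA adapter by its served name. With --num-scheduler-steps=8,
# the response comes back as if the base Qwen2.5-7B-Instruct had been used.
response = client.chat.completions.create(
    model="nl2filter",
    messages=[{"role": "user", "content": "convert this query to a filter ..."}],
    temperature=0.0,
)
print(response.choices[0].message.content)
```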

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
luoling1993 added the bug (Something isn't working) label Dec 11, 2024
Collaborator

jeejeelee commented Dec 11, 2024

` --enable-chunked-prefill" # NOTE: LoRA is not supported with chunked prefill yet ·

Chunked prefill with LoRA was just supported; it should be available in the next release, see #9057.

Collaborator

jeejeelee commented Dec 11, 2024

# - "--num-scheduler-steps=8" # NOTE: LoRA, will always use base model(BUG)

I remember there was a bug here and a PR for it. Found it, see #9689.
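
For anyone trying to reproduce this outside the API server, a minimal offline sketch of the same combination might look like the following; it assumes vLLM 0.6.4's `LLM`/`LoRARequest` API, and the model/adapter paths are the ones from the config above:

```python
# Hypothetical offline reproduction of LoRA + multi-step scheduling.
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

llm = LLM(
    model="/data/llms/Qwen2.5-7B-Instruct",
    enable_lora=True,
    max_lora_rank=64,
    num_scheduler_steps=8,  # the reported trigger for falling back to the base model
)

outputs = llm.generate(
    ["convert this query to a filter ..."],
    SamplingParams(temperature=0.0, max_tokens=256),
    lora_request=LoRARequest(
        "nl2filter",
        1,
        "/home/lucas/workspace/github_project/LLaMA-Factory/saves/Qwen2.5-7B-Instruct/lora/nl2filter-all",
    ),
)
print(outputs[0].outputs[0].text)
```

If the bug behaves the same way offline, removing `num_scheduler_steps=8` should make the adapter's output appear, matching what was reported for the server.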

jeejeelee added a commit that referenced this issue Feb 1, 2025
…11161)

FIX issue #9688
#11086 #12487

---------

Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: weilong.yu <weilong.yu@shopee.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>