
[Bug]: Chunked prefill can't be used together with --num-scheduler-steps #8274

Open · ndao600 opened this issue on Sep 8, 2024 · 2 comments
Labels
bug Something isn't working

Comments

ndao600 commented on Sep 8, 2024

Your current environment

vllm 0.6.0

🐛 Describe the bug

When I run:

vllm serve neuralmagic/Meta-Llama-3.1-70B-Instruct-FP8 --host 0.0.0.0 --port 8000 --tensor-parallel-size 8 --seed 1234 --enable_prefix_caching --enable-chunked-prefill --max-model-len 32000 --num-scheduler-steps 8

I get:

raise ValueError("Chunked prefill is not supported with "
ValueError: Chunked prefill is not supported with multi-step (--num-scheduler-steps > 1)
ERROR 09-08 13:11:00 api_server.py:186] RPCServer process died before responding to readiness probe

Is this expected behavior?
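
Since the error message says the two options are mutually exclusive in this version, a likely workaround (not an official fix, and assuming nothing else in the setup needs to change) is to drop one of the two conflicting flags. Keeping multi-step scheduling:

vllm serve neuralmagic/Meta-Llama-3.1-70B-Instruct-FP8 --host 0.0.0.0 --port 8000 --tensor-parallel-size 8 --seed 1234 --enable_prefix_caching --max-model-len 32000 --num-scheduler-steps 8

Or keeping chunked prefill:

vllm serve neuralmagic/Meta-Llama-3.1-70B-Instruct-FP8 --host 0.0.0.0 --port 8000 --tensor-parallel-size 8 --seed 1234 --enable_prefix_caching --enable-chunked-prefill --max-model-len 32000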

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
ndao600 added the bug label on Sep 8, 2024
robertgshaw2-neuralmagic (Collaborator) commented:

Chunked prefill + multi-step is a work in progress. Follow along here: #8001

tjtanaa (Contributor) commented on Oct 8, 2024

@ndao600 Is this issue closed by the following?
Core feature PR: #8645
Bug fix PR: #9038
