[Core] feat: Implement Priority Scheduling in V1 Engine #18700
Conversation
This pull request has merge conflicts that must be resolved before it can be merged.
The commit history is in a mess, can you clean it up? Maybe open another PR?
Re-submitted as #19057
This commit introduces priority scheduling capabilities to the V1 LLM engine.

Key changes include:

- `EngineCoreRequest` and `Request` updates: added a `priority` field to the `EngineCoreRequest` and `Request` classes to carry priority information.
- `Processor` update: `Processor.process_inputs` now accepts the `priority` and passes it through to `EngineCoreRequest`.
- V1 `Scheduler` modifications: added a `--scheduling-policy` argument. With `policy="priority"`, `self.waiting` is managed as a min-heap, ordering requests by their assigned priority value (lower value means higher priority) and then by arrival time (FCFS). The default remains `policy="fcfs"`.
- Documentation: updated `docs/usage/v1_guide.md` and `docs/serving/openai_compatible_server.md` to reflect the V1 engine's support for priority scheduling.
- Unit tests: added coverage in `tests/v1/core/test_scheduler.py`.

This allows you to influence the order of request processing in the V1 engine by assigning priorities, which is particularly useful when requests vary in importance.

FIX #14002
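The ordering described above (min-heap keyed by priority value, then arrival time) can be sketched in a few lines. This is an illustrative standalone example, not the actual vLLM scheduler code; the `Request` and `WaitingQueue` class names here are hypothetical.

```python
import heapq
import time

class Request:
    """Minimal stand-in for a scheduler request (illustrative only)."""
    def __init__(self, request_id: str, priority: int = 0):
        self.request_id = request_id
        self.priority = priority                  # lower value = higher priority
        self.arrival_time = time.monotonic()      # used for FCFS tie-breaking

class WaitingQueue:
    """Min-heap ordered by (priority, arrival_time): requests with a lower
    priority value are scheduled first; equal priorities fall back to FCFS."""
    def __init__(self):
        self._heap = []

    def push(self, req: Request) -> None:
        # request_id in the tuple gives a deterministic final tie-breaker
        heapq.heappush(self._heap,
                       (req.priority, req.arrival_time, req.request_id, req))

    def pop(self) -> Request:
        return heapq.heappop(self._heap)[-1]

# A later-arriving high-priority request (priority=0) jumps ahead of an
# earlier low-priority one (priority=10).
q = WaitingQueue()
q.push(Request("low", priority=10))
q.push(Request("high", priority=0))
assert q.pop().request_id == "high"
assert q.pop().request_id == "low"
```

With `policy="fcfs"` the queue would instead order purely by arrival time, which is what makes the heap key's leading `priority` component the only behavioral difference between the two policies.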