Merged
47 commits
3c7d198
V1 support of priority shedualing
amitm02 Jun 3, 2025
bae0c39
Merge remote-tracking branch 'upstream/main' into v1-priority-schedular
amitm02 Jun 3, 2025
4b2e513
style(docs): split long line and wrap paragraph in note admonition
amitm02 Jun 3, 2025
419de95
style(docs): fix pymarkdown error
amitm02 Jun 3, 2025
de76b7e
Merge remote-tracking branch 'upstream/main' into v1-priority-schedular
amitm02 Jun 8, 2025
13d9f95
Merge remote-tracking branch 'upstream/main' into v1-priority-schedular
amitm02 Jun 8, 2025
0a11d2d
Minor, remove comma
amitm02 Jun 8, 2025
f423ca9
fix priority tester
amitm02 Jun 8, 2025
15d8202
create a subclass of deque for waiting queue that has optional priori…
amitm02 Jun 8, 2025
7bcab39
formatting
amitm02 Jun 8, 2025
03b806c
mypy errors fix
amitm02 Jun 8, 2025
462c9fc
fix maximum recursion depth exceeded
amitm02 Jun 9, 2025
ccdee08
Merge remote-tracking branch 'upstream/main' into v1-priority-schedular
amitm02 Jun 9, 2025
c016f49
refactor FCFSWaitingQueue to fix both __iter__ and maximum recursion …
amitm02 Jun 9, 2025
7c68b17
Merge remote-tracking branch 'upstream/main' into v1-priority-schedular
amitm02 Jun 11, 2025
9ab18b4
Merge remote-tracking branch 'upstream/main' into v1-priority-schedular
amitm02 Jun 11, 2025
a22f063
Merge remote-tracking branch 'upstream/main' into v1-priority-schedular
amitm02 Jun 15, 2025
bc277f5
Merge remote-tracking branch 'upstream/main' into v1-priority-schedular
amitm02 Jun 16, 2025
065432a
fix bug of using min instead of max for lowest-priority request
amitm02 Jun 16, 2025
7876e41
remove unnecessary check for self.running
amitm02 Jun 16, 2025
0252d41
fix extend_left_requests and use peek_request for checking conditions…
amitm02 Jun 16, 2025
7fc11f0
Update vllm/v1/core/sched/scheduler.py
amitm02 Jun 16, 2025
7523db3
rename WaitingQueue to RequestQueue
amitm02 Jun 16, 2025
2a2153f
Merge branch 'v1-priority-schedular' of https://github.com/amitm02/vl…
amitm02 Jun 16, 2025
ac25e10
use request.priority, request.arrival_time for heap
amitm02 Jun 16, 2025
2482db9
fix lint error
amitm02 Jun 16, 2025
627e11c
add more description to PriorityRequestQueue class
amitm02 Jun 16, 2025
8ee7353
Merge remote-tracking branch 'upstream/main' into v1-priority-schedular
amitm02 Jun 17, 2025
c63201b
Merge remote-tracking branch 'upstream/main' into v1-priority-schedular
amitm02 Jun 19, 2025
62620a0
rename extend_left_requests => prepend_requests; push_request => prep…
amitm02 Jun 19, 2025
6037313
fix proirity queue unti test by adding pooling_params=None
amitm02 Jun 19, 2025
05cc6b8
add __reversed__ to PriorityRequestQueue and remove list conversion
amitm02 Jun 19, 2025
86fe5bb
minor: remove trailing comma
amitm02 Jun 19, 2025
67da5fa
add remove_requests() to request queue interface and use it in finish…
amitm02 Jun 19, 2025
815da9f
Merge remote-tracking branch 'upstream/main' into v1-priority-schedular
amitm02 Jun 19, 2025
1499053
use in prepend_request(s) to support requests with priorities of neg…
amitm02 Jun 19, 2025
c84af1e
minor change in comment
amitm02 Jun 19, 2025
e1f7e78
restore client_index=request.client_index
amitm02 Jun 19, 2025
fc3719c
Merge remote-tracking branch 'upstream/main' into v1-priority-schedular
amitm02 Jun 19, 2025
9aa2aca
do not alter prioirty in PriorityRequestQueue prepend_request(s) func…
amitm02 Jun 19, 2025
2784e8f
Merge remote-tracking branch 'upstream/main' into v1-priority-schedular
amitm02 Jun 20, 2025
f6fe731
minor style changes
amitm02 Jun 20, 2025
dd63a03
minor style changes
amitm02 Jun 20, 2025
9b10818
use enum for policy parameter in create_request_queue
amitm02 Jun 20, 2025
38cdbd5
use enum for policy parameter in create_request_queue pt. 2
amitm02 Jun 20, 2025
0a6d6b2
Merge remote-tracking branch 'upstream/main' into v1-priority-schedular
amitm02 Jun 20, 2025
a2601b5
Merge remote-tracking branch 'upstream/main' into v1-priority-schedular
amitm02 Jun 22, 2025
12 changes: 12 additions & 0 deletions docs/usage/v1_guide.md
@@ -45,6 +45,18 @@ For each item, our progress towards V1 support falls into one of the following s
- **🟠 Delayed**: Temporarily dropped in V1 but planned to be re-introduced later.
- **🔴 Deprecated**: Not planned for V1 unless there is strong demand.

!!! note
    vLLM V1’s unified scheduler treats both prompt and output tokens the same
    way by using a simple dictionary (e.g., `{request_id: num_tokens}`) to dynamically
    allocate a fixed token budget per request, enabling features like chunked prefills,
    prefix caching, and speculative decoding without a strict separation between prefill
    and decode phases.
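
The `{request_id: num_tokens}` idea in the note can be illustrated with a minimal sketch. This is not vLLM's scheduler code: the function name `schedule_step`, the `pending` mapping, and the assumption that the budget is a single total shared by all requests in one step are hypothetical and chosen only for illustration.

```python
def schedule_step(pending: dict[str, int], token_budget: int) -> dict[str, int]:
    """Return {request_id: num_tokens} for one step (illustrative only).

    `pending` maps each request to the number of tokens it still needs,
    whether those are prompt tokens (prefill) or a single output token
    (decode). Both kinds are treated the same way here.
    """
    scheduled: dict[str, int] = {}
    for request_id, remaining in pending.items():
        if token_budget == 0:
            break  # budget exhausted; remaining requests wait for the next step
        # A long prompt is chunked: it receives only what is left of the budget.
        num_tokens = min(remaining, token_budget)
        if num_tokens > 0:
            scheduled[request_id] = num_tokens
            token_budget -= num_tokens
    return scheduled


# "req-a" is decoding (1 token); "req-b" and "req-c" still have prompt tokens.
pending = {"req-a": 1, "req-b": 10, "req-c": 4}
step = schedule_step(pending, token_budget=8)
print(step)
```

In this sketch `req-b` is scheduled for only part of its prompt, and `req-c` is left for a later step, which is the behavior the note calls chunked prefill.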

The V1 scheduler supports multiple scheduling policies, including First-Come,
First-Served (FCFS) and priority-based scheduling (where requests are processed
based on assigned priority, with FCFS as a tie-breaker), configurable via the
`--scheduling-policy` argument.
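
Priority ordering with an FCFS tie-breaker can be sketched with a heap keyed on `(priority, arrival_time)`, the two fields named in the commit history above. The classes below are hypothetical stand-ins, not the `PriorityRequestQueue` from this PR, and the convention that a lower `priority` value is served first is an assumption of this sketch.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class SketchRequest:
    # Comparison uses (priority, arrival_time); request_id is excluded.
    priority: int          # assumption: lower value is served first
    arrival_time: float    # FCFS tie-breaker among equal priorities
    request_id: str = field(compare=False)


class SketchPriorityQueue:
    """Hypothetical waiting queue ordered by priority, then arrival time."""

    def __init__(self) -> None:
        self._heap: list[SketchRequest] = []

    def add_request(self, request: SketchRequest) -> None:
        heapq.heappush(self._heap, request)

    def pop_request(self) -> SketchRequest:
        return heapq.heappop(self._heap)

    def __len__(self) -> int:
        return len(self._heap)


queue = SketchPriorityQueue()
queue.add_request(SketchRequest(priority=1, arrival_time=0.0, request_id="a"))
queue.add_request(SketchRequest(priority=0, arrival_time=2.0, request_id="c"))
queue.add_request(SketchRequest(priority=0, arrival_time=1.0, request_id="b"))

order = [queue.pop_request().request_id for _ in range(len(queue))]
print(order)  # ['b', 'c', 'a']
```

Requests `b` and `c` share the same priority, so the earlier arrival (`b`) is served first; `a` arrived earliest overall but waits because its priority value is higher.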

### Hardware

| Hardware | Status |