[RFC]: Replaceable Scheduler #7123

NadavShmayo · 2024-08-04T11:28:32Z

Motivation.

The default scheduler works well for the basic use case of serving with maximum throughput.
There are still use cases in which other metrics take priority over maximum throughput, for example maintaining fairness between different users.

I specifically have a use case in which an application that uses vLLM tries to maintain fairness between the requests of its different users.
By making the scheduler component more abstract and replaceable (perhaps also pluggable), we can support such use cases without having to change the core scheduler logic for each of them.

Proposed Change.

I propose two different solutions. The first may be hard to implement, but allows anyone to implement any scheduling logic they wish without changing any other core logic. The second is simple to implement but doesn't allow full control of the scheduler logic.

Solution 1 - Scheduler plugins

This solution requires defining an abstract base class for schedulers, and allowing users to pass the file path of the desired scheduler implementation as a CLI argument (or an environment variable).
This idea could also serve as the basis for scheduler plugins, meaning anyone could implement their own scheduler as a package separate from core vLLM, which allows for great extensibility and modularity.
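To make Solution 1 more concrete, here is a minimal sketch of what an abstract scheduler interface and a path-based loader could look like. Everything in it is hypothetical and for illustration only: `BaseScheduler`, `RoundRobinScheduler`, `load_scheduler_class`, and the method signatures are not vLLM's actual API, and a real interface would need to deal with sequence groups, KV cache blocks, and token budgets.

```python
from __future__ import annotations

import importlib.util
from abc import ABC, abstractmethod
from collections import deque
from pathlib import Path


class BaseScheduler(ABC):
    """Hypothetical scheduler interface (an assumption, not vLLM's real API)."""

    @abstractmethod
    def add_request(self, request_id: str, user_id: str) -> None:
        """Register a new pending request."""

    @abstractmethod
    def schedule(self, max_batch_size: int) -> list[str]:
        """Return the request ids to run in the next step."""


class RoundRobinScheduler(BaseScheduler):
    """Example plugin: takes one request per user in turn, for fairness."""

    def __init__(self) -> None:
        # user_id -> pending request ids; dicts preserve insertion order
        self._queues: dict[str, deque[str]] = {}

    def add_request(self, request_id: str, user_id: str) -> None:
        self._queues.setdefault(user_id, deque()).append(request_id)

    def schedule(self, max_batch_size: int) -> list[str]:
        batch: list[str] = []
        while len(batch) < max_batch_size and self._queues:
            for user_id in list(self._queues):
                if len(batch) >= max_batch_size:
                    break
                queue = self._queues[user_id]
                batch.append(queue.popleft())
                if not queue:
                    del self._queues[user_id]
        return batch


def load_scheduler_class(path: str, class_name: str = "Scheduler") -> type[BaseScheduler]:
    """Load a scheduler class from a Python file given by path.

    The path would come from a CLI argument or environment variable;
    the default class name "Scheduler" is an assumption of this sketch.
    """
    spec = importlib.util.spec_from_file_location(Path(path).stem, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load scheduler module from {path!r}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    cls = getattr(module, class_name)
    if not (isinstance(cls, type) and issubclass(cls, BaseScheduler)):
        raise TypeError(f"{class_name} must subclass BaseScheduler")
    return cls


scheduler = RoundRobinScheduler()
for request_id in ("a1", "a2", "a3"):
    scheduler.add_request(request_id, user_id="alice")
scheduler.add_request("b1", user_id="bob")
print(scheduler.schedule(max_batch_size=3))  # ['a1', 'b1', 'a2']
```

In a real plugin setup, the plugin file would import the base class from the core package so that the `issubclass` check passes. Discovery through package entry points instead of a file path would be another option.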

Solution 2 - Support voluntary preemption hooks

This solution is less flexible but should still support most scheduling logic.
It means that the Scheduler class should expose public methods to preempt/suspend and resume a SequenceGroup, and the API can then add routes that expose these methods.
This way, applications wrapping vLLM can implement their own complex scheduling logic, whether to give each user their fair share of scheduling or to apply any other desired policy.
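As a rough illustration of Solution 2, the sketch below shows a toy scheduler exposing public `suspend` and `resume` hooks that a wrapping application could call, for example through API routes. The class `HookableScheduler` and its methods are hypothetical names made up for this example. It tracks only request ids and ignores what real preemption would involve (such as swapping or recomputing KV cache).

```python
from __future__ import annotations


class HookableScheduler:
    """Toy scheduler with voluntary preemption hooks (hypothetical API)."""

    def __init__(self) -> None:
        # dict used as an ordered set of runnable request ids
        self._running: dict[str, None] = {}
        self._suspended: set[str] = set()

    def add_request(self, request_id: str) -> None:
        self._running[request_id] = None

    def suspend(self, request_id: str) -> bool:
        """Take a request out of scheduling; return False if it is not running."""
        if request_id not in self._running:
            return False
        del self._running[request_id]
        self._suspended.add(request_id)
        return True

    def resume(self, request_id: str) -> bool:
        """Make a suspended request schedulable again; return False if unknown."""
        if request_id not in self._suspended:
            return False
        self._suspended.remove(request_id)
        self._running[request_id] = None
        return True

    def schedule(self) -> list[str]:
        """Return the ids eligible to run in the next step."""
        return list(self._running)


scheduler = HookableScheduler()
scheduler.add_request("r1")
scheduler.add_request("r2")
scheduler.suspend("r1")      # the application decides r1 has had its share
print(scheduler.schedule())  # ['r2']
scheduler.resume("r1")
print(scheduler.schedule())  # ['r2', 'r1']
```

With hooks like these, the fairness policy lives entirely in the wrapping application, which decides when to suspend and resume each user's requests.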

Feedback Period.

No response

CC List.

No response

Any Other Things.

Just to make it clear, I'll be happy to implement this, but I want to hear some feedback before I go ahead.


njhill · 2024-08-06T21:33:43Z

FYI @apatke @saurabhjha1

apatke · 2024-08-07T12:16:34Z

Regarding Solution 2, PTAL at #6077 and let us know if you have any feedback.

NadavShmayo added the RFC label Aug 4, 2024

youkaichao mentioned this issue Aug 4, 2024

[RFC]: vLLM plugin system #7131

Open
