[V1][Perf] Simpler request output queues #15156
Conversation
Since we coalesce RequestOutput objects, we don't need to use an actual queue. This changes to merge the outputs when added rather than when removed.

Signed-off-by: Nick Hill <nhill@redhat.com>
This pull request has merge conflicts that must be resolved before it can be merged.
Looks good to me. Wondering if we should add some e2e test?
Signed-off-by: Nick Hill <nhill@redhat.com>

# Conflicts:
#	vllm/v1/engine/async_llm.py
#	vllm/v1/engine/llm_engine.py
#	vllm/v1/engine/parallel_sampling.py
LGTM. Only a nit. A unit test is definitely nice to have.
```python
else:
    self.output = output

async def get(self) -> RequestOutput:
```
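For context, here is a minimal self-contained sketch of the merge-on-add idea the diff excerpt comes from. Class and field names are illustrative, not necessarily vLLM's actual implementation: a single-slot holder replaces the queue, and `put()` coalesces into any not-yet-consumed output.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RequestOutput:
    """Stand-in for vLLM's RequestOutput, reduced to what the sketch needs."""
    request_id: str
    texts: list = field(default_factory=list)

    def add(self, other: "RequestOutput") -> None:
        # Coalesce: fold another output's deltas into this one.
        self.texts.extend(other.texts)


class OutputCollector:
    """Single-slot 'queue': outputs are merged when added, not when removed."""

    def __init__(self) -> None:
        self.ready = asyncio.Event()
        self.output: Optional[RequestOutput] = None

    def put(self, output: RequestOutput) -> None:
        if self.output is None:
            self.output = output
        else:
            # No backlog builds up; the pending output absorbs the new one.
            self.output.add(output)
        self.ready.set()

    async def get(self) -> RequestOutput:
        while (output := self.output) is None:
            await self.ready.wait()
        self.output = None
        self.ready.clear()
        return output


async def demo() -> RequestOutput:
    collector = OutputCollector()
    collector.put(RequestOutput("req-1", ["Hel"]))
    collector.put(RequestOutput("req-1", ["lo"]))
    # One merged output comes back, not two queued items.
    return await collector.get()


merged = asyncio.run(demo())
print(merged.texts)  # ['Hel', 'lo']
```

Note how `get()` re-checks `self.output` after waking, which is the re-check discussed in the review thread below.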
Do you think we should have an invariant that output is not None if self.ready.wait() is true?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is the case, but I'm not sure what you're suggesting to add here. self.ready.wait() just waits for the event to be set; it can only ever return True (I'm not even sure why it returns that rather than None). And then we immediately check self.output again before continuing.
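As a side note on that return value: asyncio.Event.wait() is documented to block until the event is set and then return True, so a truthiness check on it carries no information beyond "the wait completed":

```python
import asyncio


async def main() -> bool:
    event = asyncio.Event()
    event.set()
    # Event.wait() blocks until the event is set, then returns True.
    return await event.wait()


result = asyncio.run(main())
print(result)  # True
```

This is why the code re-checks self.output itself after waking rather than relying on the wait() result.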
Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>
added unit test
Signed-off-by: Nick Hill <nhill@redhat.com>
Thanks for adding a test @robertgshaw2-redhat! This should be good to merge now once the CI finishes.
Pull request merged: 9d72daf into vllm-project:main
Signed-off-by: Nick Hill <nhill@redhat.com>
Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>
Co-authored-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>
Queue operations showed up when profiling at high QPS.

Since we coalesce RequestOutput objects, we don't need to use an actual queue. This changes to merge the outputs when added rather than when removed.
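To make the contrast concrete, here is a hedged sketch (hypothetical helper names, not vLLM's code) of why merge-on-add is cheaper under backlog: with a real queue, a slow consumer must perform one get per backlogged item and merge afterwards, while merge-on-add has already folded them into a single pending value by the time the consumer wakes.

```python
import asyncio


async def drain_queue(chunks: list) -> list:
    """Queue-based: the consumer pays one get() per backlogged item."""
    q: asyncio.Queue = asyncio.Queue()
    for chunk in chunks:
        q.put_nowait(chunk)
    merged = []
    while not q.empty():  # N queue operations for N backlogged chunks
        merged.append(q.get_nowait())
    return merged


def merge_on_add(chunks: list) -> str:
    """Merge-on-add: the producer folds chunks into one pending value."""
    pending = None
    for chunk in chunks:
        pending = chunk if pending is None else pending + chunk
    return pending  # the consumer takes this in a single operation


queued = asyncio.run(drain_queue(["a", "b", "c"]))
coalesced = merge_on_add(["a", "b", "c"])
print(queued, coalesced)  # ['a', 'b', 'c'] abc
```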