[Ray] Improve documentation on batch inference #16609

richardliaw · 2025-04-14T18:19:43Z

This commit updates the batch inference example to leverage Ray Data's new
native vLLM integration (ray.data.llm) introduced in Ray 2.44. Changes include:

Replace custom LLMPredictor implementation with Ray Data's built-in vLLM processor
Add configuration for continuous batching and other vLLM optimizations
Include batch inference example in CI pipeline tests

Signed-off-by: Richard Liaw <rliaw@berkeley.edu>

github-actions · 2025-04-14T18:19:52Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run fastcheck CI which starts running only a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on Buildkite UI (linked in the PR checks section) and unblock them. If you do not have permission to unblock, ping simon-mo or khluu to add you in our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

🚀

Signed-off-by: Richard Liaw <rliaw@berkeley.edu>

Signed-off-by: Richard Liaw <rliaw@berkeley.edu> Signed-off-by: Yang Wang <elainewy@meta.com>

Signed-off-by: Richard Liaw <rliaw@berkeley.edu>

Signed-off-by: Richard Liaw <rliaw@berkeley.edu> Signed-off-by: Agata Dobrzyniewicz <adobrzyniewicz@habana.ai>

Signed-off-by: Richard Liaw <rliaw@berkeley.edu> Signed-off-by: Mu Huai <tianbowen.tbw@antgroup.com>

fix

ade57e8

Signed-off-by: Richard Liaw <rliaw@berkeley.edu>

update

f0b085b

Signed-off-by: Richard Liaw <rliaw@berkeley.edu>

mergify bot added documentation Improvements or additions to documentation ci/build labels Apr 14, 2025

richardliaw added the ready ONLY add when PR is ready to merge/full CI is needed label Apr 14, 2025

simon-mo approved these changes Apr 14, 2025

View reviewed changes

simon-mo enabled auto-merge (squash) April 14, 2025 19:08

auto-merge was automatically disabled April 14, 2025 23:44
Head branch was pushed to by a user without write access

requirements

870fd44

Signed-off-by: Richard Liaw <rliaw@berkeley.edu>

richardliaw force-pushed the ray-improvements branch from c70d800 to 870fd44 Compare April 14, 2025 23:45

richardliaw added 3 commits April 15, 2025 12:09

requirements

536534c

Signed-off-by: Richard Liaw <rliaw@berkeley.edu>

update

42c6c92

Signed-off-by: Richard Liaw <rliaw@berkeley.edu>

update

047fe42

Signed-off-by: Richard Liaw <rliaw@berkeley.edu>

simon-mo merged commit 8cac35b into vllm-project:main Apr 17, 2025
24 checks passed

lionelvillard pushed a commit to lionelvillard/vllm that referenced this pull request Apr 17, 2025

[Ray] Improve documentation on batch inference (vllm-project#16609)

cc5347c

Signed-off-by: Richard Liaw <rliaw@berkeley.edu>

yangw-dev pushed a commit to yangw-dev/vllm that referenced this pull request Apr 21, 2025

[Ray] Improve documentation on batch inference (vllm-project#16609)

7ec0d7f

Signed-off-by: Richard Liaw <rliaw@berkeley.edu> Signed-off-by: Yang Wang <elainewy@meta.com>

jikunshang pushed a commit to jikunshang/vllm that referenced this pull request Apr 29, 2025

[Ray] Improve documentation on batch inference (vllm-project#16609)

ef187a4

Signed-off-by: Richard Liaw <rliaw@berkeley.edu>

lk-chen pushed a commit to lk-chen/vllm that referenced this pull request Apr 29, 2025

[Ray] Improve documentation on batch inference (vllm-project#16609)

0411539

Signed-off-by: Richard Liaw <rliaw@berkeley.edu>

adobrzyn pushed a commit to HabanaAI/vllm-fork that referenced this pull request Apr 30, 2025

[Ray] Improve documentation on batch inference (vllm-project#16609)

816440c

Signed-off-by: Richard Liaw <rliaw@berkeley.edu> Signed-off-by: Agata Dobrzyniewicz <adobrzyniewicz@habana.ai>

RichardoMrMu pushed a commit to RichardoMrMu/vllm that referenced this pull request May 12, 2025

[Ray] Improve documentation on batch inference (vllm-project#16609)

475de0e

Signed-off-by: Richard Liaw <rliaw@berkeley.edu> Signed-off-by: Mu Huai <tianbowen.tbw@antgroup.com>

ckhordiasma mentioned this pull request May 14, 2025

nm vllm ent 0.8.5 sync red-hat-data-services/vllm#139

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Uh oh!

[Ray] Improve documentation on batch inference #16609

[Ray] Improve documentation on batch inference #16609

Uh oh!

richardliaw commented Apr 14, 2025 •

edited by github-actions bot

Loading

Uh oh!

github-actions bot commented Apr 14, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Uh oh!

[Ray] Improve documentation on batch inference #16609

[Ray] Improve documentation on batch inference #16609

Uh oh!

Conversation

richardliaw commented Apr 14, 2025 • edited by github-actions bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions bot commented Apr 14, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

richardliaw commented Apr 14, 2025 •

edited by github-actions bot

Loading