[Bugfix] Do not crash V0 engine on input errors #13101

joerunde · 2025-02-11T17:18:34Z

This PR adds exception handling to the V0 engine that catches errors raised while processing the input for a sequence group and removes that sequence group from the current batch instead of crashing the entire engine. We see this a lot when processing invalid image data for multimodal models; the mllama models are particularly bad offenders.

With this change, users will receive a 400 instead of a 500 when their request fails input processing in the engine, and other running requests will continue to process instead of also receiving 500s.
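As a rough illustration of the behaviour described above, here is a minimal sketch in which an engine step drops only the request whose input failed processing. It is not the actual vLLM code: `ToyEngine`, `_prepare`, `running`, and `failed` are hypothetical names, and `int(raw)` merely stands in for real input processing. Only the `InputProcessingError(request_id, message)` shape and the "return empty outputs on failure" idea come from this PR.

```python
from dataclasses import dataclass, field


class InputProcessingError(Exception):
    """Toy stand-in: records which request had bad input."""

    def __init__(self, request_id: str, message: str):
        super().__init__(
            f"Failed to prepare inputs for request {request_id}: {message}")
        self.request_id = request_id


@dataclass
class ToyEngine:
    # request_id -> raw input; hypothetical stand-in for the running batch
    running: dict = field(default_factory=dict)
    # request_id -> error text for requests that were dropped
    failed: dict = field(default_factory=dict)

    def _prepare(self, request_id, raw):
        try:
            return int(raw)  # stand-in for real input processing
        except Exception as e:
            # Tag the failure with the ID of the offending request
            raise InputProcessingError(request_id, str(e)) from e

    def step(self):
        try:
            prepared = {
                rid: self._prepare(rid, raw)
                for rid, raw in self.running.items()
            }
        except InputProcessingError as e:
            # Drop only the offending request; the others stay scheduled
            self.running.pop(e.request_id)
            self.failed[e.request_id] = str(e)
            return {}  # explicitly return empty outputs for this step
        return prepared
```

In this sketch a serving layer could look up `failed` and answer that one request with a 400, while the remaining requests are processed on the next call to `step()`.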

Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>

github-actions · 2025-02-11T17:18:46Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

🚀

njhill

Thanks @joerunde and sorry for taking so long to properly digest/review this.

Given the state of v0 code, this looks reasonable to me :)

njhill · 2025-02-25T00:36:44Z

vllm/worker/model_runner.py

+        except Exception as e:
+            # Raise an exception that tracks the ID of the bad request
+            raise InputProcessingError(seq_group_metadata.request_id,
+                                       str(e)) from e


How about moving this try/except to the place where this method is called, so that it contains just that one line?

Sure, that does sound like a better idea
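The suggestion above can be sketched roughly as follows: the per-request helper stays free of error handling, and the caller wraps only the single call. This is an illustrative sketch, not the code in `vllm/worker/model_runner.py`; `add_seq_group`, `build_batch`, and the dict-based sequence groups are hypothetical, and the exception class is a toy stand-in.

```python
class InputProcessingError(Exception):
    """Toy stand-in that tracks the ID of the bad request."""

    def __init__(self, request_id, message):
        super().__init__(message)
        self.request_id = request_id


def add_seq_group(batch, seq_group):
    """Hypothetical per-request input processing; may raise on bad data."""
    if seq_group["image"] is None:
        raise ValueError("invalid image data")
    batch.append(seq_group["request_id"])


def build_batch(seq_groups):
    batch = []
    for seq_group in seq_groups:
        try:
            # The try block contains only the one call that can fail
            add_seq_group(batch, seq_group)
        except Exception as e:
            # Re-raise with the request ID so the engine knows what to drop
            raise InputProcessingError(seq_group["request_id"],
                                       str(e)) from e
    return batch
```

Keeping the `try` this narrow means unrelated failures elsewhere in the batch-building loop are not mislabelled as input errors for a particular request.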

njhill · 2025-02-25T00:46:30Z

Probably worth merging in latest main now anyhow, which will kick off all the tests again.

joerunde · 2025-02-25T16:49:15Z

> Given the state of v0 code, this looks reasonable to me :)

Hah, reading that as: "This code may be rough but it's not worse than the surroundings"

joerunde added 4 commits February 6, 2025 12:57

🥅 Handle input errors in engine

df0fb3f

Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>

✅ Add test for input processing failures

36ac644

Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>

🎨 cleanup

54cd257

Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>

🔥 cleanup abort

aeaf2eb

Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>

joerunde requested review from alexm-redhat, comaniac, njhill, youkaichao and zhuohan123 as code owners February 11, 2025 17:18

joerunde commented Feb 11, 2025

joerunde added 2 commits February 11, 2025 11:51

🎨 explicitly return empty outputs on failure

234a826

Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>

♻️ move input processing error def

49ad2b1

Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>

hmellor reviewed Feb 12, 2025

♻️ fix name and add comment

e90af36

Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>

joerunde added the ready label (ONLY add when PR is ready to merge/full CI is needed) Feb 17, 2025

njhill approved these changes Feb 25, 2025

DarkLight1337 mentioned this pull request Feb 25, 2025

[Bug]: meta-llama/Llama-3.2-90B-Vision-Instruct and Qwen/Qwen2-VL-72B-Instruct models fails with asyncio.exceptions.CancelledError when using wiki image URLs #10904

Closed

1 task

DarkLight1337 enabled auto-merge (squash) February 25, 2025 08:20

DarkLight1337 disabled auto-merge February 25, 2025 08:20

joerunde added 2 commits February 25, 2025 11:49

♻️ move try/catch to be more concise

5ba9c4c

Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>

Merge remote-tracking branch 'upstream/main' into input-prep-handling

f7ed8c3

DarkLight1337 merged commit 3f808cc into vllm-project:main Feb 26, 2025
46 checks passed

Akshat-Tripathi pushed a commit to krai/vllm that referenced this pull request Mar 3, 2025

[Bugfix] Do not crash V0 engine on input errors (vllm-project#13101)

ddd560f

Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>

lulmer pushed a commit to lulmer/vllm that referenced this pull request Apr 7, 2025

[Bugfix] Do not crash V0 engine on input errors (vllm-project#13101)

3c94d04

Signed-off-by: Joe Runde <Joseph.Runde@ibm.com> Signed-off-by: Louis Ulmer <ulmerlouis@gmail.com>

ckhordiasma mentioned this pull request Apr 17, 2025

[do not merge] pr test for nm changes into 2.20 red-hat-data-services/vllm#107

Closed

shreyankg pushed a commit to shreyankg/vllm that referenced this pull request May 3, 2025

[Bugfix] Do not crash V0 engine on input errors (vllm-project#13101)

edf26db

Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
