@flesher0813 flesher0813 commented Oct 17, 2025

Purpose

What this PR does / why we need it?

Apply the merged PR vllm-project/vllm#19330 to support load-failure handling.

Modifications

Does this PR introduce any user-facing change?

Adapts to the current vLLM main-branch code and adds load-failure handling.

Test

How was this patch tested?

Tested with an offline script, manually constructing normal-load, load-failure, and preemption scenarios.
[screenshot: test run output]
Compared the KV caches that were loaded: a preempted request loads the same KV cache again, and when loading fails, the request still generates outputs without the uc connector.
[screenshot: KV cache comparison]
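The fallback behavior exercised above can be sketched as follows. This is an illustrative toy, not the actual vLLM or uc-connector API: `KVConnector`, `load_kv`, and `prepare_request` are hypothetical names standing in for the real code paths. The point is the pattern: try to load the KV cache from the connector; if that fails, fall back to recomputing without the connector so the request still produces output.

```python
class KVConnector:
    """Toy stand-in for an external KV-cache connector (hypothetical)."""

    def __init__(self, store):
        # store maps request_id -> cached KV blocks
        self.store = store

    def load_kv(self, request_id):
        # Simulate a load failure for requests with no cached entry.
        if request_id not in self.store:
            raise KeyError(f"no KV cache for {request_id}")
        return self.store[request_id]


def prepare_request(connector, request_id):
    """Try the connector first; on load failure, recompute without it."""
    try:
        kv = connector.load_kv(request_id)
        return {"request_id": request_id, "kv": kv, "source": "connector"}
    except KeyError:
        # Load failure: drop back to computing the KV cache locally,
        # so the request still generates outputs (without the connector).
        return {"request_id": request_id, "kv": None, "source": "recompute"}


conn = KVConnector({"req-1": ["block-a", "block-b"]})
# Normal load and a preemption-style reload hit the same cached blocks.
print(prepare_request(conn, "req-1")["source"])  # connector
print(prepare_request(conn, "req-1")["kv"])      # ['block-a', 'block-b']
# Load failure falls back to recomputation instead of erroring out.
print(prepare_request(conn, "req-2")["source"])  # recompute
```

Under this sketch, a preempted request reloading its KV cache gets identical blocks back from the connector, matching the comparison described in the test above.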

@flesher0813 flesher0813 changed the title [WIP][Fix] Fix gpu_model_runner req_state update error for issue 283 [Fix] Fix gpu_model_runner req_state update error for issue 283 Oct 20, 2025
Signed-off-by: flesher0813 <1208954694@qq.com>
@flesher0813 flesher0813 merged commit d2f3d9a into ModelEngine-Group:develop Oct 21, 2025
3 checks passed
