
Conversation

@liuyumoye (Contributor) commented Jul 24, 2025

Purpose

This PR implements an exception handling mechanism for cases where loading the KV cache from remote storage fails.

The key point of this scheme is that KV cache blocks are allocated in advance for loading from the remote KV store; we wait for the load to complete, and then schedule the request for execution according to the actual number of loaded tokens.

We reused the process and some interfaces of the Nixl connector to minimize additional modifications to vLLM. Naturally, this mechanism is also applicable to the Nixl Connector, requiring only minimal adaptation.
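
To make the flow concrete, below is a minimal sketch of the scheduler-side idea only; it is not the actual PR diff. The dict of actually loaded token counts (called finish_loading_dict in the review below) is real, but the request fields and helper shown here are simplified assumptions.

from dataclasses import dataclass

@dataclass
class Request:  # simplified stand-in for vLLM's request object
    request_id: str
    num_computed_tokens: int  # optimistically includes the remote KV prefix

def apply_finished_loading(waiting: list[Request],
                           requests: dict[str, Request],
                           finish_loading_dict: dict[str, int]) -> None:
    # finish_loading_dict maps request_id -> tokens actually loaded remotely.
    for req_id, num_loaded in finish_loading_dict.items():
        req = requests[req_id]
        # Blocks were pre-allocated for the full remote prefix; clamp the
        # computed-token count to what really arrived so the missing suffix
        # is recomputed instead of being treated as cached.
        req.num_computed_tokens = min(req.num_computed_tokens, num_loaded)
        waiting.append(req)  # make the request schedulable again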

Test Plan

In the folder tests/v1/kv_connector/kv_load_exception_handling, I have written a mock async_offload_connector that simulates the scenario where the number of KV tokens actually loaded is less than the prompt length.
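
For reference, a rough sketch of what such a mock connector could look like, assuming the get_finished_loading interface discussed in the review below; names and structure here are illustrative, and the real implementation lives in the folder above.

import random

class AsyncOffloadConnector:  # mock: simulates partial remote KV loads
    def __init__(self) -> None:
        self._finished: dict[str, int] = {}

    def start_load_kv(self, request_id: str, num_hit_tokens: int) -> None:
        # Pretend only a random prefix of the hit tokens was actually loaded.
        self._finished[request_id] = random.randint(0, num_hit_tokens)

    def get_finished_loading(self) -> dict[str, int]:
        # Report request_id -> number of tokens actually loaded, then reset.
        finished, self._finished = self._finished, {}
        return finished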

Test Result

You can use the following commands to start a vLLM service that includes the mock connector, then send any request to the service:

cd tests/v1/kv_connector/kv_load_exception_handling
bash test.sh

@gemini-code-assist bot (Contributor) left a comment


Code Review

This pull request introduces an exception handling mechanism for failures when loading the KV cache from a remote store. The approach involves pre-allocating KV cache blocks, attempting to load from the remote store, and then scheduling requests based on the number of tokens that were successfully loaded.

The changes are well-structured, primarily involving plumbing a new finish_loading_dict from the worker to the scheduler to communicate the load status. The core logic in the scheduler and model runner appears sound. My review has identified a couple of issues in the new mock connector (async_offload_connector.py) used for testing, which include type mismatches and an incorrect return type. Addressing these will improve the correctness and robustness of the test suite.

@github-actions

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, covering a small and essential subset of tests to quickly catch errors. You can run additional CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

@liuyumoye force-pushed the exception_handling branch from b3862a0 to dfa2dbc on July 25, 2025 02:24
@mergify bot commented Jul 25, 2025

This pull request has merge conflicts that must be resolved before it can be merged. Please rebase the PR, @liuyumoye.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify bot added the needs-rebase label on Jul 25, 2025
@liuyumoye force-pushed the exception_handling branch from dfa2dbc to 286057f on July 25, 2025 03:00
@mergify bot removed the needs-rebase label on Jul 25, 2025
@liuyumoye force-pushed the exception_handling branch 7 times, most recently from ebf30e2 to 609b7c9, on July 25, 2025 11:33
@ApostaC (Collaborator) left a comment


I like this PR! It looks clean and the changes are small. Just left a few quick comments.
Also, fault tolerance is a very important topic, so thanks for contributing!

Collaborator

nit: add a comment here to briefly describe what the keys and values are

Collaborator

Same here

Comment on lines 57 to 62
Collaborator

I like the design that it doesn't need to modify any of the connector API. But a minor concern is that we don't have a docstring describing get_finished_loading's expected behavior.

One proposal is to add this function to the KV connector's base class and give it a dummy implementation like return {}. In that case, we can also skip the hasattr check here.
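
A minimal sketch of that proposal, assuming KVConnectorBase_V1 is the base class in question; the default body and docstring wording are illustrative only.

class KVConnectorBase_V1:  # simplified stand-in for the real base class
    def get_finished_loading(self) -> dict[str, int]:
        """Return request_id -> number of tokens actually loaded from the
        remote KV store. Connectors that never load asynchronously (or never
        fail) can keep this default, which reports nothing."""
        return {}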

Contributor Author

Thanks for the detailed feedback! I agree that adding a docstring for get_finished_loading would make the behavior clearer, and moving it to the base class with a default implementation is a great idea to simplify the code. Appreciate your suggestions! 😊

Collaborator

Maybe change the name to "RandomDropConnector" or some other descriptive name, and add a docstring noting that this is for fault-tolerance testing?

Contributor Author

Great suggestion! I've renamed the class to RandomDropConnector and updated the docstring to explicitly mention that it's designed for fault tolerance testing. This should make the purpose of the class much clearer. Thanks for pointing this out! 😊

Collaborator

Same here

@liuyumoye force-pushed the exception_handling branch from 609b7c9 to 01659a7 on July 26, 2025 05:23
@liuyumoye (Contributor Author) commented

Thanks for these suggestions! I've added some comments and moved get_finished_loading to the base class. Let me know if there's anything else that needs improvement! 😊

@liuyumoye force-pushed the exception_handling branch from 01659a7 to d41ca56 on July 26, 2025 06:05
@KuntaiDu (Collaborator) left a comment


LGTM!

@KuntaiDu enabled auto-merge (squash) on July 26, 2025 21:22
@github-actions bot added the ready label (ONLY add when PR is ready to merge/full CI is needed) on Jul 26, 2025
Signed-off-by: liuyumoye <adeline_ly2023@outlook.com>
@KuntaiDu enabled auto-merge (squash) on July 28, 2025 02:21
@vllm-bot merged commit 15a72ac into vllm-project:main on Jul 28, 2025
68 of 70 checks passed
@sdavidbd (Contributor) commented

Hi @liuyumoye, @ApostaC, @KuntaiDu,

Please note that this PR was merged without any reference to the ongoing work in #19330, which has been open for several weeks and discussed extensively with both community members and core maintainers. Given that some of the contributors and reviewers here were aware of that effort, it would have been helpful to coordinate or at least cross-reference to ensure alignment.

This PR appears to overlap significantly with the goals and scope of #19330, which introduces KV load failure recovery — a feature that has been carefully designed, reviewed, and extended with a clear roadmap.

Moreover, this PR addresses only asynchronous KV loading, lacks test coverage, and moves us further away from the planned block-based connector API. In contrast, #19330 provides support for both sync and async KV loading, and tensor/pipeline parallelism, along with a more comprehensive and validated test setup.

To avoid duplication, ensure correctness, and converge on a unified and robust solution, I suggest reverting this PR and continuing the work within the scope of #19330. That PR was ready to merge until just a few hours ago and can move forward immediately once this is reverted.

Tagging @njhill @youkaichao @robertgshaw2-redhat @NickLucche @orozery for visibility. I’d appreciate your thoughts on how we can align and move forward collaboratively.

Contributor

This example only runs an online serving instance without specifying input or expected output. While it can verify that requests with load failures complete, it does not validate the correctness of the output.

Collaborator

yeah this looks unnecessary, it should probably belong in examples or even better just in the docs

finished_set.add(req_id)
del remaining_count_dict[req_id]

def update_finished_load_dict(worker_finished_loading_dict: dict[str,
Contributor

This aggregation logic assumes that all outputs arrive in the same step. If that’s not the case, it may prematurely propagate the first worker’s output back to the scheduler. In particular, it doesn’t appear to support pipeline-parallel setups.
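
One possible way to make the aggregation robust, sketched below under the assumption that each of world_size workers reports a per-request token count: only forward a request to the scheduler once every worker has reported it, taking the minimum across workers. The function name mirrors the excerpt above, but the signature and logic are illustrative rather than the PR's code.

def update_finished_load_dict(
    pending: dict[str, tuple[int, int]],           # req_id -> (min_tokens, reports)
    worker_finished_loading_dict: dict[str, int],  # one worker's report
    world_size: int,
) -> dict[str, int]:
    ready: dict[str, int] = {}
    for req_id, num_tokens in worker_finished_loading_dict.items():
        min_tokens, reports = pending.get(req_id, (num_tokens, 0))
        pending[req_id] = (min(min_tokens, num_tokens), reports + 1)
        if pending[req_id][1] == world_size:       # all workers have reported
            ready[req_id] = pending.pop(req_id)[0]
    return ready  # safe to propagate to the scheduler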

if num_actual_load_tokens <= 0 and hasattr(self.connector,
                                           "add_failure_request"):
    self.connector.add_failure_request(request)
    return True
Contributor

In some cases, when a request is rescheduled with 'num_computed_tokens = 0', a local cache hit may occur, which can lead to a failure during 'save_new_computed_blocks' (see: https://github.com/vllm-project/vllm/blob/2bb7358d467e00b3a0472516d32ef99cc75d456f/vllm/v1/core/sched/scheduler.py#L1013).

(block_ids, ) = self.kv_cache_manager.get_block_ids(request.request_id)
return self.connector.request_finished(request, block_ids)

def _update_actual_load_token_num_from_remote_kv(self,
Contributor

This method should likely also handle removing the request from finished_recving_kv_req_ids

Comment on lines +1107 to +1109
if num_actual_load_tokens <= 0 and hasattr(self.connector,
                                           "add_failure_request"):
    self.connector.add_failure_request(request)
Contributor

This code assumes the presence of a custom method (add_failure_request) that is not part of the official KVConnectorBase_V1 interface. Calling it from core logic breaks abstraction and risks incompatibility. If this functionality is necessary, it should be formalized as part of the KVConnectorBase_V1 interface.

Arguably, the add_failure_request method is not necessary in the core interface. Failure handling can be delegated to a custom connector implementation that wraps the actual connector and provides the desired behavior, keeping the core KVConnectorBase_V1 interface clean and minimal.
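
A rough sketch of that wrapper idea, assuming a hypothetical FailureHandlingConnector that delegates to the real connector; add_failure_request and the request_id attribute are assumptions for illustration, not part of the official KVConnectorBase_V1 API.

class FailureHandlingConnector:
    def __init__(self, inner) -> None:
        self.inner = inner                     # the real connector
        self.failed_request_ids: set[str] = set()

    def add_failure_request(self, request) -> None:
        # Track requests whose remote load produced no usable tokens.
        self.failed_request_ids.add(request.request_id)

    def get_finished_loading(self) -> dict[str, int]:
        finished = self.inner.get_finished_loading()
        # Report 0 loaded tokens for failed requests so the scheduler
        # recomputes them from scratch.
        for req_id in self.failed_request_ids:
            finished.setdefault(req_id, 0)
        self.failed_request_ids.clear()
        return finished

    def __getattr__(self, name):
        # Delegate everything else to the wrapped connector.
        return getattr(self.inner, name)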

@NickLucche (Collaborator) commented Jul 28, 2025

I think we just lacked some communication, and we had fewer eyes on it over the weekend.
I am also in favor of reverting to a clean state and addressing any concerns raised by this work over in #19330.
We can also probably reuse the unit test we have here as an additional case.
Regardless, thanks a lot for the work done in this PR.

@liuyumoye (Contributor Author) commented Jul 28, 2025

Hi @sdavidbd @NickLucche,

Thank you for the reminder. I apologize for not being fully aware of the PR merging guidelines. I’ll make sure to reference any ongoing work in future PRs and follow the proper process moving forward. I appreciate your understanding and will be more mindful of this in the future.

The primary issue we’re currently focused on is exception handling in asynchronous loading scenarios. As technology evolves, we anticipate that asynchronously loading the KV Cache from remote storage will become a common operation for connectors. Therefore, our goal is to handle exceptions for these asynchronous loading scenarios.

When comparing different approaches, we found that while the block-based solution offers more comprehensive coverage, it is also significantly more complex to implement. In practice, KV loading exceptions are already rare, and the scenario where an intermediate block fails to load while subsequent blocks succeed within a continuous request is even less common (especially given that eviction policies like LRU are typically in place). Additionally, to fully leverage non-contiguous blocks, further adjustments to the attention metadata would be required to enable segmented execution of flash attention. However, whether these changes would yield significant performance improvements remains to be validated.

In contrast, the request-based solution is more lightweight and requires fewer modifications to vLLM's existing scheduling logic. Rather than overcomplicating the design, perhaps the most suitable solution is simply the one that best fits the problem.

Tagging @youkaichao @KuntaiDu @ApostaC for visibility. I’d love to hear your input on how we can align our efforts and work together to move this forward.

KuntaiDu added a commit that referenced this pull request Jul 28, 2025
…re (#21534)"

This reverts commit 15a72ac.

Signed-off-by: KuntaiDu <kuntai@uchicago.edu>
@sdavidbd (Contributor) commented

Hi @liuyumoye — thanks for your response, I appreciate your openness to align going forward.

Regarding the block-based failure recovery approach in #19330, I wanted to briefly highlight some of the key motivations behind that direction — beyond enabling future reuse of non-contiguous blocks (which I believe is both feasible and increasingly important, especially for RAG-style and cache-similarity-based use cases).

There are also immediate benefits:

  • Connector visibility: The connector naturally sees which blocks fail but typically lacks the context to map failures back to specific requests. Avoiding per-request tracking keeps connector implementations simpler and more decoupled.

  • One-to-many dependency: A single failed block can affect multiple requests within a batch. Reporting failed blocks allows the scheduler — which already maintains request-to-block mappings — to identify and reschedule any affected request.

  • Granularity of recovery: In many cases, only a small subset of a request’s blocks may fail. Discarding the entire request would be wasteful, especially if most of the data has been successfully loaded. Block-level reporting enables fine-grained recovery.

The longer-term goals are also quite tangible:
We're actively working on enhancing KV reuse beyond prefix matching, and this design directly supports that direction. It also aligns with other community-driven improvements like the block-based CPU offloading connector (see #19854).

If you haven’t had a chance yet, I’d recommend catching up on the discussions in #19330 — many of these tradeoffs were discussed there in more depth.

Looking forward to aligning and building a unified solution!

num_actual_load_tokens = random.randint(0, len(hit_tokens))
return num_actual_load_tokens

def get_finished_loading(self) -> dict[str, int]:
Collaborator

It has a different interface from KVConnectorBase_V1.

Contributor Author

Thanks for the reminder. As a follow-up, we intend to build a RandomDropStorageConnector that performs an actual load and then discards tokens, in #22075.

@liuyumoye (Contributor Author) commented

Hi @sdavidbd - Thank you for explaining the highlights of the block-based strategy.

I have a few questions and would like to discuss them further:

  1. In the scenario of synchronous execution with disaggregated prefill, if a direct layerwise transfer approach is adopted between the Prefill instance and the Decode instance, the current rescheduling method might result in an incorrect KV cache being passed to Decode.
  2. In the asynchronous scenario, we believe the situation where "reading from remote storage fails, and only a small subset of a request's blocks fails" is relatively rare, or perhaps tied to the eviction strategy of the KV cache storage. However, I noticed that you consider this a common occurrence. Could you elaborate on these scenarios? Perhaps the contexts we're dealing with differ.

Looking forward to collaborating on a unified solution that addresses these challenges effectively!

@sdavidbd (Contributor) commented

  1. On the "direct layerwise transfer" setup — if you're referring to streaming KV blocks from Prefill instance to Decode instance as they're computed, then yes, there’s a risk of invalid data being transferred when computation depends on externally cached blocks that failed to load. The proposed recovery mechanism addresses this by rescheduling those blocks for recomputation. As for retransferring, I think the transfer connector should handle it. If it decides which blocks to send based on num_computed_tokens, then once the request is updated, the necessary blocks should be re-sent automatically.

  2. I agree that loading errors should generally be rare. My point was that when failures do happen — for example, due to a temporary network issue — they can impact any block, not necessarily an entire request. This can also happen with eviction strategies or tiered storage. The block-based approach helps by tracking failures at a more granular level, making it possible to recover more efficiently without discarding unaffected parts.

Let me know if you see any gaps or have a different take on these cases.

@liuyumoye (Contributor Author) commented

I'm glad we agree that such scenarios are rare, so I believe a lightweight solution should be adopted to handle these errors, in order to avoid introducing too many modifications to the vLLM system.
Additionally, thank you for your review of this PR; your feedback has been very helpful. I've made improvements based on your suggestions and submitted a new PR, #22075. Please feel free to share any further thoughts!

HsChen-sys pushed a commit to HsChen-sys/vllm that referenced this pull request Aug 1, 2025
…-project#21534)

Signed-off-by: liuyumoye <adeline_ly2023@outlook.com>
Co-authored-by: liuyumoye <adeline_ly2023@outlook.com>
x22x22 pushed a commit to x22x22/vllm that referenced this pull request Aug 5, 2025
…-project#21534)

Signed-off-by: liuyumoye <adeline_ly2023@outlook.com>
Co-authored-by: liuyumoye <adeline_ly2023@outlook.com>
Signed-off-by: x22x22 <wadeking@qq.com>
Pradyun92 pushed a commit to Pradyun92/vllm that referenced this pull request Aug 6, 2025
…-project#21534)

Signed-off-by: liuyumoye <adeline_ly2023@outlook.com>
Co-authored-by: liuyumoye <adeline_ly2023@outlook.com>
npanpaliya pushed a commit to odh-on-pz/vllm-upstream that referenced this pull request Aug 6, 2025
…-project#21534)

Signed-off-by: liuyumoye <adeline_ly2023@outlook.com>
Co-authored-by: liuyumoye <adeline_ly2023@outlook.com>
jinzhen-lin pushed a commit to jinzhen-lin/vllm that referenced this pull request Aug 9, 2025
…-project#21534)

Signed-off-by: liuyumoye <adeline_ly2023@outlook.com>
Co-authored-by: liuyumoye <adeline_ly2023@outlook.com>
Signed-off-by: Jinzhen Lin <linjinzhen@hotmail.com>
paulpak58 pushed a commit to paulpak58/vllm that referenced this pull request Aug 13, 2025
…-project#21534)

Signed-off-by: liuyumoye <adeline_ly2023@outlook.com>
Co-authored-by: liuyumoye <adeline_ly2023@outlook.com>
Signed-off-by: Paul Pak <paulpak58@gmail.com>
diegocastanibm pushed a commit to diegocastanibm/vllm that referenced this pull request Aug 15, 2025
…-project#21534)

Signed-off-by: liuyumoye <adeline_ly2023@outlook.com>
Co-authored-by: liuyumoye <adeline_ly2023@outlook.com>
Signed-off-by: Diego-Castan <diego.castan@ibm.com>
epwalsh pushed a commit to epwalsh/vllm that referenced this pull request Aug 28, 2025
…-project#21534)

Signed-off-by: liuyumoye <adeline_ly2023@outlook.com>
Co-authored-by: liuyumoye <adeline_ly2023@outlook.com>
zhewenl pushed a commit to zhewenl/vllm that referenced this pull request Aug 28, 2025
…-project#21534)

Signed-off-by: liuyumoye <adeline_ly2023@outlook.com>
Co-authored-by: liuyumoye <adeline_ly2023@outlook.com>

Labels

ready (ONLY add when PR is ready to merge/full CI is needed), v1

7 participants