[P/D][V1] Add generic KV Connector for delegation to external implementations #17840
Conversation
@sdavidbd I don't really follow what this buys you. If there is an external implementation of the V1 connector API then you can use it directly. If the external library doesn't implement the vLLM connector API directly, then a custom adapter implementation is needed regardless?
Thanks Nick! Using an external implementation of KVConnectorBase_V1 directly means each vendor has to ship and wire in its own connector class, which duplicates boilerplate across integrations. The goal of GenericKVConnector is to avoid that: it delegates all connector methods to an implementation loaded at runtime, so vendors can keep their logic in their own packages without changes to the vLLM codebase.
Thanks @sdavidbd. But I still don't see why this is needed. You can just call the below before starting vLLM and then reference it directly in the passed KVTransferConfig:

    KVConnectorFactory.register_connector(
        "MyConnector",
        "myconnector.module.path",
        "MyConnectorClass")
@njhill I think the main issue is getting the call to KVConnectorFactory.register_connector to happen before vLLM starts. For standalone vLLM execution, you may wind up writing wrappers such as this:

    #!/usr/bin/env python
    import re
    import sys

    from vllm.entrypoints.cli.main import main
    # Just import this so it gets registered as a plugin
    from vua.vllm.kv_connector_v1 import VUAStorageConnector_V1

    if __name__ == '__main__':
        sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
        sys.exit(main())

And not as a setuptools entrypoint, otherwise it breaks; I've attempted other ways, and in each of them the problem was the multithreading usage in vLLM.
This pull request has merge conflicts that must be resolved before it can be merged.
@sdavidbd @da-x @njhill We are seeing a similar need: we have an external KV connector implementation (a vendor-internal one that we can't upstream via a PR). The wrapper approach wouldn't work for us with TP > 1, because the different GPU workers run in their own processes. Another proposal: do you think we could just pass the KV connector class as a parameter to the core engine, like we already do with the model executor? (See vllm/v1/engine/async_llm.py, lines 118 to 122 at 289199f.)
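To illustrate that proposal only (the kv_connector_cls parameter and the vendor package below are hypothetical and do not exist in vLLM), it could look roughly like:

    from vllm.engine.arg_utils import AsyncEngineArgs
    from vllm.v1.engine.async_llm import AsyncLLM
    # Hypothetical vendor-internal connector class
    from my_vendor_pkg.connector import MyVendorConnector

    engine_args = AsyncEngineArgs(model="facebook/opt-125m")

    # Hypothetical API sketch: inject the connector class the same way the
    # executor class is injected today. kv_connector_cls is NOT an existing
    # parameter; it only illustrates the proposal.
    engine = AsyncLLM.from_engine_args(engine_args, kv_connector_cls=MyVendorConnector)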
Thanks @da-x @liuzijing2014, I see what you mean. Yes, I'm still not sure a wrapper implementation is necessarily the way to solve this... hopefully we can follow a similar plugin mechanism to the one used for other things, like @liuzijing2014 said. See https://docs.vllm.ai/en/latest/design/plugin_system.html
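As a rough sketch of that direction (the package, function, and connector names below are hypothetical, not from this PR), a vendor package could expose a registration function under the vllm.general_plugins entry-point group described in that doc, so it runs in every vLLM process, including the GPU workers:

    # my_vendor_pkg/__init__.py (hypothetical vendor package)
    #
    # In the package metadata this function would be declared as an entry
    # point in the "vllm.general_plugins" group, e.g. in pyproject.toml:
    #   [project.entry-points."vllm.general_plugins"]
    #   my_vendor_connector = "my_vendor_pkg:register"

    def register():
        # vLLM invokes general plugins in each process it starts, so the
        # connector also gets registered inside the worker processes.
        from vllm.distributed.kv_transfer.kv_connector.factory import (
            KVConnectorFactory)
        KVConnectorFactory.register_connector(
            "MyVendorConnector",
            "my_vendor_pkg.connector",
            "MyVendorConnector")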
Force-pushed 84204d5 to b3fb374: "Delegates all abstract methods to a dynamically loaded external implementation" (Signed-off-by: David Ben-David <davidb@pliops.com>)
@njhill @da-x @liuzijing2014 Thanks everyone for the helpful and thoughtful feedback — much appreciated! Indeed, the need for a delegating connector like GenericKVConnector felt a bit awkward in retrospect. Your comments helped me realize that requiring any explicit registration step is actually redundant. A cleaner approach is to dynamically load the external KV connector implementation directly from config, without needing to register or wrap it at all. This avoids the need to upstream boilerplate delegating modules for each vendor implementation, and keeps the vLLM codebase cleaner and more extensible. I've opened a new PR that takes this improved direction: #18142. Looking forward to your thoughts on the updated approach!
This pull request has merge conflicts that must be resolved before it can be merged.
Closing in favor of #18142
vLLM recently introduced the KVConnectorBase_V1 API (#15960), which provides a standardized interface for integrating external KV cache offload, sharing, and transfer solutions, such as those used in prefill/decode (P/D) disaggregation. A concrete example of this is the LMCache connector (#16625).
To avoid duplicating boilerplate code across multiple vendor-specific connectors that primarily delegate to external packages, this PR adds a GenericKVConnector. This connector delegates all KVConnectorBase_V1 abstract methods to an external implementation provided at runtime. This approach enables vendors to encapsulate their custom logic in their own packages without requiring changes to the vLLM codebase.
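As a rough sketch of the delegation idea (not the actual code from this PR; the config key names and the forwarding style are illustrative), the connector essentially loads the class named in the config and forwards calls to it:

    import importlib

    class GenericKVConnectorSketch:
        """Illustrative only: load an external connector implementation named
        in the KV transfer config and forward all connector calls to it."""

        def __init__(self, vllm_config, role):
            extra = vllm_config.kv_transfer_config.kv_connector_extra_config
            module = importlib.import_module(extra["module_path"])
            impl_cls = getattr(module, extra["class_name"])
            self._impl = impl_cls(vllm_config, role)

        def __getattr__(self, name):
            # Forward every KVConnectorBase_V1 method (e.g. start_load_kv,
            # wait_for_save, build_connector_meta) to the external impl.
            return getattr(self._impl, name)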
For example, the current LMCache integration can switch from using a dedicated connector class to GenericKVConnector as follows:
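A sketch of what that configuration could look like (the extra-config keys and the LMCache module/class names are assumptions for illustration, not necessarily what this PR or LMCache actually use):

    from vllm.config import KVTransferConfig

    # Illustrative only: point GenericKVConnector at an external implementation.
    kv_transfer_config = KVTransferConfig(
        kv_connector="GenericKVConnector",
        kv_role="kv_both",
        kv_connector_extra_config={
            # Assumed keys and LMCache paths, shown for illustration.
            "module_path": "lmcache.integration.vllm.vllm_v1_adapter",
            "class_name": "LMCacheConnectorV1Impl",
        },
    )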
This design simplifies integration while keeping the core vLLM codebase clean and extensible.