[KVConnector] Migrate the LMCache integration code to be vLLM native #25542
Signed-off-by: ApostaC <yihua98@uchicago.edu>
Code Review
This pull request is a significant step in improving the maintainability of the LMCache integration by vendoring its code into the vLLM repository. The changes are well-structured, primarily involving the addition of the LMCache integration code under a new directory and updating the connector to use this native implementation. I've identified a critical bug and another high-severity issue that should be addressed.
vllm/distributed/kv_transfer/kv_connector/v1/lmcache_integration/vllm_v1_adapter.py
vllm/distributed/kv_transfer/kv_connector/v1/lmcache_integration/utils.py
LGTM!!!
@codex review
vllm/distributed/kv_transfer/kv_connector/v1/lmcache_connector.py
vllm/distributed/kv_transfer/kv_connector/v1/lmcache_integration/vllm_v1_adapter.py
💡 Codex Review
Here are some automated review suggestions for this pull request.
vllm/distributed/kv_transfer/kv_connector/v1/lmcache_integration/vllm_v1_adapter.py
from lmcache.v1.offload_server.zmq_server import ZMQOffloadServer
from lmcache.v1.plugin.plugin_launcher import PluginLauncher

# Third Party
fix the import order
Fixed. Just to double check, the correct import order should be something like:
- python native imports (os, sys, time, typing...)
- third party libs (torch, zmq)
- first party imports with absolute paths (from vllm.xxx import)
- type checking related imports
Pls correct me if I'm wrong.
@simon-mo Sorry, I was travelling for the past 2 weeks.
Hey @NickLucche, would you like to take a quick look at this PR? Seems like Simon is pretty busy these days.
Add a test?
…llm-project#25542) Signed-off-by: ApostaC <yihua98@uchicago.edu>
…llm-project#25542) Signed-off-by: ApostaC <yihua98@uchicago.edu> Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>
Purpose
This PR moves the LMCache integration code (the code that does from vllm import ***) into vLLM.
The goal of this PR (and the following series) is to make sure that LMCache integrates correctly with vLLM and that the integration code is maintainable by the whole vLLM community.
Test Plan
There will be a few upcoming PRs covering the testing and documentation: