merge with main #2252
Signed-off-by: Neal Vaidya <nealv@nvidia.com>
…utedRuntime, Namespace, Components, and Endpoint (#2008) Co-authored-by: Keiven Chang <keivenchang@users.noreply.github.com> Co-authored-by: Ryan Olson <rolson@nvidia.com>
Co-authored-by: Graham King <grahamk@nvidia.com>
Signed-off-by: Pavithra Vijayakrishnan <160681768+pvijayakrish@users.noreply.github.com>
…cheduler (#2071) Signed-off-by: Yan Ru Pei <yanrpei@gmail.com>
Caution: Review failed. Failed to post review comments.
Walkthrough
This update introduces a comprehensive distributed key-value block manager (KVBM) integration for vLLM, spanning major Rust and Python codebases, Docker build infrastructure, and extensive documentation. It adds new block locality abstractions, distributed leader-worker protocols, controller interfaces, advanced block layouts, and a full Python binding and test suite for vLLM KV cache management, replacing legacy block manager APIs.
Changes
Sequence Diagram(s)
sequenceDiagram
participant User
participant PythonApp
participant KvbmCacheManager
participant RustBlockManager
participant DistributedLeader
participant DistributedWorker
User->>PythonApp: Issues KV cache operation (e.g., allocate, free)
PythonApp->>KvbmCacheManager: Calls cache manager API
KvbmCacheManager->>RustBlockManager: FFI call for block management
RustBlockManager->>DistributedLeader: (if leader) Initiate distributed protocol
RustBlockManager->>DistributedWorker: (if worker) Participate in protocol
DistributedLeader-->>DistributedWorker: Synchronize via ZMQ/etcd
DistributedWorker-->>RustBlockManager: Handle block transfer/ack
RustBlockManager-->>KvbmCacheManager: Return result
KvbmCacheManager-->>PythonApp: Return result
PythonApp-->>User: Operation complete
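The call flow in the diagram can be approximated with a small Python sketch. This is an illustration only, not code from this PR: `Role`, `BlockManagerStub`, and `CacheManagerStub` are hypothetical names, the FFI boundary is replaced by a plain method call, and the ZMQ/etcd synchronization is reduced to log entries.

```python
from dataclasses import dataclass, field
from enum import Enum


class Role(Enum):
    LEADER = "leader"
    WORKER = "worker"


@dataclass
class BlockManagerStub:
    """Hypothetical stand-in for the Rust block manager behind the FFI call."""
    role: Role
    log: list = field(default_factory=list)

    def handle(self, op: str) -> str:
        # The diagram branches on the process role before any transfer happens.
        if self.role is Role.LEADER:
            self.log.append(f"leader: initiate distributed protocol for {op}")
            self.log.append("leader -> worker: synchronize")
        else:
            self.log.append(f"worker: participate in protocol for {op}")
        # In both cases the worker side reports back to the block manager.
        self.log.append("worker -> block manager: block transfer/ack")
        return f"{op}: ok"


@dataclass
class CacheManagerStub:
    """Hypothetical stand-in for the Python-side cache manager API."""
    backend: BlockManagerStub

    def request(self, op: str) -> str:
        # Forward the operation and hand the result back to the caller.
        return self.backend.handle(op)


manager = CacheManagerStub(BlockManagerStub(Role.LEADER))
result = manager.request("allocate")
for entry in manager.backend.log:
    print(entry)
print(result)
```

The sketch only mirrors the order of messages in the diagram; the real components exchange block data and acknowledgements rather than strings.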
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120+ minutes
Overview:
Details:
Where should the reviewer start?
Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Tests
Refactor
Chores
End-users now have access to distributed KV cache management, advanced block offloading, and vLLM integration with improved reliability, scalability, and observability.