feat: KVBM <-> vLLM Integration #1735
Closed
82 commits
37c653e
enable kvbm
ryanolson 5d4f677
Merge remote-tracking branch 'origin/main' into ryan/vllm
ryanolson 391750e
initial protocols
ryanolson 0518da0
add initial impl for KVBM cache manager allocate_slots
ziqifan617 47e6a65
merge stash
ryanolson 5eba436
handoff
ryanolson fbb68e6
wheel is building
ryanolson 0da33f1
Run clippy + fmt, Split out utils
jthomson04 3fc858e
More fmt, remove nonexistent modules, add request id field kvrequest
jthomson04 9629575
checkpoint
ryanolson 490dbab
add test_kvbm_vllm_cache_manager.py and made it work
ziqifan617 b2c8a45
rename test
ziqifan617 84ac3bb
back on track
ryanolson 5903a61
comments
ryanolson d64ef7c
checkpoint
ryanolson 0019d5f
updating api; adding tests
ryanolson 01a7e5f
hand off
ryanolson 467a9e6
list of list for get_block_ids
ryanolson aafb1de
phase 3 tests
ryanolson 8d2fda4
phase 4 testing
ryanolson 6061cec
adding doc for test plan
ryanolson 092e0c3
add has_slot method; refactor python to create a slot in allocate_slo…
ryanolson 785210c
[KVBM] add more unit tests to test kvbm py bindings (#1442)
ziqifan617 3a7e6db
adding worker side distributed object
ryanolson 66e87d0
add more kvbm unit tests (#1448)
ziqifan617 3809d31
move kv store to runtime
ryanolson 06ec03b
add missing kvbm APIs
ziqifan617 6ca2905
updates
ryanolson 12d4799
stage 1 - presumptive gtg
ryanolson 730cb10
adding missing file
ryanolson c066826
re-enable pub path() on endpoint
ryanolson b2c8e21
fix for eviction
ryanolson e41b6ec
propagate total and available block counts to pool frontends; clear i…
ryanolson 9c786a6
Merge remote-tracking branch 'origin/main' into ryan/vllm-mega
jthomson04 2202ede
free block if we cannot allocate all the requested blocks
ryanolson ca12f1b
Merge branch 'ryan/vllm-mega' of github.com:ai-dynamo/dynamo into rya…
ryanolson 92c56f4
[KVBM] add kv cache stats report to show prefix cache hit (#1505)
ziqifan617 8b99d5b
handle edge case for dropping block in allocate_slots
ryanolson 0e129aa
comment out print to avoid perf impact (#1531)
ziqifan617 d67b6ae
adding locality
ryanolson ca647dc
feat: Support new block layouts in KVBM (#1462)
jthomson04 7acc818
[KVBM] proofread rustdoc (#1586)
ziqifan617 58e623c
feat: Initial ZMQ Leader-worker integration for KVBM (#1538)
jthomson04 c2eeb20
Merge branch 'ryan/vllm-mega' of github.com:ai-dynamo/dynamo into rya…
ryanolson fe8bcd6
sqush ryan/block_v3 into commit before leader-worker pr
ryanolson 9ae754b
adding locality
ryanolson 56850b8
feat: Initial ZMQ Leader-worker integration for KVBM (#1538)
jthomson04 2030a98
checkpoint
ryanolson a39f2b7
merge conflicts from downstream ryan/vllm-mega
ryanolson 54cfd2f
merge vllm-mega in - 3 sets of tests are failing, offload, gguf and r…
ryanolson 83a231b
merge main
ryanolson c32e984
updates or leader-worker to provider worker serialize objects
ryanolson 10d8c86
fix: Fix offload tests (#1624)
jthomson04 3a436ba
adding comments for factory-level refactor
ryanolson b14027d
mark total_block_counter as expected dead_code until we enable it; we…
ryanolson b156961
adding comments and cleaning up dangling warnings
ryanolson 801183c
[KVBM] update py bindings for locality related changes
ziqifan617 e6a167c
fix
ziqifan617 e42c124
final commit from #1628
jthomson04 1bba92c
final change
ryanolson f5b3ac0
add headers
ryanolson d07159a
fix: need to expose BlockManager py binding to be used in vllm (#1643)
ziqifan617 df45bd6
delete dist folder
ziqifan617 edf6ba9
feat: Dynamic ZMQ port selection + unit test for leader-worker block …
jthomson04 df4e43d
pre-merge upstream
ryanolson ea0a992
finished upstream merge
ryanolson 7da2628
apply locality up the stack; abstractions not quite right, but gettin…
ryanolson dac17cd
test creates logical block pool with null resources
ryanolson 4e21e1a
feat: Leader-worker integration with KVBM `write_to` (#1672)
jthomson04 43033d6
Fix build
jthomson04 800aeec
Initial vLLM leader-worker stuff
jthomson04 faa74d6
feats: add get_offloaded_computed_blocks to get reusable g2 g3 blocks…
ziqifan617 75decfb
feat: Migrate to GDS MT backend (#1734)
jthomson04 ccff88c
Merge remote-tracking branch 'origin/main' into jthomson04/kvbm-vllm-…
jthomson04 4a82b0d
Remove a ton of unused stuff
jthomson04 4327aa2
Copyright, disable kvbm default feature
jthomson04 3552619
Nuke a ton of unused code
jthomson04 ca3859a
test: KVBM vLLM python tests (#1463) (#1736)
oandreeva-nv b8b2f74
Fix failing CI + automatically delete disk storage on process exit
jthomson04 34f50d2
fix: avoid onboarding last block if a full match (#1740)
ziqifan617 afa199f
Fix mypy
jthomson04 428b635
fix: not raising error when the slot is not found with the request id…
richardhuo-nv
💡 Verification agent
🧩 Analysis chain
Verify the NIXL commit hash exists and is compatible.
The commit hash update ensures compatibility with the new block manager features. Please verify this commit exists in the nixl repository and contains the required functionality.
Verify NIXL_COMMIT includes required feature code
The commit fa800bcfe3814b08df9cda9c30443de8c19665e5 exists, but it only updates CI/Jenkins configuration—no block manager or KVBM integration code is present.
• container/build.sh (line 117): NIXL_COMMIT set to fa800…
• Files changed in that commit:
Please update the hash to the NIXL commit that actually contains the block manager/KVBM feature code, or include those code changes in this PR.
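As a rough illustration of the first step of such a check, the sketch below pulls the pinned value out of a build script's text and confirms it is a full 40-character hex hash. The `NIXL_COMMIT=<hash>` line shape, the regular expression, and the helper names `extract_pinned_commit` and `is_full_sha1` are assumptions made for this example, not taken from `container/build.sh`. It only validates the form of the pin; whether the commit exists upstream, and what it changes, still has to be checked against the nixl repository itself.

```python
import re

# Assumed shape of the pin inside a build script (hypothetical; the real
# line in container/build.sh may be written differently).
PIN_PATTERN = re.compile(
    r'^\s*NIXL_COMMIT=["\']?([0-9a-fA-F]+)["\']?\s*$',
    re.MULTILINE,
)


def extract_pinned_commit(script_text):
    """Return the pinned hash in lowercase, or None if no pin line is found."""
    match = PIN_PATTERN.search(script_text)
    return match.group(1).lower() if match else None


def is_full_sha1(value):
    """True only for a complete 40-character lowercase hex hash."""
    return re.fullmatch(r"[0-9a-f]{40}", value) is not None


# Sample input using the hash quoted in the review comment.
sample = "NIXL_COMMIT=fa800bcfe3814b08df9cda9c30443de8c19665e5\n"
pinned = extract_pinned_commit(sample)
print(pinned, is_full_sha1(pinned))
```

Pinning the full hash rather than an abbreviated one avoids ambiguity if the short prefix later collides with another commit.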