Closed
82 commits
37c653e
enable kvbm
ryanolson Jun 3, 2025
5d4f677
Merge remote-tracking branch 'origin/main' into ryan/vllm
ryanolson Jun 3, 2025
391750e
initial protocols
ryanolson Jun 3, 2025
0518da0
add initial impl for KVBM cache manager allocate_slots
ziqifan617 Jun 3, 2025
47e6a65
merge stash
ryanolson Jun 4, 2025
5eba436
handoff
ryanolson Jun 4, 2025
fbb68e6
wheel is building
ryanolson Jun 4, 2025
0da33f1
Run clippy + fmt, Split out utils
jthomson04 Jun 4, 2025
3fc858e
More fmt, remove nonexistent modules, add request id field kvrequest
jthomson04 Jun 4, 2025
9629575
checkpoint
ryanolson Jun 4, 2025
490dbab
add test_kvbm_vllm_cache_manager.py and made it work
ziqifan617 Jun 4, 2025
b2c8a45
rename test
ziqifan617 Jun 4, 2025
84ac3bb
back on track
ryanolson Jun 5, 2025
5903a61
comments
ryanolson Jun 5, 2025
d64ef7c
checkpoint
ryanolson Jun 7, 2025
0019d5f
updating api; adding tests
ryanolson Jun 9, 2025
01a7e5f
hand off
ryanolson Jun 9, 2025
467a9e6
list of list for get_block_ids
ryanolson Jun 9, 2025
aafb1de
phase 3 tests
ryanolson Jun 9, 2025
8d2fda4
phase 4 testing
ryanolson Jun 9, 2025
6061cec
adding doc for test plan
ryanolson Jun 9, 2025
092e0c3
add has_slot method; refactor python to create a slot in allocate_slo…
ryanolson Jun 9, 2025
785210c
[KVBM] add more unit tests to test kvbm py bindings (#1442)
ziqifan617 Jun 9, 2025
3a7e6db
adding worker side distributed object
ryanolson Jun 10, 2025
66e87d0
add more kvbm unit tests (#1448)
ziqifan617 Jun 10, 2025
3809d31
move kv store to runtime
ryanolson Jun 10, 2025
06ec03b
add missing kvbm APIs
ziqifan617 Jun 11, 2025
6ca2905
updates
ryanolson Jun 11, 2025
12d4799
stage 1 - presumptive gtg
ryanolson Jun 11, 2025
730cb10
adding missing file
ryanolson Jun 11, 2025
c066826
re-enable pub path() on endpoint
ryanolson Jun 11, 2025
b2c8e21
fix for eviction
ryanolson Jun 12, 2025
e41b6ec
propagate total and available block counts to pool frontends; clear i…
ryanolson Jun 12, 2025
9c786a6
Merge remote-tracking branch 'origin/main' into ryan/vllm-mega
jthomson04 Jun 12, 2025
2202ede
free block if we cannot allocate all the requested blocks
ryanolson Jun 12, 2025
ca12f1b
Merge branch 'ryan/vllm-mega' of github.com:ai-dynamo/dynamo into rya…
ryanolson Jun 12, 2025
92c56f4
[KVBM] add kv cache stats report to show prefix cache hit (#1505)
ziqifan617 Jun 12, 2025
8b99d5b
handle edge case for dropping block in allocate_slots
ryanolson Jun 13, 2025
0e129aa
comment out print to avoid perf impact (#1531)
ziqifan617 Jun 14, 2025
d67b6ae
adding locality
ryanolson Jun 16, 2025
ca647dc
feat: Support new block layouts in KVBM (#1462)
jthomson04 Jun 16, 2025
7acc818
[KVBM] proofread rustdoc (#1586)
ziqifan617 Jun 18, 2025
58e623c
feat: Initial ZMQ Leader-worker integration for KVBM (#1538)
jthomson04 Jun 18, 2025
c2eeb20
Merge branch 'ryan/vllm-mega' of github.com:ai-dynamo/dynamo into rya…
ryanolson Jun 18, 2025
fe8bcd6
sqush ryan/block_v3 into commit before leader-worker pr
ryanolson Jun 18, 2025
9ae754b
adding locality
ryanolson Jun 16, 2025
56850b8
feat: Initial ZMQ Leader-worker integration for KVBM (#1538)
jthomson04 Jun 18, 2025
2030a98
checkpoint
ryanolson Jun 21, 2025
a39f2b7
merge conflicts from downstream ryan/vllm-mega
ryanolson Jun 24, 2025
54cfd2f
merge vllm-mega in - 3 sets of tests are failing, offload, gguf and r…
ryanolson Jun 24, 2025
83a231b
merge main
ryanolson Jun 24, 2025
c32e984
updates or leader-worker to provider worker serialize objects
ryanolson Jun 24, 2025
10d8c86
fix: Fix offload tests (#1624)
jthomson04 Jun 24, 2025
3a436ba
adding comments for factory-level refactor
ryanolson Jun 25, 2025
b14027d
mark total_block_counter as expected dead_code until we enable it; we…
ryanolson Jun 25, 2025
b156961
adding comments and cleaning up dangling warnings
ryanolson Jun 25, 2025
801183c
[KVBM] update py bindings for locality related changes
ziqifan617 Jun 24, 2025
e6a167c
fix
ziqifan617 Jun 24, 2025
e42c124
final commit from #1628
jthomson04 Jun 25, 2025
1bba92c
final change
ryanolson Jun 25, 2025
f5b3ac0
add headers
ryanolson Jun 25, 2025
d07159a
fix: need to expose BlockManager py binding to be used in vllm (#1643)
ziqifan617 Jun 25, 2025
df45bd6
delete dist folder
ziqifan617 Jun 25, 2025
edf6ba9
feat: Dynamic ZMQ port selection + unit test for leader-worker block …
jthomson04 Jun 26, 2025
df4e43d
pre-merge upstream
ryanolson Jun 26, 2025
ea0a992
finished upstream merge
ryanolson Jun 26, 2025
7da2628
apply locality up the stack; abstractions not quite right, but gettin…
ryanolson Jun 26, 2025
dac17cd
test creates logical block pool with null resources
ryanolson Jun 26, 2025
4e21e1a
feat: Leader-worker integration with KVBM `write_to` (#1672)
jthomson04 Jun 27, 2025
43033d6
Fix build
jthomson04 Jun 30, 2025
800aeec
Initial vLLM leader-worker stuff
jthomson04 Jun 30, 2025
faa74d6
feats: add get_offloaded_computed_blocks to get reusable g2 g3 blocks…
ziqifan617 Jul 2, 2025
75decfb
feat: Migrate to GDS MT backend (#1734)
jthomson04 Jul 2, 2025
ccff88c
Merge remote-tracking branch 'origin/main' into jthomson04/kvbm-vllm-…
jthomson04 Jul 2, 2025
4a82b0d
Remove a ton of unused stuff
jthomson04 Jul 2, 2025
4327aa2
Copyright, disable kvbm default feature
jthomson04 Jul 2, 2025
3552619
Nuke a ton of unused code
jthomson04 Jul 2, 2025
ca3859a
test: KVBM vLLM python tests (#1463) (#1736)
oandreeva-nv Jul 2, 2025
b8b2f74
Fix failing CI + automatically delete disk storage on process exit
jthomson04 Jul 2, 2025
34f50d2
fix: avoid onboarding last block if a full match (#1740)
ziqifan617 Jul 2, 2025
afa199f
Fix mypy
jthomson04 Jul 3, 2025
428b635
fix: not raising error when the slot is not found with the request id…
richardhuo-nv Jul 4, 2025
21 changes: 18 additions & 3 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion container/build.sh
@@ -114,7 +114,7 @@ SGLANG_BASE_IMAGE_TAG="25.01-cuda12.8-devel-ubuntu24.04"
VLLM_V1_BASE_IMAGE="nvcr.io/nvidia/cuda-dl-base"
VLLM_V1_BASE_IMAGE_TAG="25.01-cuda12.8-devel-ubuntu24.04"

-NIXL_COMMIT=16348080f5bdeb9fe6058a23be140cec020ef3f3
+NIXL_COMMIT=fa800bcfe3814b08df9cda9c30443de8c19665e5

💡 Verification agent

🧩 Analysis chain

Verify the NIXL commit hash exists and is compatible.

The commit hash update ensures compatibility with the new block manager features. Please verify this commit exists in the nixl repository and contains the required functionality.


🏁 Script executed:

#!/bin/bash
# Description: Verify the NIXL commit exists in the repository
# Expected: The commit should exist and be accessible

NIXL_COMMIT="fa800bcfe3814b08df9cda9c30443de8c19665e5"

# Check if we can access the commit via GitHub API
curl -s "https://api.github.com/repos/ai-dynamo/nixl/commits/${NIXL_COMMIT}" | jq '.sha, .commit.message' || echo "Failed to fetch commit info"

Length of output: 1639


🏁 Script executed:

#!/bin/bash
# Description: List files modified in the specified NIXL commit to verify block manager changes
NIXL_COMMIT="fa800bcfe3814b08df9cda9c30443de8c19665e5"
curl -s -H "Accept: application/vnd.github.v3+json" \
  "https://api.github.com/repos/ai-dynamo/nixl/commits/${NIXL_COMMIT}" | jq -r '.files[]?.filename'

Length of output: 367


Verify NIXL_COMMIT includes required feature code

The commit fa800bcfe3814b08df9cda9c30443de8c19665e5 exists, but it only updates CI/Jenkins configuration—no block manager or KVBM integration code is present.

• container/build.sh (line 117): NIXL_COMMIT set to fa800…
• Files changed in that commit:

  • .ci/assets/nixlbench-version-info.json.template
  • .ci/jenkins/lib/nixlbench-container-build-matrix.yaml
  • .ci/jenkins/pipeline/proj-jjb.yaml

Please update the hash to the NIXL commit that actually contains the block manager/KVBM feature code, or include those code changes in this PR.

🤖 Prompt for AI Agents
In container/build.sh at line 117, the NIXL_COMMIT is set to a commit that only
updates CI/Jenkins configuration and does not include the block manager or KVBM
integration code. Update the NIXL_COMMIT variable to point to the commit hash
that contains the actual block manager/KVBM feature code, ensuring the build
uses the correct code changes.
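
The finding above rests on one observation: every path in the commit's changed-file list sits under `.ci/`. A minimal sketch of that check follows. The helper name `touches_non_ci_files` is hypothetical, and treating the `.ci/` prefix as "CI configuration only" is an assumption drawn from the three paths listed in the review, not a rule taken from the nixl repository.

```shell
#!/bin/bash
# Hypothetical helper (not part of the PR): reads changed file paths on stdin
# and succeeds only if at least one path lies outside the .ci/ directory.
# Assumption: anything under .ci/ is CI/Jenkins configuration.
touches_non_ci_files() {
  # -v selects lines that do NOT match; -q exits 0 as soon as one is found.
  grep -qv '^\.ci/'
}

# The three paths reported for commit fa800bcfe3814b08df9cda9c30443de8c19665e5.
printf '%s\n' \
  ".ci/assets/nixlbench-version-info.json.template" \
  ".ci/jenkins/lib/nixlbench-container-build-matrix.yaml" \
  ".ci/jenkins/pipeline/proj-jjb.yaml" |
  if touches_non_ci_files; then
    echo "commit changes non-CI files"
  else
    echo "commit only changes CI configuration"
  fi
```

For these three paths the sketch prints `commit only changes CI configuration`. The same function could be fed from the `jq -r '.files[]?.filename'` pipeline used in the second script above. A passing result would still not prove that block manager code is present, only that the commit touches something other than CI files.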

NIXL_REPO=ai-dynamo/nixl.git

NIXL_UCX_EFA_REF=7ec95b95e524a87e81cac92f5ca8523e3966b16b
4 changes: 4 additions & 0 deletions dynamo.code-workspace
@@ -5,6 +5,10 @@
}
],
"settings": {
+"python.analysis.extraPaths": [
+"dynamo/lib/bindings/python/src",
+"vllm/vllm",
+],
"rust-analyzer.linkedProjects": [
"Cargo.toml",
"launch/dynamo-run/Cargo.toml",
55 changes: 54 additions & 1 deletion lib/bindings/python/Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 3 additions & 0 deletions lib/bindings/python/Cargo.toml
@@ -45,6 +45,7 @@ anyhow = { version = "1" }
async-openai = { version = "0.29.0" }
async-stream = { version = "0.3" }
async-trait = { version = "0.1" }
+derive-getters = "0.5"
futures = { version = "0.3" }
once_cell = { version = "1.20.3" }
serde = { version = "1" }
@@ -77,3 +78,5 @@ pythonize = "0.23"

dlpark = { version = "0.5", features = ["pyo3", "half"], optional = true }

+[dev-dependencies]
+rstest = "0.25"
4 changes: 3 additions & 1 deletion lib/bindings/python/rust/llm.rs
@@ -39,9 +39,11 @@
use super::*;

pub mod backend;
-pub mod block_manager;
pub mod disagg_router;
pub mod kv;
pub mod model_card;
pub mod nats;
pub mod preprocessor;
+
+#[cfg(feature = "block-manager")]
+pub mod block_manager;