Closed
143 commits
37c653e
enable kvbm
ryanolson Jun 3, 2025
5d4f677
Merge remote-tracking branch 'origin/main' into ryan/vllm
ryanolson Jun 3, 2025
391750e
initial protocols
ryanolson Jun 3, 2025
0518da0
add initial impl for KVBM cache manager allocate_slots
ziqifan617 Jun 3, 2025
47e6a65
merge stash
ryanolson Jun 4, 2025
5eba436
handoff
ryanolson Jun 4, 2025
fbb68e6
wheel is building
ryanolson Jun 4, 2025
0da33f1
Run clippy + fmt, Split out utils
jthomson04 Jun 4, 2025
3fc858e
More fmt, remove nonexistent modules, add request id field kvrequest
jthomson04 Jun 4, 2025
9629575
checkpoint
ryanolson Jun 4, 2025
490dbab
add test_kvbm_vllm_cache_manager.py and made it work
ziqifan617 Jun 4, 2025
b2c8a45
rename test
ziqifan617 Jun 4, 2025
84ac3bb
back on track
ryanolson Jun 5, 2025
5903a61
comments
ryanolson Jun 5, 2025
d64ef7c
checkpoint
ryanolson Jun 7, 2025
0019d5f
updating api; adding tests
ryanolson Jun 9, 2025
01a7e5f
hand off
ryanolson Jun 9, 2025
467a9e6
list of list for get_block_ids
ryanolson Jun 9, 2025
aafb1de
phase 3 tests
ryanolson Jun 9, 2025
8d2fda4
phase 4 testing
ryanolson Jun 9, 2025
6061cec
adding doc for test plan
ryanolson Jun 9, 2025
092e0c3
add has_slot method; refactor python to create a slot in allocate_slo…
ryanolson Jun 9, 2025
785210c
[KVBM] add more unit tests to test kvbm py bindings (#1442)
ziqifan617 Jun 9, 2025
3a7e6db
adding worker side distributed object
ryanolson Jun 10, 2025
66e87d0
add more kvbm unit tests (#1448)
ziqifan617 Jun 10, 2025
3809d31
move kv store to runtime
ryanolson Jun 10, 2025
06ec03b
add missing kvbm APIs
ziqifan617 Jun 11, 2025
6ca2905
updates
ryanolson Jun 11, 2025
12d4799
stage 1 - presumptive gtg
ryanolson Jun 11, 2025
730cb10
adding missing file
ryanolson Jun 11, 2025
c066826
re-enable pub path() on endpoint
ryanolson Jun 11, 2025
b2c8e21
fix for eviction
ryanolson Jun 12, 2025
e41b6ec
propagate total and available block counts to pool frontends; clear i…
ryanolson Jun 12, 2025
9c786a6
Merge remote-tracking branch 'origin/main' into ryan/vllm-mega
jthomson04 Jun 12, 2025
2202ede
free block if we cannot allocate all the requested blocks
ryanolson Jun 12, 2025
ca12f1b
Merge branch 'ryan/vllm-mega' of github.com:ai-dynamo/dynamo into rya…
ryanolson Jun 12, 2025
92c56f4
[KVBM] add kv cache stats report to show prefix cache hit (#1505)
ziqifan617 Jun 12, 2025
8b99d5b
handle edge case for dropping block in allocate_slots
ryanolson Jun 13, 2025
0e129aa
comment out print to avoid perf impact (#1531)
ziqifan617 Jun 14, 2025
d67b6ae
adding locality
ryanolson Jun 16, 2025
ca647dc
feat: Support new block layouts in KVBM (#1462)
jthomson04 Jun 16, 2025
7acc818
[KVBM] proofread rustdoc (#1586)
ziqifan617 Jun 18, 2025
58e623c
feat: Initial ZMQ Leader-worker integration for KVBM (#1538)
jthomson04 Jun 18, 2025
c2eeb20
Merge branch 'ryan/vllm-mega' of github.com:ai-dynamo/dynamo into rya…
ryanolson Jun 18, 2025
fe8bcd6
sqush ryan/block_v3 into commit before leader-worker pr
ryanolson Jun 18, 2025
9ae754b
adding locality
ryanolson Jun 16, 2025
56850b8
feat: Initial ZMQ Leader-worker integration for KVBM (#1538)
jthomson04 Jun 18, 2025
2030a98
checkpoint
ryanolson Jun 21, 2025
a39f2b7
merge conflicts from downstream ryan/vllm-mega
ryanolson Jun 24, 2025
54cfd2f
merge vllm-mega in - 3 sets of tests are failing, offload, gguf and r…
ryanolson Jun 24, 2025
83a231b
merge main
ryanolson Jun 24, 2025
c32e984
updates or leader-worker to provider worker serialize objects
ryanolson Jun 24, 2025
10d8c86
fix: Fix offload tests (#1624)
jthomson04 Jun 24, 2025
3a436ba
adding comments for factory-level refactor
ryanolson Jun 25, 2025
b14027d
mark total_block_counter as expected dead_code until we enable it; we…
ryanolson Jun 25, 2025
b156961
adding comments and cleaning up dangling warnings
ryanolson Jun 25, 2025
801183c
[KVBM] update py bindings for locality related changes
ziqifan617 Jun 24, 2025
e6a167c
fix
ziqifan617 Jun 24, 2025
e42c124
final commit from #1628
jthomson04 Jun 25, 2025
1bba92c
final change
ryanolson Jun 25, 2025
f5b3ac0
add headers
ryanolson Jun 25, 2025
d07159a
fix: need to expose BlockManager py binding to be used in vllm (#1643)
ziqifan617 Jun 25, 2025
df45bd6
delete dist folder
ziqifan617 Jun 25, 2025
edf6ba9
feat: Dynamic ZMQ port selection + unit test for leader-worker block …
jthomson04 Jun 26, 2025
df4e43d
pre-merge upstream
ryanolson Jun 26, 2025
ea0a992
finished upstream merge
ryanolson Jun 26, 2025
7da2628
apply locality up the stack; abstractions not quite right, but gettin…
ryanolson Jun 26, 2025
dac17cd
test creates logical block pool with null resources
ryanolson Jun 26, 2025
4e21e1a
feat: Leader-worker integration with KVBM `write_to` (#1672)
jthomson04 Jun 27, 2025
43033d6
Fix build
jthomson04 Jun 30, 2025
800aeec
Initial vLLM leader-worker stuff
jthomson04 Jun 30, 2025
faa74d6
feats: add get_offloaded_computed_blocks to get reusable g2 g3 blocks…
ziqifan617 Jul 2, 2025
75decfb
feat: Migrate to GDS MT backend (#1734)
jthomson04 Jul 2, 2025
ccff88c
Merge remote-tracking branch 'origin/main' into jthomson04/kvbm-vllm-…
jthomson04 Jul 2, 2025
4a82b0d
Remove a ton of unused stuff
jthomson04 Jul 2, 2025
4327aa2
Copyright, disable kvbm default feature
jthomson04 Jul 2, 2025
3552619
Nuke a ton of unused code
jthomson04 Jul 2, 2025
ca3859a
test: KVBM vLLM python tests (#1463) (#1736)
oandreeva-nv Jul 2, 2025
b8b2f74
Fix failing CI + automatically delete disk storage on process exit
jthomson04 Jul 2, 2025
34f50d2
fix: avoid onboarding last block if a full match (#1740)
ziqifan617 Jul 2, 2025
afa199f
Fix mypy
jthomson04 Jul 3, 2025
428b635
fix: not raising error when the slot is not found with the request id…
richardhuo-nv Jul 4, 2025
23c7c80
More logging, fix onboard strategy, fix bugs, add python tests for ge…
jthomson04 Jul 3, 2025
39d41f8
Fix num_new_tokens AssertionError
jthomson04 Jul 3, 2025
2cbf8f5
adding debug for torch tensors
ryanolson Jul 4, 2025
d254e7b
Fix issue with chunked prefill + decode tokens
jthomson04 Jul 4, 2025
f690b24
Fix tests to account for not matching last block
jthomson04 Jul 4, 2025
85f4bce
Fix usage bug which caused underflows
jthomson04 Jul 5, 2025
4cef826
Fix perf issues, update prefix hit rate to account for hits in connec…
jthomson04 Jul 7, 2025
83c0c19
Little refactor
jthomson04 Jul 8, 2025
eeefffc
fix(block-manager): Allows the immutable block to hold a duplicate as…
ryanolson Jul 8, 2025
08ddd30
Fix todos, fix run script for --use-nixl-gds without --mount-workspace
jthomson04 Jul 8, 2025
81ef848
feat: add a new dockerfile to build kvbm container | add KVBM guide (…
ziqifan617 Jul 8, 2025
9f87e10
num blocks override for host and disk
jthomson04 Jul 8, 2025
d08c6a5
Override disk cache dir
jthomson04 Jul 9, 2025
9a3e9db
add more logging
ryanolson Jul 9, 2025
6785c5f
Fix onboard issue for sequence lengths divisible by block size
jthomson04 Jul 9, 2025
eb3057c
Fix tests
jthomson04 Jul 9, 2025
de3fc94
Update run_kvbm_in_vllm.md
ziqifan617 Jul 10, 2025
81f56f1
Update run_kvbm_in_vllm.md
ziqifan617 Jul 10, 2025
904c387
wired up reset to vllm http; more tracing (#1847)
ryanolson Jul 10, 2025
8e4e13a
fix: Fix intermittent disk accuracy issues (#1896)
jthomson04 Jul 12, 2025
5217589
fix: KVBM graceful exit on initialization failure (#1902)
jthomson04 Jul 12, 2025
4dacb08
Little refactor
jthomson04 Jul 14, 2025
b74205d
feat: Fix duplicate KVBM onboard blocks (#1950)
jthomson04 Jul 16, 2025
bd1b8b3
fix: add an env var to setup the leader-worker heartbeat timeout (#1…
richardhuo-nv Jul 16, 2025
e7a5ca6
fix: handling slot not found in get_block_ids (#1977)
oandreeva-nv Jul 16, 2025
eb34041
Update run_kvbm_in_vllm.md
ziqifan617 Jul 17, 2025
2b7e42b
feat: Metrics for KVBM transfer bw (#1989)
jthomson04 Jul 18, 2025
b7b15b2
formatting + precommit
jthomson04 Jul 18, 2025
d13249e
feat: Restructure KVBM init to broadcast num device blocks to leader …
jthomson04 Jul 19, 2025
9d75cc1
feat: KVBM Block Pool Trait (#2019)
jthomson04 Jul 21, 2025
fe51f4d
Merge remote-tracking branch 'origin/main' into jthomson04/kvbm-vllm-…
jthomson04 Jul 22, 2025
58abe36
Fixes for new vLLM
jthomson04 Jul 18, 2025
3fb033a
feat: kvbm controller + client (#2059)
ryanolson Jul 22, 2025
4eec03b
feat: log info stats per request
ryanolson Jul 23, 2025
0cd2289
dropping osl from logging as we do not have that information correct …
ryanolson Jul 23, 2025
8c5bd2a
initial stub out
ryanolson Jul 23, 2025
c78c4c9
more stubbing out
ryanolson Jul 23, 2025
5353972
wired up
ryanolson Jul 25, 2025
c4c2a67
no-op connector api
ryanolson Jul 25, 2025
295fdf9
bringing in worker bits
ryanolson Jul 25, 2025
1bc9205
adding trigger to kvbm worker
ryanolson Jul 25, 2025
d35aaf2
Moving all logic to dynamo
oandreeva-nv Jul 25, 2025
5c01eb4
Leader / Worker integration from John's branch + small fixes (#2126)
oandreeva-nv Jul 25, 2025
c8031c5
disabling device pool on BlockManager (#2129)
oandreeva-nv Jul 26, 2025
2d073bd
building abstacting connector logic behind a trait
ryanolson Jul 26, 2025
fc63b9c
improved debugging
ryanolson Jul 26, 2025
95448ec
leader and worker mostly wired up; next: link worker engine to transf…
ryanolson Jul 27, 2025
45802de
move connector traits to llm::block_manager
ryanolson Jul 27, 2025
1b4c799
adding connector::protocols for coordinating leader <-> xfer engine <…
ryanolson Jul 27, 2025
b2191d8
updating connector: leader <-> xfer <-> scheduler <-> worker
ryanolson Jul 27, 2025
c5e959d
monday start
ryanolson Jul 28, 2025
66d1fd5
expose block transfer handler on kvbm worker
ryanolson Jul 28, 2025
0fa15bb
establing more of the runtime purely in rust
ryanolson Jul 28, 2025
382ebf5
leader initialized load - aka immediate ready to test
ryanolson Jul 29, 2025
f7c3a1d
onboarding functional - some cleanup is needed
ryanolson Jul 29, 2025
f3a82e5
7/30 - am sync point (#2193)
ryanolson Jul 30, 2025
ca3cc61
adding tests for scheduler <--> transfer task coordination (#2197)
ryanolson Jul 31, 2025
e64314a
merge ziqi + ryan
ryanolson Jul 31, 2025
a16a8b8
olga - trigger offlload on worker (#2211)
oandreeva-nv Jul 31, 2025
2a24c79
working branch for 7/31 (#2219)
ryanolson Aug 1, 2025
1a78b06
connector/250801 (#2244)
ryanolson Aug 2, 2025
4 changes: 0 additions & 4 deletions .devcontainer/post-create.sh
@@ -52,10 +52,6 @@ export CARGO_TARGET_DIR=$HOME/dynamo/.build/target
cargo build --locked --profile dev --features mistralrs
cargo doc --no-deps

# create symlinks for the binaries in the deploy directory
mkdir -p $HOME/dynamo/deploy/sdk/src/dynamo/sdk/cli/bin
ln -sf $HOME/dynamo/.build/target/debug/dynamo-run $HOME/dynamo/deploy/sdk/src/dynamo/sdk/cli/bin/dynamo-run

# install the python bindings
cd $HOME/dynamo/lib/bindings/python && retry maturin develop

13 changes: 10 additions & 3 deletions .github/workflows/trigger_ci.yml
@@ -47,11 +47,9 @@ jobs:
filters: |
vllm:
- 'container/Dockerfile.vllm'
- 'container/Dockerfile.vllm_nixl'
- 'examples/python/llm/**'
- 'examples/python_rs/llm/**'
- 'container/deps/requirements.vllm.txt'
- 'container/deps/vllm/**'
- 'components/backends/vllm/**'
- 'tests/serve/test_vllm.py'
trtllm:
- 'container/Dockerfile.tensorrt_llm'
@@ -62,6 +60,11 @@ jobs:
- 'tests/serve/test_trtllm.py'
sdk:
- 'deploy/**'
sglang:
- 'container/Dockerfile.sglang'
- 'container/Dockerfile.sglang-deepep'
- 'components/backends/sglang/**'
- 'container/build.sh'
- name: Check if Validation Workflow has run
id: check_workflow
uses: actions/github-script@v6
@@ -104,6 +107,10 @@ jobs:
ci_variables["RUN_TENSORRTLLM"]="true"
fi

if [ "${{ steps.src_changes.outputs.sglang }}" == "true" ]; then
ci_variables["RUN_SGLANG"]="true"
fi

if [ "${{ steps.src_changes.outputs.sdk }}" == "true" ]; then
ci_variables["RUN_SDK_CI"]="true"
fi
4 changes: 1 addition & 3 deletions .gitignore
@@ -89,6 +89,4 @@ generated-values.yaml
TensorRT-LLM

# Local build artifacts for devcontainer
.build/
# Copied binaries to ignore
deploy/sdk/src/dynamo/sdk/cli/bin
.build/
37,470 changes: 9,663 additions & 27,807 deletions ATTRIBUTIONS-Go.md

Large diffs are not rendered by default.
