Commit 4b3a210

Implementation of simple load balance routing proxy server (#1953) (#2124)
### What this PR does / why we need it?

This PR is a cherry-pick of #1953 from v0.9.1.

It introduces a new load-balancing proxy server example for disaggregated PD. Compared with the original round-robin toy_proxy, it supports a simple token- and kv_cache-aware load-balancing routing strategy for the disaggregated PD system.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Tested on a real workload and with unit tests.

- vLLM version: v0.10.0
- vLLM main: vllm-project/vllm@ad57f23

---------

Signed-off-by: ganyi <pleaplusone.gy@gmail.com>
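
To illustrate the difference between round-robin routing and a load-aware strategy like the one described above, here is a minimal sketch. It is not the code from this commit: the `Backend` class, its `active_tokens` and `active_kv_cache` counters, and the `pick_prefiller`, `pick_decoder`, and `release` helpers are all hypothetical names, and it assumes "pd" refers to separate prefill and decode instances whose load can be approximated by the prompt tokens of in-flight requests.

```python
import itertools
from dataclasses import dataclass


@dataclass
class Backend:
    """Hypothetical view of one prefill or decode instance."""
    name: str
    active_tokens: int = 0    # assumed metric: prompt tokens currently being prefilled
    active_kv_cache: int = 0  # assumed metric: KV cache tokens held by in-flight requests


def round_robin(backends):
    """Baseline: cycle through backends regardless of their load."""
    return itertools.cycle(backends)


def pick_prefiller(backends, prompt_tokens):
    """Choose the backend with the fewest in-flight tokens, then charge it."""
    chosen = min(backends, key=lambda b: b.active_tokens)
    chosen.active_tokens += prompt_tokens
    return chosen


def pick_decoder(backends, prompt_tokens):
    """Choose the backend holding the least KV cache, then charge it."""
    chosen = min(backends, key=lambda b: b.active_kv_cache)
    chosen.active_kv_cache += prompt_tokens
    return chosen


def release(backend, prompt_tokens, *, prefill):
    """Return the charged load once the request leaves the backend."""
    if prefill:
        backend.active_tokens -= prompt_tokens
    else:
        backend.active_kv_cache -= prompt_tokens


if __name__ == "__main__":
    prefillers = [Backend("p0"), Backend("p1")]
    for tokens in (100, 10, 10):
        print(tokens, "->", pick_prefiller(prefillers, tokens).name)
```

With one long prompt followed by two short ones, round robin would alternate between the two instances, while the load-aware picker sends both short prompts to the instance that is not busy with the long one.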
1 parent af04ee9 commit 4b3a210

2 files changed: +518 -275 lines changed
