Implementation of simple load balance routing proxy server (#1953) #2124

ganyi1996ppo · 2025-07-31T03:21:01Z

What this PR does / why we need it?

This PR is a cherry-pick of #1953 from v0.9.1.

This PR introduces a new load-balancing proxy server example for disaggregated pd. It supports a simple token- and kv_cache-aware load-balancing routing strategy for the disaggregated pd system, in contrast to the original toy_proxy, which uses round robin.
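To make the difference from round robin concrete, here is a minimal sketch of load-aware selection. It is not the code from this PR: the `Backend` class, the `route` and `release` helpers, and the scoring formula (in-flight tokens plus weighted KV-cache usage) are all hypothetical and only illustrate the general idea of picking the least-loaded instance.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """Hypothetical bookkeeping for one prefill or decode instance."""
    name: str
    active_tokens: int = 0    # tokens of requests currently in flight
    active_kv_cache: int = 0  # tokens whose KV cache is still held


def pick_least_loaded(backends, kv_weight=1.0):
    """Return the backend with the lowest combined load score.

    The score is an assumption made for illustration; it is not the
    formula used by the proxy in this PR.
    """
    return min(
        backends,
        key=lambda b: b.active_tokens + kv_weight * b.active_kv_cache,
    )


def route(backends, prompt_tokens):
    """Choose a backend for a request and record the added load."""
    chosen = pick_least_loaded(backends)
    chosen.active_tokens += prompt_tokens
    chosen.active_kv_cache += prompt_tokens
    return chosen


def release(backend, prompt_tokens):
    """Undo the bookkeeping once the request has finished."""
    backend.active_tokens -= prompt_tokens
    backend.active_kv_cache -= prompt_tokens
```

With two idle backends, a 100-token request followed by two 10-token requests would put both short requests on the second backend, whereas round robin would send the third request back to the busy one.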

Does this PR introduce any user-facing change?

No

How was this patch tested?

Tested on a real workload and with unit tests.

vLLM version: v0.10.0
vLLM main: vllm-project/vllm@ad57f23

…ect#1953) This PR introduce a new proxy server implementation for disaggregated pd, which support simple token&kv_cache aware load balance routing strategy for the disaggregated pd system compared with origin toy_proxy with round robin. No tested on real workload --------- Signed-off-by: ganyi <pleaplusone.gy@gmail.com>

github-actions · 2025-07-31T03:36:04Z

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

A PR should do only one thing; smaller PRs enable faster reviews.
Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by other future PRs.
Write the commit message by filling in the PR description to help reviewers and future developers understand the change.

If CI fails, you can run linting and testing checks locally according to Contributing and Testing.

wangxiyuan · 2025-07-31T07:22:36Z

Does this example rely on the DP patch?

ganyi1996ppo · 2025-07-31T07:25:21Z

Does this example rely on the DP patch?

No, this is a totally independent proxy.

wangxiyuan · 2025-07-31T07:28:17Z

OK, makes sense to me.

wangxiyuan · 2025-07-31T07:29:28Z

examples/disaggregated_prefill_v1/load_balance_proxy_server_example.py

+# Adapted from https://github.com/vllm-project/vllm/tests/v1/kv_connector/nixl_integration/toy_proxy_server.py
+
+# SPDX-License-Identifier: Apache-2.0
+


It would be good to add a usage guide here so users know how to run this example quickly.

Good advice, I'll add an example in the comments.

jianzs · 2025-07-31T07:35:24Z

Do we need both proxy server implementations in the example folder? We could keep either toy_proxy_server or the one from this PR?

ganyi1996ppo · 2025-07-31T12:51:42Z

Do we need both proxy server implementations in the example folder? We could keep either toy_proxy_server or the one from this PR?

Sounds fair, I'll remove the old one

Signed-off-by: ganyi <pleaplusone.gy@gmail.com>

…to eplb_into_main * 'main' of https://github.com/vllm-project/vllm-ascend: Implementation of simple load balance routing proxy server (vllm-project#1953) (vllm-project#2124)


ganyi1996ppo requested review from Yikun, jianzs and wangxiyuan July 31, 2025 03:21

wangxiyuan approved these changes Jul 31, 2025

wangxiyuan reviewed Jul 31, 2025

add comments for load balance proxy example and remove toy_proxy

c325c74

Signed-off-by: ganyi <pleaplusone.gy@gmail.com>

wangxiyuan approved these changes Aug 2, 2025

wangxiyuan merged commit 4b3a210 into vllm-project:main Aug 4, 2025
11 checks passed
