[Feature] use --eplb_config to set eplb param #20562

lengrongfu · 2025-07-07T10:08:46Z

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)". [Feature] Expert Parallelism Load Balancer (EPLB) #18343 (comment)
The test plan, such as providing test command.

$ python -m vllm.entrypoints.cli.main serve ..... --eplb-config '{"enable_eplb": true}'

The test results, such as pasting the results comparison before and after, or e2e results
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
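The `--eplb-config` value in the test plan is a JSON string, so it may help to check a candidate value before putting it on the command line. The following is a minimal sketch, assuming the value is parsed as standard JSON (as the bot summary in this thread describes); `parse_config_arg` is a hypothetical helper for illustration, not a vLLM function. JSON booleans are lowercase, so a Python-style `True` is rejected.

```python
import json


def parse_config_arg(raw: str) -> dict:
    """Parse a JSON object passed as a CLI value (hypothetical helper)."""
    try:
        value = json.loads(raw)
    except json.JSONDecodeError as exc:
        # json.JSONDecodeError is a ValueError subclass; re-raise with context.
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(value, dict):
        raise ValueError("expected a JSON object")
    return value


# Lowercase JSON boolean: accepted.
print(parse_config_arg('{"enable_eplb": true}'))

# Python-style boolean: rejected by the JSON parser.
try:
    parse_config_arg('{"enable_eplb": True}')
except ValueError as err:
    print("rejected:", err)
```

Wrapping the JSON in single quotes in the shell avoids having to backslash-escape the braces and double quotes.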

Purpose

Test Plan

Test Result

(Optional) Documentation Update

github-actions · 2025-07-07T10:09:03Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of CI tests to catch errors quickly. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

gemini-code-assist

Summary of Changes

Hello @lengrongfu, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refactors how Expert Parallelism Load Balancing (EPLB) parameters are managed and passed within the vLLM system. The core change involves introducing a dedicated EPConfig dataclass to encapsulate all EPLB-related settings, moving away from disparate individual parameters. This not only improves the internal organization of the configuration but also streamlines the command-line interface by consolidating multiple arguments into a single, JSON-configurable --ep-config option. The necessary code adaptations have been made across the project to utilize this new, more structured approach.

Highlights

Configuration Refactoring: All Expert Parallelism Load Balancing (EPLB) related parameters have been consolidated into a new EPConfig dataclass. This centralizes the configuration for features like enable_eplb, lb_window_size, lb_step_interval, and lb_log_balancedness.
Simplified CLI Interface: The multiple individual command-line arguments for EPLB parameters (e.g., --enable-eplb, --eplb-window-size) have been replaced with a single --ep-config argument. This new argument accepts a JSON string, allowing for more flexible and structured configuration directly from the CLI.
Codebase Adaptation: Various parts of the vLLM codebase, including ParallelConfig, EngineArgs, eplb_state.py, deepseek_v2.py, and gpu_model_runner.py, have been updated to reflect the new nested ep_config structure, ensuring seamless integration and continued functionality.
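The consolidation described in the highlights above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's actual code: the class name `EPConfig`, the `ep_config` attribute, and the four field names come from the bot summary, while the default values, the `from_json` helper, the `build_parser` function, and this stripped-down `ParallelConfig` are hypothetical stand-ins.

```python
import argparse
import json
from dataclasses import dataclass, field, fields


@dataclass
class EPConfig:
    """Illustrative container for EPLB settings; defaults are placeholders."""
    enable_eplb: bool = False
    lb_window_size: int = 1000
    lb_step_interval: int = 3000
    lb_log_balancedness: bool = False

    @classmethod
    def from_json(cls, raw: str) -> "EPConfig":
        """Build a config from a JSON object string, rejecting unknown keys."""
        data = json.loads(raw)
        if not isinstance(data, dict):
            raise ValueError("expected a JSON object")
        known = {f.name for f in fields(cls)}
        unknown = set(data) - known
        if unknown:
            raise ValueError(f"unknown keys: {sorted(unknown)}")
        return cls(**data)


@dataclass
class ParallelConfig:
    """Hypothetical, stripped-down parent config holding the nested settings."""
    ep_config: EPConfig = field(default_factory=EPConfig)


def build_parser() -> argparse.ArgumentParser:
    """One JSON-valued flag replaces several individual flags."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--ep-config", type=EPConfig.from_json,
                        default=EPConfig())
    return parser


args = build_parser().parse_args(
    ["--ep-config", '{"enable_eplb": true, "lb_window_size": 500}'])
config = ParallelConfig(ep_config=args.ep_config)
print(config.ep_config)
```

Consumers then read nested attributes such as `config.ep_config.lb_window_size` instead of separate top-level parameters. Rejecting unknown keys is a design choice of this sketch so that typos in the JSON fail loudly.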

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request is a great step towards better code organization by refactoring the EPLB parameters into a dedicated EPConfig dataclass. The implementation is clean and consistent across the codebase. I've identified one critical issue with the programmatic initialization of ep_config that would cause a runtime failure. The fix is straightforward and included in my comment. Once that's addressed, this PR will be in excellent shape.

vllm/engine/arg_utils.py

simon-mo · 2025-07-07T17:15:18Z

@abmfy please review when you get the chance, thanks!

mergify · 2025-07-18T16:25:14Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @lengrongfu.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

hmellor

I've not left a full review but I have a few questions/suggestions.

The main question is why not add more fields to ParallelConfig?

vllm/config.py

lengrongfu · 2025-07-21T08:53:09Z

> I've not left a full review but I have a few questions/suggestions.
>
> The main question is why not add more fields to ParallelConfig?

This is a good suggestion, and it would be better to implement it in ParallelConfig.

lengrongfu · 2025-07-21T09:21:14Z

@hmellor please take a look; I've made the modifications based on your suggestions.

vllm/engine/arg_utils.py

vllm/config/parallel.py

Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Signed-off-by: rongfu.leng <lenronfu@gmail.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Signed-off-by: Duncan Moss <djm.moss@gmail.com>

### What this PR does / why we need it?

1. use actions/checkout@v5 instead of v4
2. remove dbo test case because there is an issue with it; it will be refactored later
3. make vllm-ascend compatible with vllm v0.10.1.1 and add CI for it
4. fix sampler api changes introduced by vllm-project/vllm#22387
6. fix qwen3 moe config changes introduced by vllm-project/vllm#20562
7. fix kvcache block changes introduced by vllm-project/vllm#23262

### Does this PR introduce _any_ user-facing change?

N/A

### How was this patch tested?

CI passed with existing test.

- vLLM version: v0.10.0
- vLLM main: vllm-project/vllm@0c6e40b

---------

Signed-off-by: MengqingCao <cmq0113@163.com>

Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Signed-off-by: rongfu.leng <lenronfu@gmail.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Signed-off-by: rongfu.leng <lenronfu@gmail.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Signed-off-by: Xiao Yu <xiao.yu@amd.com>

Signed-off-by: Yiwen Chen <yiwen66@berkeley.edu>

abmfy · 2025-09-04T01:00:39Z

Hi @lengrongfu, thank you again for your contribution! Just wanted to check in on the update for this doc:

> Thank you for your contribution! As a final step, could you please update the corresponding documentation at expert_parallel_deployment.md? This will help users migrate smoothly to the new CLI interface.
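For readers migrating from individual flags to a single JSON-valued flag, one way to avoid hand-escaping is to generate the argument programmatically. This is a minimal sketch under assumptions: the flag spelling `--eplb-config` follows the test plan in this thread, the key names follow the bot summary and may not match the merged interface, and `eplb_config_args` is a hypothetical helper.

```python
import json
import shlex


def eplb_config_args(settings: dict) -> list:
    """Return argv fragments for one JSON-valued flag (hypothetical helper)."""
    # json.dumps emits lowercase true/false, which a JSON parser expects.
    return ["--eplb-config", json.dumps(settings, separators=(",", ":"))]


# Key names are assumptions taken from the summary in this thread.
settings = {"enable_eplb": True, "lb_window_size": 1000}
argv = eplb_config_args(settings)

# shlex.join quotes the JSON so it can be pasted into a POSIX shell.
print(shlex.join(argv))
```

Passing `argv` as a list to `subprocess.run` sidesteps shell quoting entirely; the `shlex.join` form is only needed when the command is typed or pasted into a shell.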

lengrongfu · 2025-09-04T02:40:48Z

> Hi @lengrongfu, thank you again for your contribution! Just wanted to check in on the update for this doc:
>
> > Thank you for your contribution! As a final step, could you please update the corresponding documentation at expert_parallel_deployment.md? This will help users migrate smoothly to the new CLI interface.

I will submit a PR today. Thanks for the reminder!

gemini-code-assist bot reviewed Jul 7, 2025

mergify bot added the deepseek and v1 labels Jul 7, 2025

gemini-code-assist bot reviewed Jul 7, 2025

vllm/engine/arg_utils.py (outdated)

lengrongfu force-pushed the feat/add-eplb-config branch from a4b836b to ee1bca0 Compare July 7, 2025 16:06

lengrongfu marked this pull request as ready for review July 7, 2025 16:06

lengrongfu requested review from WoosukKwon, alexm-redhat, comaniac, hmellor, houseroad, mgoin, njhill, robertgshaw2-redhat, simon-mo, tlrmchlsmth, youkaichao and ywang96 as code owners July 7, 2025 16:06

lengrongfu force-pushed the feat/add-eplb-config branch 2 times, most recently from e059426 to c775cd2 Compare July 8, 2025 07:47

lengrongfu changed the title from "[Misc] use --ep_config to set eplb param" to "[Feature] use --ep_config to set eplb param" Jul 18, 2025

mergify bot added the needs-rebase label Jul 18, 2025

hmellor reviewed Jul 18, 2025

vllm/config.py (outdated)

vllm/config.py (outdated)

lengrongfu force-pushed the feat/add-eplb-config branch from c775cd2 to b525e3f Compare July 21, 2025 09:18

mergify bot removed the needs-rebase label Jul 21, 2025

github-project-automation bot moved this to Done in Structured Output Aug 20, 2025

hmellor reviewed Aug 20, 2025

vllm/engine/arg_utils.py

hmellor reviewed Aug 20, 2025

vllm/config/parallel.py

MengqingCao mentioned this pull request Aug 21, 2025

[CI] fix ci vllm-project/vllm-ascend#2464

Merged

666even666 added a commit to 666even666/vllm that referenced this pull request Sep 2, 2025

update to be compatible with vllm-project#20562

666c0b3

Signed-off-by: Yiwen Chen <yiwen66@berkeley.edu>

lengrongfu mentioned this pull request Sep 4, 2025

[Docs]add eplb_config param use docs #24213

Merged

5 tasks

lengrongfu deleted the feat/add-eplb-config branch October 21, 2025 02:55
