
Conversation

@lengrongfu (Contributor) commented Jul 7, 2025

Essential Elements of an Effective PR Description Checklist

  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.

Purpose

Consolidate the individual EPLB command-line flags into a single JSON-valued argument:

$ vllm -m vllm.entrypoints.cli.main serve ..... --eplb-config '{"enable_eplb": True}'

Test Plan

Test Result

(Optional) Documentation Update
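For readers unfamiliar with the JSON-valued flag shown under Purpose above, here is a minimal, self-contained sketch of how such an argument can be parsed. It uses plain argparse rather than vLLM's actual CLI plumbing, and note that strict JSON expects lowercase true rather than Python's True:

```python
# Minimal sketch of parsing a JSON-valued --eplb-config flag.
# This uses plain argparse and is NOT vLLM's actual CLI code.
import argparse
import json

parser = argparse.ArgumentParser()
# json.loads converts the raw string into a dict at parse time.
parser.add_argument("--eplb-config", type=json.loads, default={})

args = parser.parse_args(["--eplb-config", '{"enable_eplb": true}'])
print(args.eplb_config)  # {'enable_eplb': True}
```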

@github-actions bot commented Jul 7, 2025

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of tests to catch errors quickly. You can run additional CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run full CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

@gemini-code-assist bot (Contributor) left a comment

Summary of Changes

Hello @lengrongfu, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refactors how Expert Parallelism Load Balancing (EPLB) parameters are managed and passed within the vLLM system. The core change involves introducing a dedicated EPConfig dataclass to encapsulate all EPLB-related settings, moving away from disparate individual parameters. This not only improves the internal organization of the configuration but also streamlines the command-line interface by consolidating multiple arguments into a single, JSON-configurable --ep-config option. The necessary code adaptations have been made across the project to utilize this new, more structured approach.

Highlights

  • Configuration Refactoring: All Expert Parallelism Load Balancing (EPLB) related parameters have been consolidated into a new EPConfig dataclass. This centralizes the configuration for features like enable_eplb, lb_window_size, lb_step_interval, and lb_log_balancedness.
  • Simplified CLI Interface: The multiple individual command-line arguments for EPLB parameters (e.g., --enable-eplb, --eplb-window-size) have been replaced with a single --ep-config argument. This new argument accepts a JSON string, allowing for more flexible and structured configuration directly from the CLI.
  • Codebase Adaptation: Various parts of the vLLM codebase, including ParallelConfig, EngineArgs, eplb_state.py, deepseek_v2.py, and gpu_model_runner.py, have been updated to reflect the new nested ep_config structure, ensuring seamless integration and continued functionality.
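To make the Highlights above concrete, here is a rough sketch of what such a consolidated dataclass could look like. The field names come from the Highlights, but the defaults and the from_json helper are assumptions for illustration, not vLLM's actual implementation:

```python
# Hypothetical sketch of an EPLB config dataclass; field names follow
# the Highlights above, defaults and from_json are assumptions.
import json
from dataclasses import dataclass

@dataclass
class EPConfig:
    enable_eplb: bool = False
    lb_window_size: int = 1000       # assumed default: metric window for load stats
    lb_step_interval: int = 3000     # assumed default: steps between rebalances
    lb_log_balancedness: bool = False

    @classmethod
    def from_json(cls, raw: str) -> "EPConfig":
        # Unknown keys raise TypeError here, so CLI typos fail loudly.
        return cls(**json.loads(raw))

cfg = EPConfig.from_json('{"enable_eplb": true}')
```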
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e. a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any comments that are incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@mergify bot added the deepseek (Related to DeepSeek models) and v1 labels Jul 7, 2025
@gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request is a great step towards better code organization by refactoring the EPLB parameters into a dedicated EPConfig dataclass. The implementation is clean and consistent across the codebase. I've identified one critical issue with the programmatic initialization of ep_config that would cause a runtime failure. The fix is straightforward and included in my comment. Once that's addressed, this PR will be in excellent shape.

@simon-mo (Collaborator) commented Jul 7, 2025

@abmfy please review when you get the chance, thanks!

@lengrongfu force-pushed the feat/add-eplb-config branch 2 times, most recently from e059426 to c775cd2 on July 8, 2025 at 07:47
@lengrongfu lengrongfu changed the title [Misc] use --ep_config to set eplb param [Feature] use --ep_config to set eplb param Jul 18, 2025
@mergify bot commented Jul 18, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @lengrongfu.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Jul 18, 2025
@hmellor (Member) left a comment

I've not left a full review but I have a few questions/suggestions.

The main question is why not add more fields to ParallelConfig?

@lengrongfu (Contributor, Author) replied:

> I've not left a full review but I have a few questions/suggestions.
>
> The main question is why not add more fields to ParallelConfig?

This is a good suggestion, and it would be better to implement it in ParallelConfig.
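As a rough illustration of the suggestion, the EPLB settings could be nested inside ParallelConfig along these lines. The class and field names here are hypothetical, chosen to mirror the discussion, and are not vLLM's actual API:

```python
# Rough sketch of nesting the EPLB settings inside ParallelConfig.
# Class and field names here are illustrative, not vLLM's actual API.
from dataclasses import dataclass, field

@dataclass
class EPLBConfig:
    enable_eplb: bool = False
    window_size: int = 1000
    step_interval: int = 3000
    log_balancedness: bool = False

@dataclass
class ParallelConfig:
    tensor_parallel_size: int = 1
    enable_expert_parallel: bool = False
    # EPLB options live under the parallelism config instead of being
    # free-standing engine arguments.
    eplb_config: EPLBConfig = field(default_factory=EPLBConfig)
```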

@lengrongfu force-pushed the feat/add-eplb-config branch from c775cd2 to b525e3f on July 21, 2025 at 09:18
@mergify mergify bot removed the needs-rebase label Jul 21, 2025
@lengrongfu (Contributor, Author) commented:

@hmellor please take a look; I've made the changes based on your suggestions.

djmmoss pushed a commit to djmmoss/vllm that referenced this pull request Aug 21, 2025
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: rongfu.leng <lenronfu@gmail.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Duncan Moss <djm.moss@gmail.com>
wangxiyuan pushed a commit to vllm-project/vllm-ascend that referenced this pull request Aug 21, 2025
### What this PR does / why we need it?
1. use actions/checkout@v5 instead of v4
2. remove the dbo test case because there is an issue with it and it will be
refactored later
3. make vllm-ascend compatible with vllm v0.10.1.1 and add CI for it
4. fix sampler API changes introduced by
vllm-project/vllm#22387
5. fix qwen3 moe config changes introduced by
vllm-project/vllm#20562
6. fix kvcache block changes introduced by
vllm-project/vllm#23262

### Does this PR introduce _any_ user-facing change?
N/A

### How was this patch tested?
CI passed with existing test.


- vLLM version: v0.10.0
- vLLM main:
vllm-project/vllm@0c6e40b

---------

Signed-off-by: MengqingCao <cmq0113@163.com>
epwalsh pushed a commit to epwalsh/vllm that referenced this pull request Aug 28, 2025
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: rongfu.leng <lenronfu@gmail.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
xiao-llm pushed a commit to xiao-llm/vllm that referenced this pull request Aug 28, 2025
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: rongfu.leng <lenronfu@gmail.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Xiao Yu <xiao.yu@amd.com>
zhewenl pushed a commit to zhewenl/vllm that referenced this pull request Aug 28, 2025
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: rongfu.leng <lenronfu@gmail.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
mengxingkongzhouhan pushed a commit to mengxingkongzhouhan/vllm that referenced this pull request Aug 30, 2025
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: rongfu.leng <lenronfu@gmail.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
666even666 added a commit to 666even666/vllm that referenced this pull request Sep 2, 2025
Signed-off-by: Yiwen Chen <yiwen66@berkeley.edu>
zhewenl pushed a commit to zhewenl/vllm that referenced this pull request Sep 3, 2025
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: rongfu.leng <lenronfu@gmail.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
@abmfy (Member) commented Sep 4, 2025

Hi @lengrongfu, thank you again for your contribution! Just wanted to check in on the update for this doc:

> Thank you for your contribution! As a final step, could you please update the corresponding documentation at expert_parallel_deployment.md? This will help users migrate smoothly to the new CLI interface.

@lengrongfu (Contributor, Author) replied:

> Hi @lengrongfu, thank you again for your contribution! Just wanted to check in on the update for this doc:
>
> > Thank you for your contribution! As a final step, could you please update the corresponding documentation at expert_parallel_deployment.md? This will help users migrate smoothly to the new CLI interface.

I will submit a PR today. Thanks for the tip!

wangxiaoteng888 pushed a commit to LCAIZJ/vllm-ascend that referenced this pull request Sep 25, 2025
Signed-off-by: MengqingCao <cmq0113@163.com>
FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: rongfu.leng <lenronfu@gmail.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
chopper0126 pushed a commit to chopper0126/vllm-ascend that referenced this pull request Sep 26, 2025
Signed-off-by: MengqingCao <cmq0113@163.com>
@lengrongfu lengrongfu deleted the feat/add-eplb-config branch October 21, 2025 02:55
Angazenn pushed a commit to Angazenn/vllm-ascend that referenced this pull request Oct 21, 2025
Signed-off-by: MengqingCao <cmq0113@163.com>

Labels

  • ci/build
  • deepseek (Related to DeepSeek models)
  • documentation (Improvements or additions to documentation)
  • frontend
  • llama (Related to Llama models)
  • multi-modality (Related to multi-modality, #4194)
  • performance (Performance-related issues)
  • qwen (Related to Qwen models)
  • ready (ONLY add when PR is ready to merge/full CI is needed)
  • speculative-decoding
  • structured-output
  • v1

Projects

Status: Done

8 participants