@JC-ut0
Contributor

@JC-ut0 JC-ut0 commented Aug 29, 2025

What this PR does / why we need it?

Fix the MTP torchair bug introduced by the torchair refactor and the MoE refactor.

Depends on PRs:
fused moe fix: #2627
torchair multi DP fix: #2626

Does this PR introduce any user-facing change?

When data parallelism (DP) is enabled, the MTP online server must be started with stats logging disabled, because the current metrics do not support multi-DP:
--disable-log-stats
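
As a rough illustration of the rule above, the following sketch assembles a `vllm serve` command line and adds `--disable-log-stats` only when more than one DP rank is requested. The helper name `build_serve_args` and the `<model>` placeholder are assumptions made for this example and are not part of the PR; the MTP and torchair options are left out because the PR description does not show them.

```python
from typing import List


def build_serve_args(model: str, data_parallel_size: int = 1) -> List[str]:
    """Assemble an argv for `vllm serve` (hypothetical helper, not from the PR)."""
    args = ["vllm", "serve", model]
    if data_parallel_size > 1:
        args += ["--data-parallel-size", str(data_parallel_size)]
        # Per the PR description, the current metrics do not support
        # multi-DP, so stats logging is switched off in that case.
        args.append("--disable-log-stats")
    return args


if __name__ == "__main__":
    # "<model>" is a placeholder; substitute the model you actually serve.
    print(" ".join(build_serve_args("<model>", data_parallel_size=2)))
```

With a single DP rank the flag is omitted, so existing single-DP launch commands stay unchanged.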

How was this patch tested?

@github-actions

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by other future PRs.
  • Write the commit message by filling in the PR description to help reviewers and future developers understand the change.

If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request appears to be a bugfix for Multi-Token Prediction (MTP) on the main branch. The changes involve selecting the correct MTP model implementation (TorchairDeepSeekMTP) when torchair graph is enabled, and updating the quantization logic for FusedMoE. I've found a critical issue in the FusedMoE quantization logic where a layer that should be skipped would be incorrectly quantized due to an unconditional assignment. I've provided a suggestion to fix this logical error.
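
The two points in this review can be pictured with a minimal sketch. Everything below is hypothetical except the class name `TorchairDeepSeekMTP`, which the review mentions: the placeholder class `DefaultDeepSeekMTP`, the function names, the method labels, and the layer prefixes are invented for illustration and are not taken from the vllm-ascend source. The first part shows a selection of the MTP model class based on whether torchair graph mode is enabled; the second shows how an unconditional assignment after a skip check overrides the skip decision, and one way to avoid it.

```python
from typing import Iterable, Type


class TorchairDeepSeekMTP:
    """Stand-in for the torchair-graph MTP model named in the review."""


class DefaultDeepSeekMTP:
    """Hypothetical placeholder for the non-torchair MTP model."""


def select_mtp_model_cls(torchair_graph_enabled: bool) -> Type:
    """Pick the MTP implementation that matches the execution mode."""
    if torchair_graph_enabled:
        return TorchairDeepSeekMTP
    return DefaultDeepSeekMTP


# Hypothetical labels standing in for real quantization method objects.
UNQUANTIZED = "unquantized"
QUANTIZED = "quantized"


def is_skipped(prefix: str, skipped_prefixes: Iterable[str]) -> bool:
    """Return True when the layer name falls under an excluded prefix."""
    return any(prefix.startswith(p) for p in skipped_prefixes)


def pick_method_buggy(prefix: str, skipped_prefixes: Iterable[str]) -> str:
    method = None
    if is_skipped(prefix, skipped_prefixes):
        method = UNQUANTIZED
    # Bug: this assignment runs for every layer and overwrites the
    # skip decision made just above.
    method = QUANTIZED
    return method


def pick_method_fixed(prefix: str, skipped_prefixes: Iterable[str]) -> str:
    # Returning early keeps the skip decision from being overwritten.
    if is_skipped(prefix, skipped_prefixes):
        return UNQUANTIZED
    return QUANTIZED
```

In the buggy variant a skipped layer still ends up quantized, which is the kind of logical error the review describes; the fixed variant returns before the quantized path can be reached.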

@JC-ut0 JC-ut0 force-pushed the main branch 2 times, most recently from e8b9bd3 to 3f21225 Compare August 29, 2025 08:47
@MengqingCao
Collaborator

Please add more details to the PR message describing the specific issue this PR fixes.

@JC-ut0 JC-ut0 force-pushed the main branch 2 times, most recently from d82cb35 to 88a5ede Compare August 30, 2025 07:39
@github-actions

This pull request has conflicts, please resolve those before we can evaluate the pull request.

Collaborator

@MengqingCao MengqingCao left a comment

lgtm

@JC-ut0 JC-ut0 force-pushed the main branch 5 times, most recently from 9b85bbe to 3b66cdd Compare September 1, 2025 11:24
@codecov

codecov bot commented Sep 1, 2025

Codecov Report

❌ Patch coverage is 83.33333% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 73.57%. Comparing base (600b08f) to head (ece0890).
⚠️ Report is 16 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| vllm_ascend/worker/mtp_proposer_v1.py | 25.00% | 3 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2632      +/-   ##
==========================================
+ Coverage   72.61%   73.57%   +0.96%     
==========================================
  Files         147      151       +4     
  Lines       21805    21945     +140     
==========================================
+ Hits        15833    16147     +314     
+ Misses       5972     5798     -174     

| Flag | Coverage Δ |
|---|---|
| unittests | 73.57% <83.33%> (+0.96%) ⬆️ |


Signed-off-by: xuyexiong <xuyexiong@huawei.com>
Collaborator

@MengqingCao MengqingCao left a comment

Let's merge this first, as the failing CI cases were not introduced by this PR; they will be fixed in #2687.

@MengqingCao MengqingCao merged commit 214b32a into vllm-project:main Sep 2, 2025
32 of 36 checks passed
offline893 pushed a commit to offline893/vllm-ascend that referenced this pull request Sep 16, 2025
### What this PR does / why we need it?
Fix the MTP torchair bug introduced by the torchair refactor and the MoE refactor.

Depends on PRs:
fused moe fix: vllm-project#2627
torchair multi DP fix: vllm-project#2626

### Does this PR introduce _any_ user-facing change?
When data parallelism (DP) is enabled, the MTP online server must be started with stats logging disabled, because the current metrics do not support multi-DP:
`--disable-log-stats`
### How was this patch tested?

- vLLM version: v0.10.1.1
- vLLM main:
vllm-project/vllm@7c8271c

Signed-off-by: xuyexiong <xuyexiong@huawei.com>
Signed-off-by: offline0806 <z00858301@china.huawei.com>
wangxiaoteng888 pushed a commit to LCAIZJ/vllm-ascend that referenced this pull request Sep 25, 2025
chopper0126 pushed a commit to chopper0126/vllm-ascend that referenced this pull request Sep 26, 2025
Angazenn pushed a commit to Angazenn/vllm-ascend that referenced this pull request Oct 21, 2025