@XWFAlone
Contributor

@XWFAlone XWFAlone commented May 17, 2025

What this PR does / why we need it?

Add basic V1 MTP features.
Please merge this after #874 and #844.

Does this PR introduce any user-facing change?

Basic V1 MTP is now supported, limited to TP-only, eager mode, and k=1.
We will continue to expand to more scenarios.

How was this patch tested?

Tested locally.

@XWFAlone XWFAlone force-pushed the v1_mtp branch 3 times, most recently from 4a85243 to 7d3ca5a Compare May 17, 2025 10:33
@XWFAlone XWFAlone force-pushed the v1_mtp branch 2 times, most recently from 721b02d to edaf563 Compare May 23, 2025 12:34
@@ -0,0 +1,230 @@
import threading
Collaborator

Please add a patch description in init.

# Persistent batch.
# Remove this after we drop 0.8.5 support
- if vllm_version_is("0.8.5") or vllm_version_is("0.8.5.post1"):
+ if vllm_version_is("0.8.5") or ("0.8.5.post1"):
Contributor

This `if` statement will always be True, because the non-empty string "0.8.5.post1" is truthy on its own. Please change it back to the previous version.

Suggested change
- if vllm_version_is("0.8.5") or ("0.8.5.post1"):
+ if vllm_version_is("0.8.5") or vllm_version_is("0.8.5.post1"):
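
A minimal sketch of why the condition is always true. The `vllm_version_is` below is a hypothetical stand-in (the real helper lives in `vllm_ascend.utils`), and the installed version `"0.9.0"` is an assumed value chosen so that neither check should match:

```python
def vllm_version_is(target: str, installed: str = "0.9.0") -> bool:
    """Hypothetical stand-in: compare an assumed installed version to target."""
    return installed == target


# Buggy form: the right-hand operand is a bare, non-empty string, which is
# truthy, so the whole expression is truthy regardless of the version.
buggy = bool(vllm_version_is("0.8.5") or ("0.8.5.post1"))

# Fixed form: both operands are real version checks.
fixed = vllm_version_is("0.8.5") or vllm_version_is("0.8.5.post1")

print(buggy, fixed)  # True False
```

With the buggy form, the 0.8.5-only code path would run on every vLLM version.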

@XWFAlone XWFAlone force-pushed the v1_mtp branch 2 times, most recently from 27f8f0f to 5771f55 Compare May 27, 2025 02:05
import pytest
from vllm import LLM, SamplingParams

os.environ['VLLM_USE_MODELSCOPE'] = 'True'
Collaborator

If you add this environment variable, run this test in a separate process in CI so that it does not affect other cases in the same process that do not use ModelScope.
You can also clear the environment variable after the script has run. In short, make sure the environment variable only takes effect for this file.
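
One way to scope the variable, sketched with the standard library's `unittest.mock.patch.dict`, which restores `os.environ` when the block exits. The helper name `run_with_modelscope` is hypothetical, and this is not necessarily how the PR resolved the comment:

```python
import os
from unittest import mock


def run_with_modelscope(fn):
    """Run fn with VLLM_USE_MODELSCOPE set, then restore os.environ."""
    with mock.patch.dict(os.environ, {"VLLM_USE_MODELSCOPE": "True"}):
        return fn()


# Inside the helper the variable is visible; afterwards the environment is
# back to whatever it was before the call.
seen_inside = run_with_modelscope(
    lambda: os.environ.get("VLLM_USE_MODELSCOPE"))
print(seen_inside)  # True
```

In a pytest suite, the built-in `monkeypatch.setenv` fixture gives the same per-test scoping. Neither approach helps if the variable is read once at import time by a module that other tests share, which is why a separate CI process is the safer option.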

from vllm_ascend.attention.attention_v1 import AscendAttentionState
from vllm_ascend.ops.attention import vanilla_chunked_prefill_mla
- from vllm_ascend.utils import vllm_version_is
+ from vllm_ascend.utils import vllm_major_version_is, vllm_version_is
Collaborator

Why add this version-check function? Please explain.

# Convert from (L, N, P) to (N, P, L)
self.W_UK_T = W_UK.permute(1, 2, 0).contiguous()
self.W_UV.data = torch_npu.npu_format_cast(self.W_UV.data, 29)
self.W_UK_T.data = torch_npu.npu_format_cast(self.W_UK_T.data, 29)
Collaborator

Why make this change?

@mengwei805
Collaborator

Please squash all your commits into one commit.

@XWFAlone XWFAlone force-pushed the v1_mtp branch 3 times, most recently from 2a7968b to fb0db2b Compare May 28, 2025 01:54
@mengwei805 mengwei805 added long-term-test enable long term test for PR ready-for-test start test by label for PR labels May 28, 2025
@mengwei805 mengwei805 added the ready read for review label May 28, 2025
@mengwei805 mengwei805 removed the ready read for review label May 28, 2025
@wangxiyuan
Collaborator

You can rebase now. The CI error is fixed.

@wangxiyuan wangxiyuan added ready-for-test start test by label for PR and removed ready-for-test start test by label for PR labels May 29, 2025
# Persistent batch.
# Remove this after we drop 0.8.5 support
- if vllm_version_is("0.8.5") or vllm_version_is("0.8.5.post1"):
+ if vllm_version_is("0.8.5") or ("0.8.5.post1"):
Collaborator

This change does not work.

@mengwei805 mengwei805 added ready-for-test start test by label for PR and removed ready-for-test start test by label for PR labels May 29, 2025
@@ -0,0 +1,92 @@
from __future__ import annotations
Collaborator

Why add this import?

Contributor Author

It avoids circular reference problems with type annotations.
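
A small sketch of the mechanism, assuming the usual pairing with `typing.TYPE_CHECKING`. The module source is held in a string so the example runs in a single file; `hypothetical_runner`, `ModelRunner`, and `attach` are made-up names, not identifiers from the PR:

```python
import textwrap

# Source of a hypothetical module. `hypothetical_runner` does not exist, so
# the module only loads if the guarded import is never executed at runtime.
MODULE_SOURCE = textwrap.dedent('''
    from __future__ import annotations

    from typing import TYPE_CHECKING

    if TYPE_CHECKING:
        # Only static type checkers follow this import.
        from hypothetical_runner import ModelRunner

    def attach(runner: ModelRunner) -> None:
        """The annotation is stored as a string and never evaluated here."""
''')

namespace: dict = {}
exec(compile(MODULE_SOURCE, "hypothetical_proposer.py", "exec"), namespace)

# With the __future__ import, annotations are kept as plain strings.
attach_annotations = namespace["attach"].__annotations__
print(attach_annotations)
```

Without the `__future__` import, defining `attach` would evaluate `ModelRunner` immediately and raise `NameError`, which is what pushes people into a real (and potentially circular) runtime import.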

Co-authored-by: XWFAlone <xuewenfei2@huawei.com>
Co-authored-by: mengwei805 <mengwei25@huawei.com>
Co-authored-by: JC-ut0 <xuyexiong@huawei.com>
Signed-off-by: XWFAlone <xuewenfei2@huawei.com>
@mengwei805 mengwei805 added ready-for-test start test by label for PR and removed ready-for-test start test by label for PR labels May 29, 2025
@wangxiyuan wangxiyuan merged commit 3442fbd into vllm-project:main May 30, 2025
26 checks passed
momo609 pushed a commit to momo609/vllm-ascend that referenced this pull request Jun 3, 2025
### What this PR does / why we need it?
Add basic V1 MTP features.
Please merge this after
vllm-project#874 and
vllm-project#844.

### Does this PR introduce _any_ user-facing change?
Basic V1 MTP is now supported, limited to TP-only, eager mode, and
k=1.
We will continue to expand to more scenarios.

### How was this patch tested?
Tested locally.

Signed-off-by: XWFAlone <xuewenfei2@huawei.com>
Co-authored-by: mengwei805 <mengwei25@huawei.com>
Co-authored-by: JC-ut0 <xuyexiong@huawei.com>
Signed-off-by: wangxiaoxin (A) <w00664509@china.huawei.com>
David9857 pushed a commit to David9857/vllm-ascend that referenced this pull request Jun 3, 2025
chopper0126 pushed a commit to chopper0126/vllm-ascend that referenced this pull request Oct 16, 2025
Angazenn pushed a commit to Angazenn/vllm-ascend that referenced this pull request Oct 21, 2025
Labels

long-term-test enable long term test for PR module:ops module:tests ready read for review ready-for-test start test by label for PR
