@shen-shanshan
Collaborator

@shen-shanshan shen-shanshan commented Aug 4, 2025

What this PR does / why we need it?

Remove the redundant `envs` imports and use the `envs_ascend` alias instead:

import vllm.envs as envs_vllm
import vllm_ascend.envs as envs_ascend
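A minimal sketch of why the two aliases help, assuming stand-in objects in place of the real `vllm.envs` and `vllm_ascend.envs` modules so that it runs without either package installed. The flag values and both helper functions are hypothetical and only illustrate the naming convention; `VLLM_USE_V1` is used purely as an example of an upstream setting.

```python
import types

# Hypothetical stand-ins for the two real modules, so the sketch runs
# without vLLM or vLLM Ascend installed. The attribute values are made up.
envs_vllm = types.SimpleNamespace(VLLM_USE_V1=True)
envs_ascend = types.SimpleNamespace(VLLM_ASCEND_ENABLE_MATMUL_ALLREDUCE=False)


def matmul_allreduce_enabled() -> bool:
    # The alias makes it obvious that this flag belongs to vllm_ascend.
    return bool(envs_ascend.VLLM_ASCEND_ENABLE_MATMUL_ALLREDUCE)


def uses_v1_engine() -> bool:
    # Upstream vLLM settings are read through the other alias.
    return bool(envs_vllm.VLLM_USE_V1)
```

With a bare `envs` name, a reader cannot tell from a call site which of the two packages a flag comes from; the suffixed aliases remove that ambiguity.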

Does this PR introduce any user-facing change?

How was this patch tested?

@github-actions

github-actions bot commented Aug 4, 2025

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write the commit message by filling in the PR description, to help reviewers and future developers understand the change.

If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.

@ApsarasX
Collaborator

ApsarasX commented Aug 4, 2025

There are also many vllm_ascend envs in other code files. I suggest replacing all of them.

For example

  • envs.VLLM_ENABLE_FUSED_EXPERTS_ALLGATHER_EP in w8a8_dynamic.py
  • envs.VLLM_ASCEND_ENABLE_MATMUL_ALLREDUCE in patch_linear.py
    .....
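One way to find the remaining call sites is a small scan of the source tree. The sketch below is not part of the PR; it assumes the ambiguous form is a bare `envs` alias for `vllm_ascend.envs`, and the function name `find_bare_envs_imports` is hypothetical.

```python
import re
from pathlib import Path

# Matches "import vllm_ascend.envs as envs" or "from vllm_ascend import envs"
# (the bare, ambiguous alias), but not "... as envs_ascend".
BARE_ALIAS = re.compile(
    r"^\s*(?:import\s+vllm_ascend\.envs\s+as\s+envs"
    r"|from\s+vllm_ascend\s+import\s+envs)\s*(?:#.*)?$",
    re.MULTILINE,
)


def find_bare_envs_imports(root: str) -> list[str]:
    """Return sorted relative paths of .py files under root that
    import vllm_ascend.envs under the plain name `envs`."""
    root_path = Path(root)
    hits = []
    for path in root_path.rglob("*.py"):
        text = path.read_text(encoding="utf-8", errors="ignore")
        if BARE_ALIAS.search(text):
            hits.append(path.relative_to(root_path).as_posix())
    return sorted(hits)
```

Running it against the repository root would list the files that still need the alias changed; a regex scan like this can miss multi-line or conditional imports, so the result is a starting point rather than a complete audit.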

@shen-shanshan
Collaborator Author

@ApsarasX Done.

@codecov

codecov bot commented Aug 8, 2025

Codecov Report

❌ Patch coverage is 84.21053% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 75.74%. Comparing base (992271b) to head (3a96ee4).
⚠️ Report is 5 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ..._ascend/distributed/llmdatadist_c_mgr_connector.py | 50.00% | 2 Missing ⚠️ |
| vllm_ascend/utils.py | 75.00% | 2 Missing ⚠️ |
| ...d/patch/platform/patch_common/patch_distributed.py | 50.00% | 1 Missing ⚠️ |
| vllm_ascend/quantization/w8a8_dynamic.py | 50.00% | 1 Missing ⚠️ |
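The headline patch coverage can be cross-checked against the per-file rows with a little arithmetic. This is a sketch, not Codecov's own computation: the per-file line counts and the total of 38 changed lines are inferred from the reported percentages, not stated in the report.

```python
# Per-file figures copied from the Codecov table: (patch coverage, missing lines).
files = [(0.50, 2), (0.75, 2), (0.50, 1), (0.50, 1)]

# Total missing lines across the listed files.
missing = sum(m for _, m in files)

# Changed lines in each listed file, inferred as missing / (1 - coverage).
listed_lines = sum(round(m / (1 - cov)) for cov, m in files)

# Total changed lines implied by the headline 84.21053% patch coverage.
total_lines = round(missing / (1 - 0.8421053))

# Recomputed patch coverage as a fraction of the inferred total.
patch_coverage = (total_lines - missing) / total_lines
```

Under these assumptions the listed files account for 16 of the 38 changed lines, and the remaining changed lines sit in files that are fully covered by the patch.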
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2193   +/-   ##
=======================================
  Coverage   75.74%   75.74%           
=======================================
  Files         118      118           
  Lines       13525    13525           
=======================================
  Hits        10245    10245           
  Misses       3280     3280           
| Flag | Coverage Δ |
| --- | --- |
| unittests | 75.74% <84.21%> (ø) |


@shen-shanshan
Collaborator Author

@ApsarasX All CI checks have passed. Can this be merged?

@ApsarasX
Collaborator

OK

@github-actions

This pull request has conflicts. Please resolve them before we can evaluate it.

Signed-off-by: shen-shanshan <467638484@qq.com>
@wangxiyuan wangxiyuan merged commit 103654c into vllm-project:main Aug 14, 2025
25 checks passed
zhoux77899 added a commit to zhoux77899/vllm-ascend that referenced this pull request Aug 14, 2025
… MoE layers (#3)

* feat(performance): support `GroupedMatmulSwigluQuant` in `W8A8_DYNAMIC` quantized MoE layers

Signed-off-by: zhoux77899 <zhouxiang100@huawei.com>

* fix(lint): fix lint

Signed-off-by: zhoux77899 <zhouxiang100@huawei.com>

* fix(bug): fix bug

Signed-off-by: zhoux77899 <zhouxiang100@huawei.com>

* feat(ops): enable grouped_matmul_swiglu_quant by default

Signed-off-by: zhoux77899 <zhouxiang100@huawei.com>

* fix(lint): fix lint

Signed-off-by: zhoux77899 <zhouxiang100@huawei.com>

* fix(test): fix broken test

Signed-off-by: zhoux77899 <zhouxiang100@huawei.com>

* fix(lint): fix lint

Signed-off-by: zhoux77899 <zhouxiang100@huawei.com>

* fix(test): temporarily skip broken test due to oom

Signed-off-by: zhoux77899 <zhouxiang100@huawei.com>

* fix(test): change bias1 to tensor

Signed-off-by: zhoux77899 <zhouxiang100@huawei.com>

* fix(bug): update group_list handling and weight scale in dynamic methods

Signed-off-by: zhoux77899 <zhouxiang100@huawei.com>

* fix(lint): fix lint

Signed-off-by: zhoux77899 <zhouxiang100@huawei.com>

* fix(lint): fix lint

Signed-off-by: zhoux77899 <zhouxiang100@huawei.com>

* feat(ops): replace all split gmm and swiglu

Signed-off-by: zhoux77899 <zhouxiang100@huawei.com>

* fix(lint): fix lint

Signed-off-by: zhoux77899 <zhouxiang100@huawei.com>

* feat(quantization): split w4a8 and w8a8 apply

Signed-off-by: zhoux77899 <zhouxiang100@huawei.com>

* fix(test): replace w8a8 function in apply

Signed-off-by: zhoux77899 <zhouxiang100@huawei.com>

* feat(cumsum): add cumsum_group_list function for group list processing

Signed-off-by: zhoux77899 <zhouxiang100@huawei.com>

* fix(lint): fix lint

Signed-off-by: zhoux77899 <zhouxiang100@huawei.com>

* fix(lint): fix lint

Signed-off-by: zhoux77899 <zhouxiang100@huawei.com>

* [Doc] Add container image save/load FAQ for offline environments (vllm-project#2347)

### What this PR does / why we need it?

Add Docker export/import guide for air-gapped environments

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?

NA

- vLLM version: v0.10.0
- vLLM main:
vllm-project/vllm@d16aa3d

Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>

* [Bugfix] fix the oom when chunkprefill with long context like 64k (vllm-project#2319)

The attn mask is already declared in mla.py, so the splitfuse mask is
not needed for MLA chunked prefill. Keeping it causes memory problems
with long contexts such as 64k or 128k.

- vLLM version: v0.10.0
- vLLM main:
vllm-project/vllm@14a5d90

---------

Signed-off-by: haojiangzheng <justineric096@gmail.com>

* [Quickfix] Add the missing `apply_router_weight_on_input` in FusedMoE init (vllm-project#2348)

### What this PR does / why we need it?
Add the missing `apply_router_weight_on_input` in FusedMoE init
Quick fix on
vllm-project#2268 (comment)

### Does this PR introduce _any_ user-facing change?
N/A

### How was this patch tested?
CI passed with existing test.


- vLLM version: v0.10.0
- vLLM main:
vllm-project/vllm@6807af8

Signed-off-by: MengqingCao <cmq0113@163.com>

* [2/N][Refactor] Refactor V1 attention for better extensibility (vllm-project#1995)

### What this PR does / why we need it?

Refactor V1 Attention for better extensibility (prepared for torchair
attention refactor).

**Main changes:**
- Move each kind of forward pass into its own method, e.g.,
`_forward_prefill_no_cache()`, `_forward_prefill_cache_hit()`,
`_forward_decode_only()`, `_forward_v1_style()`.

### Does this PR introduce _any_ user-facing change?

No.

- vLLM version: v0.10.0
- vLLM main:
vllm-project/vllm@14a5d90

Signed-off-by: shen-shanshan <467638484@qq.com>

* [Misc] Remove redundant imported `envs`, using `envs_ascend` instead (vllm-project#2193)

### What this PR does / why we need it?
Remove redundant imported `envs`, using `envs_ascend` instead.

```python
import vllm.envs as envs_vllm
import vllm_ascend.envs as envs_ascend
```

- vLLM version: v0.10.0
- vLLM main:
vllm-project/vllm@71683ca

---------

Signed-off-by: shen-shanshan <467638484@qq.com>

* feat(torchair): consider not using gmmswigluquant when torchair enabled

Signed-off-by: zhoux77899 <zhouxiang100@huawei.com>

* fix(lint): fix lint

Signed-off-by: zhoux77899 <zhouxiang100@huawei.com>

* fix(dtype): unify `w1_scale` dtype

Signed-off-by: zhoux77899 <zhouxiang100@huawei.com>

* fix(lint): fix lint

Signed-off-by: zhoux77899 <zhouxiang100@huawei.com>

* fix(lint): fix lint

Signed-off-by: zhoux77899 <zhouxiang100@huawei.com>

---------

Signed-off-by: zhoux77899 <zhouxiang100@huawei.com>
Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
Signed-off-by: haojiangzheng <justineric096@gmail.com>
Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: shen-shanshan <467638484@qq.com>
Co-authored-by: jack <QwertyJack@users.noreply.github.com>
Co-authored-by: zhenghaojiang <zhjoneson@163.com>
Co-authored-by: Mengqing Cao <cmq0113@163.com>
Co-authored-by: Shanshan Shen <467638484@qq.com>
chopper0126 pushed a commit to chopper0126/vllm-ascend that referenced this pull request Sep 26, 2025
…llm-project#2193)
Angazenn pushed a commit to Angazenn/vllm-ascend that referenced this pull request Oct 21, 2025
…llm-project#2193)