[CORE] initial support for torchair with non-mla backend #1506
Conversation
Force-pushed from f32a5e3 to d8c7ecd.
Codecov Report: ✅ All modified and coverable lines are covered by tests.

    @@            Coverage Diff             @@
    ##             main    #1506       +/-   ##
    ===========================================
    + Coverage   27.39%   51.60%   +24.20%
    ===========================================
      Files          56       78       +22
      Lines        6191     9474     +3283
    ===========================================
    + Hits         1696     4889     +3193
    - Misses       4495     4585       +90
Force-pushed from 88cd32f to 31f0d92.
Force-pushed from 7427ef8 to 571ba99.
This pull request has conflicts, please resolve those before we can evaluate the pull request.
Force-pushed from c4a7f85 to c5a9254.
Force-pushed from 83e54bc to a88bd2c.
This pull request has conflicts, please resolve those before we can evaluate the pull request.
Force-pushed from ad27bb9 to 62998bb.
wangxiyuan left a comment:
TODO: move all torchair attention code to attention_v1_torchair in the future.
Signed-off-by: angazenn <zengyanjia@huawei.com>
Signed-off-by: tianyitang <tangtianyi4@huawei.com>
What this PR does / why we need it?
This PR supports torchair graph mode with the non-MLA backend on both 800IA2 and 300I Duo platforms. The main change is the addition of `attention_v1_torchair.py`, which implements the attention-related operations that torchair requires; a sketch of the general shape follows.
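To make the structure concrete, here is a minimal, hypothetical sketch of what a torchair-specific attention backend module can look like. It is not the actual contents of `attention_v1_torchair.py`; the class names, metadata fields, and backend name below are illustrative assumptions that only mirror the backend/metadata split vLLM attention backends commonly follow.

```python
# Illustrative sketch only -- not the real attention_v1_torchair.py.
# It shows the usual split: a metadata container consumed by the compiled
# torchair graph, plus a backend class that identifies the implementation.
from dataclasses import dataclass
from typing import Optional

import torch


@dataclass
class AscendTorchairMetadata:
    """Per-batch attention metadata a torchair graph would consume (hypothetical fields)."""
    num_actual_tokens: int
    block_tables: torch.Tensor        # (num_seqs, max_blocks_per_seq)
    seq_lens: torch.Tensor            # (num_seqs,)
    attn_mask: Optional[torch.Tensor] = None


class AscendAttentionTorchairBackend:
    """Backend hook that routes attention to torchair-compatible kernels."""

    @staticmethod
    def get_name() -> str:
        return "ASCEND_TORCHAIR"

    @staticmethod
    def get_metadata_cls() -> type:
        return AscendTorchairMetadata
```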
Does this PR introduce any user-facing change?
Before this PR, vLLM-Ascend only allowed DeepSeek to use torchair; now it can be used with Pangu as well. In addition, we add a supported-model list that controls which model types can use torchair (sketched below).
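The supported-model list amounts to an allowlist check before torchair graph mode is enabled. The list contents and helper name in this sketch are assumptions for illustration, not the exact code added by this PR:

```python
# Hedged sketch of the "supported model list" idea: gate torchair graph mode
# on an explicit allowlist of model types. Entries here are illustrative.
TORCHAIR_SUPPORTED_MODEL_TYPES = {"deepseek_v2", "deepseek_v3", "pangu_pro_moe"}


def check_torchair_supported(model_type: str) -> bool:
    """Return True if this model type may run with torchair graph mode."""
    return model_type in TORCHAIR_SUPPORTED_MODEL_TYPES


assert check_torchair_supported("pangu_pro_moe")
assert not check_torchair_supported("llama")
```

Unsupported models would then fall back to the default (eager or ACL-graph) path instead of failing inside graph compilation.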
How was this patch tested?
We tested the patch with PanguProMoE on both 800IA2 and 300I Duo platforms, and the model generates answers normally.
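For reference, torchair graph mode in vLLM-Ascend is switched on through `additional_config`. A minimal usage sketch follows; the model path is a placeholder, and the exact config keys should be checked against the vLLM-Ascend documentation for your version:

```python
# Minimal usage sketch, assuming the torchair_graph_config knob exposed by
# vLLM-Ascend via additional_config; the model path below is a placeholder.
from vllm import LLM, SamplingParams

llm = LLM(
    model="/path/to/PanguProMoE",  # placeholder path
    additional_config={"torchair_graph_config": {"enabled": True}},
)
outputs = llm.generate(["Hello"], SamplingParams(max_tokens=16))
print(outputs[0].outputs[0].text)
```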