Release Checklist
Release Version: v0.10.0rc1
Release Branch: main
Release Date:
Release Manager: MengqingCao
Prepare Release Note
- Create a new issue for release feedback: [v0.10.0rc1] FAQ / Feedback | 问题/反馈 #2217
- Write the release note PR.
- Update the feedback issue link in docs/source/faqs.md
- Add release note to docs/source/user_guide/release_notes.md
- Update version info in docs/source/community/versioning_policy.md
- Update contributor info in docs/source/community/contributors.md
- Update package version in docs/conf.py -- do this after release
PRs that need to be merged
- [BugFix] Fix the bug that qwen3 moe doesn't work with aclgraph #2183
- [V1] MTP supports torchair #2145
- [Doc] Support kimi-k2-w8a8 #2162
- [Bugfix] Fix disaggregated pd error #2242
- 【main】SP For Qwen3 MoE #2209
Functional Test
- ALL tests with VLLM_ASCEND_ENABLE_MATMUL_ALLREDUCE enabled
- Disaggregate prefill @Potabk
  - deepseek
    - Disaggregate prefill on multi-node + deepseek + torchair graph mode + V1 scheduler + dp + ep + tp
    - Disaggregate prefill on single-node + deepseek + torchair graph mode + V1 scheduler + dp + ep + tp
    - Disaggregate prefill on multi-node + deepseek + torchair graph mode + AscendScheduler (disabling chunked prefill) + dp + ep + tp
    - Disaggregate prefill on single-node + deepseek + torchair graph mode + AscendScheduler (disabling chunked prefill) + dp + ep + tp
  - qwen3 moe
    - Disaggregate prefill on multi-node + qwen3 235B + aclgraph + V1 scheduler + dp + ep + tp
    - Disaggregate prefill on single-node + qwen3 30B + aclgraph + V1 scheduler + dp + ep + tp
- deepseek
- qwen3 moe + eager mode + all2allv: enabling VLLM_ASCEND_ENABLE_MOE_ALL2ALL_SEQ @wangxiyuan
- deepseek v3 + torchair + all2allv: enabling VLLM_ASCEND_ENABLE_MOE_ALL2ALL_SEQ @Potabk
- Aclgraph + qwen3 moe + dp + tp @MengqingCao [Release]: Release checklist for v0.10.0rc1 #2210 (comment)
- Eager mode + qwen3 moe + dp + tp @MengqingCao [Release]: Release checklist for v0.10.0rc1 #2210 (comment)
- spec decode @shen-shanshan
  - deepseek-w8a8 + torchair graph mode + mtp -- relies on [V1] MTP supports torchair #2145
  - eagle3
  - ngram
- w8a8 + enabling nz -- performance @MengqingCao
- deepseek w8a8 dynamic + multi-stream @zhangxinyuehfad [Release]: Release checklist for v0.10.0rc1 #2210 (comment)
- w4a8 + qwen3-8b @22dimensions [Release]: Release checklist for v0.10.0rc1 #2210 (comment)
- numpy > 2.0 with CANN 8.2.1 @MengqingCao -- does not work with numpy > 2.0 [Release]: Release checklist for v0.10.0rc1 #2210 (comment)
- lora perf improvement @taoxudonghaha
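
Several functional test items above call for running with an environment flag enabled (VLLM_ASCEND_ENABLE_MATMUL_ALLREDUCE, VLLM_ASCEND_ENABLE_MOE_ALL2ALL_SEQ). A minimal sketch of preparing such an environment for a test subprocess follows. The helper name `env_with_flags` is hypothetical, and the assumption that `"1"` means "enabled" is not taken from this checklist.

```python
import os


def env_with_flags(*flags, base=None):
    """Return a copy of the environment with each flag set to "1".

    Assumption: the flags named in this checklist are switched on by the
    value "1"; confirm against the project's environment-variable docs.
    """
    env = dict(os.environ if base is None else base)
    for flag in flags:
        env[flag] = "1"
    return env


# Hypothetical usage: pass the result as `env=` to subprocess.run(...)
# when launching a test command. `base={}` keeps this example isolated
# from the real process environment.
env = env_with_flags(
    "VLLM_ASCEND_ENABLE_MATMUL_ALLREDUCE",
    "VLLM_ASCEND_ENABLE_MOE_ALL2ALL_SEQ",
    base={},
)
```

Building a copy rather than mutating `os.environ` keeps one test configuration from leaking into the next.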
Doc Test
- Tutorial is updated.
- User Guide is updated.
- Developer Guide is updated.
Prepare Artifacts
- Docker image is ready.
- A3 image check @Potabk
- Wheel package is ready.
Release Step
- Release note PR is merged.
- Post the release on GitHub release page.
- Generate official doc page on https://app.readthedocs.org/dashboard/
- Wait for the wheel package to be available on https://pypi.org/project/vllm-ascend
- Wait for the docker image to be available on https://quay.io/ascend/vllm-ascend
- Upload 310p wheel to GitHub release page
- Broadcast the release news (by message, blog, etc.)
- Close this issue
- Update 900-release-checklist.yml
- Pin feedback issue
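
The "wait for the wheel package to be available" step can be checked from a script instead of reloading the project page. The sketch below queries the PyPI JSON API and looks for a wheel file under the release version. The function names are hypothetical, and the version string `0.10.0rc1` (the tag without its leading `v`) is an assumption about how the release is published.

```python
import json
import urllib.request


def fetch_pypi_metadata(package):
    """Download a project's metadata from the PyPI JSON API."""
    url = f"https://pypi.org/pypi/{package}/json"
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def wheel_is_published(pypi_metadata, version):
    """Return True if `version` has at least one wheel file listed.

    `pypi_metadata` is the parsed JSON returned by fetch_pypi_metadata().
    """
    files = pypi_metadata.get("releases", {}).get(version, [])
    return any(f.get("packagetype") == "bdist_wheel" for f in files)


# Hypothetical usage (needs network access):
#   meta = fetch_pypi_metadata("vllm-ascend")
#   wheel_is_published(meta, "0.10.0rc1")
```

Keeping the lookup separate from the download makes the check easy to exercise with canned metadata.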