Closed
Description
This issue tracks the checklist for the official v0.7.3 release.
Code development
- Wait for CANN 8.1 release, then update Dockerfile base image @Yikun
 Upgrade CANN version to 8.1.rc1 #747
-  Update torch-npu to 2.5.1 official release @MengqingCao
 [v0.7.3][Build] Upgrade torch-npu to 2.5.1 #662
-  PR waiting for merge/review/close @wangxiyuan
 [0.7.3] Optimize apply_penalties & topKtopP for both V0/V1 Engine #525 @linfeng-yuan
 [Doc] Update v0.7.3 faqs #695
 [ModelRunnerV1] Adapt kv_cache quant in v1. #685
 [Misc] Add v0.7.3 benchmark #678
 [0.7.3] optimize qwen2_vl and qwen2_5_vl #702
-  LoRA support cherry-pick @paulyu12 @Yikun
 Add LoRA & Multi-LoRA support for V0.7.3 dev by Cherry Pick #700
-  write release note @Yikun
 [Doc] Add release note for 0.7.3 #735
-  CPU memory leak @celestialli
 [0.7.3] patch from_seq_group to clear finished seq in seq_id_to_seq_group #691
Documentation enhancement
- Installation @MengqingCao
  [Build][0.7.3] Integrate MindIE Turbo into vLLM Ascend #708
  - install from source code
    - vllm
    - vllm-ascend[mindie-turbo]
  - install from binary
    - vllm
    - vllm-ascend
    - mindie-turbo
  - install with docker
- User Guide
  - Use ascend scheduler with V1 Engine @MengqingCao
    [Guide]: Usage on AscendScheduler in vLLM Ascend #788
  - Improve performance with python and pytorch @wangxiyuan
    [Doc] Add release note for 0.7.3 #735
  - Update doc to address compile enhancement @MengqingCao
    [Build][0.7.3] Integrate MindIE Turbo into vLLM Ascend #708
  - FAQ cherry-pick @Potabk
    [Doc] Update v0.7.3 faqs #695
    [v0.7.3][Doc] Add notes for OOM in FAQs (#786) #795
  - Feature support update @MengqingCao
    [Build][0.7.3] Integrate MindIE Turbo into vLLM Ascend #708
  - Model support update @MengqingCao
    [Build][0.7.3] Integrate MindIE Turbo into vLLM Ascend #708
  - Accuracy report @hfadzxy
    [v0.7.3][Doc] Add accuracy report #793
    Add an index page once the report exists.
  - Performance feedback issue @Potabk
    [Guide][Performance]: vllm-ascend v0.7.3 release performance benchmark #776
 
- Developer Guide
  - Update Release Compatibility Matrix to include the mindie-turbo version @Yikun
    [Doc] Add release note for 0.7.3 #735
 
Function and Model Test
- key models:
  - qwen2.5
  - deepseek-v3
  - qwen2.5-vl
 
- features
  If a feature's usage differs from the original usage in vllm, we need to add one for vllm-ascend[mindie-turbo].
  - chunked prefill @MengqingCao
    relies on CANN 8.1 nnal
  - custom ops @celestialli
  - guided decoding – same as vllm @shen-shanshan
  - sleep mode @celestialli
    - create an issue to track the sleep mode @celestialli
      [Guide]: Sleep mode feature guide #733
    - update feature support list to link to the issue @MengqingCao
      [Build][0.7.3] Integrate MindIE Turbo into vLLM Ascend #708
  - speculative decoding – same as vllm @MengqingCao
  - multi-step scheduler – same as vllm @MengqingCao
  - mtp – same as vllm @MengqingCao
  - prefix cache @Potabk
  - pooling model – same as vllm @MengqingCao
  - V1Engine @shen-shanshan
  - distribution @shen-shanshan
    - tp
    - pp
Release artifacts @wangxiyuan
-  accuracy report @hfadzxy
 [v0.7.3][Doc] Add accuracy report #793
 The report needs to be generated by hand.
- pypi package @MengqingCao https://pypi.org/project/vllm-ascend/0.7.3/
- docker image @Yikun https://github.com/vllm-project/vllm-ascend/actions/runs/14872918023/job/41866668626?pr=730