This is a living document!
Note that vLLM Ascend 0.7.3 (matching vLLM v0.7.3) is the main release for 2025 Q1; see more in link.
Supported models tracking issue: #260
Hardware Plugin
Basic support
Initial vLLM Ascend releases will focus on basic hardware compatibility.
- (P0) Chunked Prefill
- (P1) Automatic Prefix Caching (Improve performance)
- (P1) Speculative decoding
- (P1) Guided Decoding: [Feature]: Add Support for Guided Decoding (Structured Output) #177
- (P1) Multi-step scheduler: support multistep decoding main #222
- LoRA
- Prompt adapter
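Of the items above, speculative decoding is the easiest to illustrate in isolation: a cheap draft model proposes several tokens, and the expensive target model verifies them, so the output matches what the target alone would produce. A minimal greedy sketch, with toy callables standing in for real draft/target models (all names here are illustrative, not vLLM APIs):

```python
from typing import Callable, List

def speculative_decode(
    target_next: Callable[[List[int]], int],  # expensive model: next token given context
    draft_next: Callable[[List[int]], int],   # cheap model: next token given context
    prompt: List[int],
    max_new: int,
    k: int = 4,
) -> List[int]:
    """Greedy speculative decoding: the draft proposes up to k tokens,
    the target accepts the longest agreeing prefix, then contributes one
    corrected token at the first disagreement."""
    out = list(prompt)
    while len(out) - len(prompt) < max_new:
        # Draft proposes k tokens autoregressively.
        proposal, ctx = [], list(out)
        for _ in range(k):
            t = draft_next(ctx)
            proposal.append(t)
            ctx.append(t)
        # Target verifies: accept tokens while its own choice matches the draft's.
        accepted = 0
        for t in proposal:
            if target_next(out) == t:
                out.append(t)
                accepted += 1
            else:
                break
        if accepted < len(proposal):
            # First disagreement: take the target's token instead.
            out.append(target_next(out))
    return out[: len(prompt) + max_new]
```

Whether the draft agrees often (fast path, k tokens per target pass) or never (slow path, one token per pass), the generated sequence is identical to plain greedy decoding with the target model, which is the correctness property this feature must preserve.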
Feature support
- (P0) V1 engine support: [RFC] V1 engine support #9
- (P1) EP support: Expert Parallelism (EP) Support for DeepSeek Models vllm#12583
- (P1) Custom op [RFC]: Add support for custom ops #156
- (P1) MTP (v0.7.3): Add MTP support for deepseek #236
- Disaggregated prefill: [Feature][Disaggregated] Support XpYd disaggregated prefill with MooncakeStore vllm#12957
- Scheduler plugin: [do not merge] V1 scheduler interface vllm#12544
- RLHF post-train support - verl: Add Ascend NPU support for verl volcengine/verl#338
- RLHF post-train support - OpenRLHF: [WIP] support Ascend NPU backend OpenRLHF/OpenRLHF#605
Model support
- (P0) DeepSeek V3 / DeepSeek R1: [New Model]: DeepSeek V3 / R1 #72
- (P0) Llama3
- (P0) Qwen2.5
- (P0) Qwen2-VL: [New Model]: Qwen2-VL #246
- (P0) Qwen2.5-VL: [New Model]: Qwen2.5-VL #75
- (P1) BAAI/bge-m3: [New Model]: BAAI/bge-m3 #235
- MiniCPM
- GLM4
- InternLM
- llava
- GLM4v
- InternVL
Performance
- Add a vllm-ascend perf website like vLLM's https://perf.vllm.ai/
- Focus on improving performance for Llama3, Qwen2.5, Qwen2-VL, and DeepSeek V3/R1
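A perf website like the one above boils down to tracking a small set of numbers per model, chiefly generation throughput in tokens per second, usually reported as the best of several timed runs. A self-contained sketch of that measurement; `fake_generate` is a stub standing in for a real engine call (e.g. vLLM's `LLM.generate` on an Ascend NPU), and all names are illustrative:

```python
import time
from typing import Callable, List

def measure_throughput(
    generate: Callable[[List[str]], List[List[int]]],
    prompts: List[str],
    runs: int = 3,
) -> float:
    """Run generate() several times over the same batch and report
    generated tokens per second for the fastest run."""
    best = float("inf")
    total_tokens = 0
    for _ in range(runs):
        start = time.perf_counter()
        outputs = generate(prompts)  # list of token-id lists, one per prompt
        elapsed = time.perf_counter() - start
        total_tokens = sum(len(o) for o in outputs)
        best = min(best, elapsed)
    return total_tokens / best

# Stand-in for a real engine: 32 tokens per prompt, instantly.
def fake_generate(prompts: List[str]) -> List[List[int]]:
    return [[0] * 32 for _ in prompts]
```

Taking the best of N runs discounts one-off jitter (cold caches, background load), which matters when the website plots small regressions across commits.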
Quality
- Full UT coverage
- Model e2e test
- Multi card/node e2e test
Docs
- README
- vllm-ascend website: https://vllm-ascend.readthedocs.org/
- Quick start / Installation / Tutorial
- User guide: supported feature / models
- Developer guide: Contributing / Versioning policy
CI and Developer Productivity
- vllm-ascend Docker image: [CI] Add container image build ci #64
- Ascend CI for main / dev branch: [Core] Init vllm-ascend #3