1 | 1 | # Feature Support |
2 | 2 |
3 | | -| Feature | Supported | CI Coverage | Guidance Document | Current Status | Next Step | |
4 | | -|--------------------------|-----------|-------------|-------------------|---------------------------|--------------------| |
5 | | -| Chunked Prefill | ❌ | | | NA | Rely on CANN 8.1 NNAL package release | |
6 | | -| Automatic Prefix Caching | ✅ | | | Basic functions available | Rely on CANN 8.1 NNAL package release | |
7 | | -| LoRA | ❌ | | | NA | Plan in 2025.06.30 | |
8 | | -| Prompt adapter | ❌ | | | NA | Plan in 2025.06.30 | |
9 | | -| Speculative decoding | ✅ | | | Basic functions available | Need fully test | |
10 | | -| Pooling | ✅ | | | Basic functions available(Bert) | Need fully test and add more models support| |
11 | | -| Enc-dec | ❌ | | | NA | Plan in 2025.06.30| |
12 | | -| Multi Modality | ✅ | | ✅ | Basic functions available(LLaVA/Qwen2-vl/Qwen2-audio/internVL)| Improve performance and add support for more models | |
13 | | -| LogProbs | ✅ | | | Basic functions available | Need fully test | |
14 | | -| Prompt logProbs | ✅ | | | Basic functions available | Need fully test | |
15 | | -| Async output | ✅ | | | Basic functions available | Need fully test | |
16 | | -| Multi step scheduler | ✅ | | | Basic functions available | Need fully test, Find more details at [<u> Blog </u>](https://blog.vllm.ai/2024/09/05/perf-update.html#batch-scheduling-multiple-steps-ahead-pr-7000), [<u> RFC </u>](https://github.com/vllm-project/vllm/issues/6854) and [<u>issue</u>](https://github.com/vllm-project/vllm/pull/7000) | |
17 | | -| Best of | ✅ | | | Basic functions available | Need fully test | |
18 | | -| Beam search | ✅ | | | Basic functions available | Need fully test | |
19 | | -| Guided Decoding | ✅ | | | Basic functions available | Find more details at the [<u>issue</u>](https://github.com/vllm-project/vllm-ascend/issues/177) | |
20 | | -| Tensor Parallel | ✅ | | | Basic functions available | Need fully test | |
21 | | -| Pipeline Parallel | ✅ | | | Basic functions available | Need fully test | |
| 3 | +The feature support principle of vLLM Ascend is: **aligned with vLLM**. We are also actively collaborating with the community to accelerate support. |
| 4 | + |
| 5 | +vLLM Ascend supports most vLLM features, and usage is the same as in vLLM except for some limits. |
| 6 | + |
| 7 | +```{note} |
| 8 | +MindIE Turbo is an optional performance optimization plugin. Find more information about the feature support of MindIE Turbo here(UPDATE_ME_AS_A_LINK). |
| 9 | +``` |
| 10 | + |
| 11 | +| Feature | vLLM Ascend | MindIE Turbo | Notes | |
| 12 | +|-------------------------------|----------------|-----------------|------------------------------------------------------------------------| |
| 13 | +| V1Engine | 🔵 Experimental| 🔵 Experimental| To be enhanced in v0.8.x | |
| 14 | +| Chunked Prefill | 🟢 Functional | 🟢 Functional | / | |
| 15 | +| Automatic Prefix Caching | 🟢 Functional | 🟢 Functional | Usage limits: [#732](https://github.com/vllm-project/vllm-ascend/issues/732) | |
| 16 | +| LoRA | 🟢 Functional | 🟢 Functional | / | |
| 17 | +| Prompt adapter | 🟡 Planned | 🟡 Planned | / | |
| 18 | +| Speculative decoding | 🟢 Functional | 🟢 Functional | Usage limits: [#734](https://github.com/vllm-project/vllm-ascend/issues/734) | |
| 19 | +| Pooling | 🟢 Functional | 🟢 Functional | / | |
| 20 | +| Enc-dec | 🟡 Planned | 🟡 Planned | / | |
| 21 | +| Multi Modality | 🟢 Functional | 🟢 Functional | / | |
| 22 | +| LogProbs | 🟢 Functional | 🟢 Functional | / | |
| 23 | +| Prompt logProbs | 🟢 Functional | 🟢 Functional | / | |
| 24 | +| Async output | 🟢 Functional | 🟢 Functional | / | |
| 25 | +| Multi step scheduler | 🟢 Functional | 🟢 Functional | / | |
| 26 | +| Best of | 🟢 Functional | 🟢 Functional | / | |
| 27 | +| Beam search | 🟢 Functional | 🟢 Functional | / | |
| 28 | +| Guided Decoding | 🟢 Functional | 🟢 Functional | / | |
| 29 | +| Tensor Parallel | 🟢 Functional | ⚡Optimized | / | |
| 30 | +| Pipeline Parallel | 🟢 Functional | ⚡Optimized | / | |
| 31 | +| Expert Parallel | 🟡 Planned | 🟡 Planned | Support planned for v0.8.x | |
| 32 | +| Data Parallel | 🟡 Planned | 🟡 Planned | Support planned for v0.8.x | |
| 33 | +| Prefill Decode Disaggregation | 🟢 Functional | 🟢 Functional | todo | |
| 34 | +| Quantization | 🟡 Planned | 🟢 Functional | Support planned for v0.8.x | |
| 35 | +| Graph Mode | 🟡 Planned | 🟡 Planned | Support planned for v0.8.x | |
| 36 | +| Sleep Mode | 🟢 Functional | 🟢 Functional | Usage limits: [#733](https://github.com/vllm-project/vllm-ascend/issues/733) | |
| 37 | +| MTP | 🟢 Functional | 🟢 Functional | Usage limits: [#734](https://github.com/vllm-project/vllm-ascend/issues/734) | |
| 38 | +| Custom Scheduler | 🟢 Functional | 🟢 Functional | Usage limits: [#788](https://github.com/vllm-project/vllm-ascend/issues/788) | |