
Commit 51eadc6

wangxiyuan authored
[Docs] Add official doc index (#29)
Add official doc index. Move the release content to the right place.

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
1 parent 7006835 commit 51eadc6

Showing 10 changed files with 111 additions and 200 deletions.


README.md

Lines changed: 6 additions & 79 deletions
@@ -31,20 +31,11 @@ This plugin is the recommended approach for supporting the Ascend backend within
 By using vLLM Ascend plugin, popular open-source models, including Transformer-like, Mixture-of-Expert, Embedding, Multi-modal LLMs can run seamlessly on the Ascend NPU.

 ## Prerequisites
-### Support Devices
-- Atlas A2 Training series (Atlas 800T A2, Atlas 900 A2 PoD, Atlas 200T A2 Box16, Atlas 300T A2)
-- Atlas 800I A2 Inference series (Atlas 800I A2)
-
-### Dependencies
-| Requirement | Supported version | Recommended version | Note |
-|-------------|-------------------|---------------------|------|
-| vLLM | main | main | Required for vllm-ascend |
-| Python | >= 3.9 | [3.10](https://www.python.org/downloads/) | Required for vllm |
-| CANN | >= 8.0.RC2 | [8.0.RC3](https://www.hiascend.com/developer/download/community/result?module=cann&cann=8.0.0.beta1) | Required for vllm-ascend and torch-npu |
-| torch-npu | >= 2.4.0 | [2.5.1rc1](https://gitee.com/ascend/pytorch/releases/tag/v6.0.0.alpha001-pytorch2.5.1) | Required for vllm-ascend |
-| torch | >= 2.4.0 | [2.5.1](https://github.com/pytorch/pytorch/releases/tag/v2.5.1) | Required for torch-npu and vllm |
+- Hardware: Atlas 800I A2 Inference series, Atlas A2 Training series
+- Software: vLLM (the same version as vllm-ascend), Python >= 3.9, CANN >= 8.0.RC2, PyTorch >= 2.4.0, torch-npu >= 2.4.0

-Find more about how to setup your environment in [here](docs/environment.md).
+Find more about how to set up your environment step by step [here](docs/installation.md).

 ## Getting Started

@@ -73,78 +64,14 @@ Run the following command to start the vLLM server with the [Qwen/Qwen2.5-0.5B-I
 vllm serve Qwen/Qwen2.5-0.5B-Instruct
 curl http://localhost:8000/v1/models
 ```
-
-Please refer to [vLLM Quickstart](https://docs.vllm.ai/en/latest/getting_started/quickstart.html) for more details.
-
-## Building
-
-#### Build Python package from source
-
-```bash
-git clone https://github.com/vllm-project/vllm-ascend.git
-cd vllm-ascend
-pip install -e .
-```
-
-#### Build container image from source
-```bash
-git clone https://github.com/vllm-project/vllm-ascend.git
-cd vllm-ascend
-docker build -t vllm-ascend-dev-image -f ./Dockerfile .
-```
-
-See [Building and Testing](./CONTRIBUTING.md) for more details, which is a step-by-step guide to help you set up development environment, build and test.
-
-## Feature Support Matrix
-| Feature | Supported | Note |
-|---------|-----------|------|
-| Chunked Prefill || Plan in 2025 Q1 |
-| Automatic Prefix Caching || Imporve performance in 2025 Q1 |
-| LoRA || Plan in 2025 Q1 |
-| Prompt adapter |||
-| Speculative decoding || Impore accuracy in 2025 Q1|
-| Pooling || Plan in 2025 Q1 |
-| Enc-dec || Plan in 2025 Q1 |
-| Multi Modality | ✅ (LLaVA/Qwen2-vl/Qwen2-audio/internVL)| Add more model support in 2025 Q1 |
-| LogProbs |||
-| Prompt logProbs |||
-| Async output |||
-| Multi step scheduler |||
-| Best of |||
-| Beam search |||
-| Guided Decoding || Plan in 2025 Q1 |
-
-## Model Support Matrix
-
-The list here is a subset of the supported models. See [supported_models](docs/supported_models.md) for more details:
-| Model | Supported | Note |
-|---------|-----------|------|
-| Qwen 2.5 |||
-| Mistral | | Need test |
-| DeepSeek v2.5 | | Need test |
-| LLama3.1/3.2 |||
-| Gemma-2 | | Need test |
-| baichuan | | Need test |
-| minicpm | | Need test |
-| internlm |||
-| ChatGLM |||
-| InternVL 2.5 |||
-| Qwen2-VL |||
-| GLM-4v | | Need test |
-| Molomo |||
-| LLaVA 1.5 |||
-| Mllama | | Need test |
-| LLaVA-Next | | Need test |
-| LLaVA-Next-Video | | Need test |
-| Phi-3-Vison/Phi-3.5-Vison | | Need test |
-| Ultravox | | Need test |
-| Qwen2-Audio |||
+**Please refer to [Official Docs](./docs/index.md) for more details.**

 ## Contributing
+See [CONTRIBUTING](./CONTRIBUTING.md) for more details, which is a step-by-step guide to help you set up the development environment, build, and test.
+
 We welcome and value any contributions and collaborations:
 - Please feel free to comment [here](https://github.com/vllm-project/vllm-ascend/issues/19) about your usage of vLLM Ascend Plugin.
 - Please let us know if you encounter a bug by [filing an issue](https://github.com/vllm-project/vllm-ascend/issues).
-- Please see the guidance on how to contribute in [CONTRIBUTING.md](./CONTRIBUTING.md).

 ## License

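The Getting Started snippet retained above starts the server and lists models with `curl`. As a companion, here is a minimal sketch of querying that server from Python through its OpenAI-compatible REST API, assuming the `vllm serve Qwen/Qwen2.5-0.5B-Instruct` command from the snippet is already running on localhost:8000 and the `requests` package is installed:

```python
import requests

# Query the OpenAI-compatible /v1/completions endpoint exposed by `vllm serve`.
resp = requests.post(
    "http://localhost:8000/v1/completions",
    json={
        "model": "Qwen/Qwen2.5-0.5B-Instruct",
        "prompt": "Hello, my name is",
        "max_tokens": 32,
        "temperature": 0.8,
    },
)
resp.raise_for_status()
print(resp.json()["choices"][0]["text"])
```
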
README.zh.md

Lines changed: 9 additions & 82 deletions
@@ -30,21 +30,12 @@ The vLLM Ascend plugin (`vllm-ascend`) is a plugin that lets vLLM run seamlessly on the Ascend NPU

 Using the vLLM Ascend plugin, popular large language models, including Transformer-like, Mixture-of-Experts (MoE), embedding, and multi-modal models, can run seamlessly on the Ascend NPU.

-## Prerequisites
-### Supported Devices
-- Atlas A2 Training series (Atlas 800T A2, Atlas 900 A2 PoD, Atlas 200T A2 Box16, Atlas 300T A2)
-- Atlas 800I A2 Inference series (Atlas 800I A2)
-
-### Dependencies
-| Requirement | Supported version | Recommended version | Note |
-|-------------|-------------------|---------------------|------|
-| vLLM | main | main | Required for vllm-ascend |
-| Python | >= 3.9 | [3.10](https://www.python.org/downloads/) | Required for vllm |
-| CANN | >= 8.0.RC2 | [8.0.RC3](https://www.hiascend.com/developer/download/community/result?module=cann&cann=8.0.0.beta1) | Required for vllm-ascend and torch-npu |
-| torch-npu | >= 2.4.0 | [2.5.1rc1](https://gitee.com/ascend/pytorch/releases/tag/v6.0.0.alpha001-pytorch2.5.1) | Required for vllm-ascend |
-| torch | >= 2.4.0 | [2.5.1](https://github.com/pytorch/pytorch/releases/tag/v2.5.1) | Required for torch-npu and vllm |
-
-See [here](docs/environment.zh.md) for more on how to configure your environment.
+## Preparation
+
+- Hardware: Atlas 800I A2 Inference series, Atlas A2 Training series
+- Software: vLLM (the same version as vllm-ascend), Python >= 3.9, CANN >= 8.0.RC2, PyTorch >= 2.4.0, torch-npu >= 2.4.0
+
+See [here](docs/installation.md) for more on how to set up your environment step by step.

 ## Getting Started

@@ -74,78 +65,14 @@ vllm serve Qwen/Qwen2.5-0.5B-Instruct
 curl http://localhost:8000/v1/models
 ```

-Please refer to the [vLLM Quickstart](https://docs.vllm.ai/en/latest/getting_started/quickstart.html) for more details.
-
-## Building
-
-#### Build the Python package from source
-
-```bash
-git clone https://github.com/vllm-project/vllm-ascend.git
-cd vllm-ascend
-pip install -e .
-```
-
-#### Build the container image from source
-```bash
-git clone https://github.com/vllm-project/vllm-ascend.git
-cd vllm-ascend
-docker build -t vllm-ascend-dev-image -f ./Dockerfile .
-```
-
-See [Building and Testing](./CONTRIBUTING.zh.md) for more details; it is a step-by-step guide to help you set up the development environment, build, and test.
-
-## Feature Support Matrix
-| Feature | Supported | Note |
-|---------|-----------|------|
-| Chunked Prefill || Plan in 2025 Q1 |
-| Automatic Prefix Caching || Imporve performance in 2025 Q1 |
-| LoRA || Plan in 2025 Q1 |
-| Prompt adapter |||
-| Speculative decoding || Impore accuracy in 2025 Q1|
-| Pooling || Plan in 2025 Q1 |
-| Enc-dec || Plan in 2025 Q1 |
-| Multi Modality | ✅ (LLaVA/Qwen2-vl/Qwen2-audio/internVL)| Add more model support in 2025 Q1 |
-| LogProbs |||
-| Prompt logProbs |||
-| Async output |||
-| Multi step scheduler |||
-| Best of |||
-| Beam search |||
-| Guided Decoding || Plan in 2025 Q1 |
-
-## Model Support Matrix
-
-The list here is a subset of the supported models. See [supported_models](docs/supported_models.md) for more details:
-| Model | Supported | Note |
-|---------|-----------|------|
-| Qwen 2.5 |||
-| Mistral | | Need test |
-| DeepSeek v2.5 | | Need test |
-| LLama3.1/3.2 |||
-| Gemma-2 | | Need test |
-| baichuan | | Need test |
-| minicpm | | Need test |
-| internlm |||
-| ChatGLM |||
-| InternVL 2.5 |||
-| Qwen2-VL |||
-| GLM-4v | | Need test |
-| Molomo |||
-| LLaVA 1.5 |||
-| Mllama | | Need test |
-| LLaVA-Next | | Need test |
-| LLaVA-Next-Video | | Need test |
-| Phi-3-Vison/Phi-3.5-Vison | | Need test |
-| Ultravox | | Need test |
-| Qwen2-Audio |||
-
+**Please refer to the [Official Docs](./docs/index.md) for more details.**

 ## Contributing
+See [CONTRIBUTING](./CONTRIBUTING.md) for more details; it helps you set up the development environment, build, and test.
+
 We welcome and value contributions and collaboration of any kind:
 - You can give feedback on your experience [here](https://github.com/vllm-project/vllm-ascend/issues/19).
 - Please let us know about any bugs you encounter by [filing an issue](https://github.com/vllm-project/vllm-ascend/issues).
-- Please see the contribution guide in [CONTRIBUTING.zh.md](./CONTRIBUTING.zh.md).

 ## License

docs/environment.zh.md

Lines changed: 0 additions & 38 deletions
This file was deleted.

docs/index.md

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+# Ascend plugin for vLLM
+vLLM Ascend plugin (vllm-ascend) is a community maintained hardware plugin for running vLLM on the Ascend NPU.
+
+This plugin is the recommended approach for supporting the Ascend backend within the vLLM community. It adheres to the principles outlined in the [[RFC]: Hardware pluggable](https://github.com/vllm-project/vllm/issues/11162), providing a hardware-pluggable interface that decouples the integration of the Ascend NPU with vLLM.
+
+By using vLLM Ascend plugin, popular open-source models, including Transformer-like, Mixture-of-Expert, Embedding, Multi-modal LLMs can run seamlessly on the Ascend NPU.
+
+## Contents
+
+- [Quick Start](./quick_start.md)
+- [Installation](./installation.md)
+- Usage
+  - [Running vLLM with Ascend](./usage/running_vllm_with_ascend.md)
+  - [Feature Support](./usage/feature_support.md)
+  - [Supported Models](./usage/supported_models.md)

docs/environment.md renamed to docs/installation.md

Lines changed: 20 additions & 0 deletions
@@ -1,3 +1,23 @@
+# Installation
+
+
+## Building
+
+#### Build Python package from source
+
+```bash
+git clone https://github.com/vllm-project/vllm-ascend.git
+cd vllm-ascend
+pip install -e .
+```
+
+#### Build container image from source
+```bash
+git clone https://github.com/vllm-project/vllm-ascend.git
+cd vllm-ascend
+docker build -t vllm-ascend-dev-image -f ./Dockerfile .
+```
+
 ### Prepare Ascend NPU environment

 ### Dependencies

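After a source build like the one moved into this page (`pip install -e .`), a quick sanity check is to confirm the relevant packages import cleanly. A minimal sketch; the `vllm_ascend` and `torch_npu` module names are assumptions based on this repository's layout and the Ascend PyTorch adapter:

```python
import importlib

# Try importing each package the installation implies should be present.
for mod in ("vllm", "vllm_ascend", "torch", "torch_npu"):
    try:
        importlib.import_module(mod)
        print(f"{mod}: OK")
    except ImportError as exc:
        print(f"{mod}: missing ({exc})")
```
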
docs/quick_start.md

Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+# Quick Start
+
+## Prerequisites
+### Supported Devices
+- Atlas A2 Training series (Atlas 800T A2, Atlas 900 A2 PoD, Atlas 200T A2 Box16, Atlas 300T A2)
+- Atlas 800I A2 Inference series (Atlas 800I A2)
+
+### Dependencies
+| Requirement | Supported version | Recommended version | Note |
+|-------------|-------------------|---------------------|------|
+| vLLM | main | main | Required for vllm-ascend |
+| Python | >= 3.9 | [3.10](https://www.python.org/downloads/) | Required for vllm |
+| CANN | >= 8.0.RC2 | [8.0.RC3](https://www.hiascend.com/developer/download/community/result?module=cann&cann=8.0.0.beta1) | Required for vllm-ascend and torch-npu |
+| torch-npu | >= 2.4.0 | [2.5.1rc1](https://gitee.com/ascend/pytorch/releases/tag/v6.0.0.alpha001-pytorch2.5.1) | Required for vllm-ascend |
+| torch | >= 2.4.0 | [2.5.1](https://github.com/pytorch/pytorch/releases/tag/v2.5.1) | Required for torch-npu and vllm |
+
+Find more about how to set up your environment [here](docs/environment.md).

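The dependency table in this new quick-start page can be checked on a target machine with a short script. A sketch, assuming the Ascend PyTorch adapter exposes the usual `torch.npu` interface after `import torch_npu`:

```python
import sys

assert sys.version_info >= (3, 9), "Python >= 3.9 is required for vllm"

import torch
print("torch:", torch.__version__)  # the table asks for >= 2.4.0

import torch_npu  # registers the NPU device backend with torch (assumed interface)
print("NPU available:", torch.npu.is_available())
```
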
docs/supported_models.md

Lines changed: 0 additions & 1 deletion
This file was deleted.

docs/usage/feature_support.md

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
+# Feature Support
+
+| Feature | Supported | Note |
+|---------|-----------|------|
+| Chunked Prefill || Plan in 2025 Q1 |
+| Automatic Prefix Caching || Improve performance in 2025 Q1 |
+| LoRA || Plan in 2025 Q1 |
+| Prompt adapter |||
+| Speculative decoding || Improve accuracy in 2025 Q1 |
+| Pooling || Plan in 2025 Q1 |
+| Enc-dec || Plan in 2025 Q1 |
+| Multi Modality | ✅ (LLaVA/Qwen2-vl/Qwen2-audio/internVL) | Add more model support in 2025 Q1 |
+| LogProbs |||
+| Prompt logProbs |||
+| Async output |||
+| Multi step scheduler |||
+| Best of |||
+| Beam search |||
+| Guided Decoding || Plan in 2025 Q1 |
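Several rows of this matrix (LogProbs, Prompt logProbs, Best of) correspond to vLLM's standard `SamplingParams` fields, so they can be exercised without plugin-specific code. A minimal sketch, assuming those rows are enabled on the NPU backend and the model is available locally:

```python
from vllm import LLM, SamplingParams

llm = LLM(model="Qwen/Qwen2.5-0.5B-Instruct")
params = SamplingParams(
    max_tokens=16,
    logprobs=5,         # per-token logprobs ("LogProbs" row)
    prompt_logprobs=5,  # logprobs over the prompt ("Prompt logProbs" row)
    best_of=4,          # sample four candidates, keep the best ("Best of" row)
)
outputs = llm.generate(["The Ascend NPU is"], params)
print(outputs[0].outputs[0].text)
```
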
docs/usage/running_vllm_with_ascend.md

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+# Running vLLM with Ascend

docs/usage/supported_models.md

Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+# Supported Models
+
+| Model | Supported | Note |
+|---------|-----------|------|
+| Qwen 2.5 |||
+| Mistral | | Need test |
+| DeepSeek v2.5 | | Need test |
+| Llama 3.1/3.2 |||
+| Gemma-2 | | Need test |
+| Baichuan | | Need test |
+| MiniCPM | | Need test |
+| InternLM |||
+| ChatGLM |||
+| InternVL 2.5 |||
+| Qwen2-VL |||
+| GLM-4v | | Need test |
+| Molmo |||
+| LLaVA 1.5 |||
+| Mllama | | Need test |
+| LLaVA-Next | | Need test |
+| LLaVA-Next-Video | | Need test |
+| Phi-3-Vision/Phi-3.5-Vision | | Need test |
+| Ultravox | | Need test |
+| Qwen2-Audio |||
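For the multi-modal entries in this matrix (for example Qwen2-VL), vLLM accepts a prompt dictionary with `multi_modal_data`. A hypothetical sketch; the model id, chat-template tokens, and dummy image are illustrative assumptions rather than plugin specifics:

```python
from PIL import Image
from vllm import LLM

llm = LLM(model="Qwen/Qwen2-VL-2B-Instruct")
image = Image.new("RGB", (224, 224), color="gray")  # stand-in for a real photo

outputs = llm.generate({
    "prompt": "<|im_start|>user\n<|vision_start|><|image_pad|><|vision_end|>"
              "Describe the image.<|im_end|>\n<|im_start|>assistant\n",
    "multi_modal_data": {"image": image},
})
print(outputs[0].outputs[0].text)
```
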
