Commit 0bbdca0

[Doc] Refactor the DeepSeek-V3.2-Exp tutorial.

Signed-off-by: menogrey <1299267905@qq.com>
1 parent 6d95489 commit 0bbdca0

File tree

3 files changed: +41, -164 lines


docs/source/conf.py

Lines changed: 3 additions & 0 deletions
@@ -80,6 +80,9 @@
     'ci_vllm_version': 'v0.11.0rc3',
 }
 
+# For cross-file header anchors
+myst_heading_anchors = 5
+
 # Add any paths that contain templates here, relative to this directory.
 templates_path = ['_templates']
 
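For context, `myst_heading_anchors = 5` makes MyST auto-generate slug-style anchors for headings up to level 5, which is what lets other pages link to anchors such as `#set-up-using-docker`. A rough sketch of the slug rule (an approximation for illustration, not MyST's exact implementation):

```python
import re

def approx_heading_anchor(heading: str) -> str:
    """Approximate GitHub-style heading slug:
    lowercase, drop punctuation, collapse whitespace to hyphens."""
    slug = heading.strip().lower()
    slug = re.sub(r"[^\w\s-]", "", slug)  # drop punctuation such as parentheses
    slug = re.sub(r"\s+", "-", slug)      # whitespace runs become single hyphens
    return slug

print(approx_heading_anchor("Set up using Docker"))  # -> set-up-using-docker
```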

docs/source/installation.md

Lines changed: 26 additions & 19 deletions
@@ -20,7 +20,7 @@ There are two installation methods:
 - **Using pip**: first prepare env manually or via CANN image, then install `vllm-ascend` using pip.
 - **Using docker**: use the `vllm-ascend` pre-built docker image directly.
 
-## Configure a new environment
+## Configure an Ascend CANN environment
 
 Before installation, you need to make sure firmware/driver and CANN are installed correctly, refer to [Ascend Environment Setup Guide](https://ascend.github.io/docs/sources/ascend/quick_install.html) for more details.
 
@@ -109,14 +109,7 @@ No more extra step if you are using `vllm-ascend` prebuilt Docker image.
 
 Once it is done, you can start to set up `vllm` and `vllm-ascend`.
 
-## Setup vllm and vllm-ascend
-
-:::::{tab-set}
-:sync-group: install
-
-::::{tab-item} Using pip
-:selected:
-:sync: pip
+## Set up using Python
 
 First install system dependencies and configure pip mirror:
 

@@ -181,12 +174,19 @@ To build custom operators, gcc/g++ higher than 8 and c++ 17 or higher is required
 If you encounter other problems during compiling, it is probably because unexpected compiler is being used, you may export `CXX_COMPILER` and `C_COMPILER` in environment to specify your g++ and gcc locations before compiling.
 ```
 
-::::
+## Set up using Docker
 
-::::{tab-item} Using docker
-:sync: docker
+`vllm-ascend` offers Docker images for deployment. You can just pull the **prebuilt image** from the image repository [ascend/vllm-ascend](https://quay.io/repository/ascend/vllm-ascend?tab=tags) and run it with bash.
 
-You can just pull the **prebuilt image** and run it with bash.
+The supported images are as follows:
+| Image name | Hardware | OS |
+|-|-|-|
+| image-tag | Atlas A2 | Ubuntu |
+| image-tag-openeuler | Atlas A2 | openEuler |
+| image-tag-a3 | Atlas A3 | Ubuntu |
+| image-tag-a3-openeuler | Atlas A3 | openEuler |
+| image-tag-310p | Atlas 300I | Ubuntu |
+| image-tag-310p-openeuler | Atlas 300I | openEuler |
 
 :::{dropdown} Click here to see "Build from Dockerfile"
 or build IMAGE from **source code**:
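For illustration, the compiler-override tip in the hunk above could look like the following in practice (the `gcc-10`/`g++-10` paths are assumptions, not taken from the commit; point them at any gcc/g++ newer than 8 on your system):

```shell
# Hypothetical example: pin the toolchain used to build vllm-ascend's custom ops.
# Paths below are illustrative; substitute the locations of your own gcc/g++ (>= 8).
export C_COMPILER=/usr/bin/gcc-10
export CXX_COMPILER=/usr/bin/g++-10
echo "custom ops will build with C=$C_COMPILER, CXX=$CXX_COMPILER"
```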
@@ -202,18 +202,27 @@ docker build -t vllm-ascend-dev-image:latest -f ./Dockerfile .
 ```{code-block} bash
 :substitutions:
 
-# Update DEVICE according to your device (/dev/davinci[0-7])
-export DEVICE=/dev/davinci7
-# Update the vllm-ascend image
+# Update --device according to your device (Atlas A2: /dev/davinci[0-7]; Atlas A3: /dev/davinci[0-15]).
+# Update the vllm-ascend image according to your environment.
+# Note: you should download the model weights to /root/.cache in advance.
 export IMAGE=quay.io/ascend/vllm-ascend:|vllm_ascend_version|
 docker run --rm \
 --name vllm-ascend-env \
 --shm-size=1g \
---device $DEVICE \
+--net=host \
+--device /dev/davinci0 \
+--device /dev/davinci1 \
+--device /dev/davinci2 \
+--device /dev/davinci3 \
+--device /dev/davinci4 \
+--device /dev/davinci5 \
+--device /dev/davinci6 \
+--device /dev/davinci7 \
 --device /dev/davinci_manager \
 --device /dev/devmm_svm \
 --device /dev/hisi_hdc \
 -v /usr/local/dcmi:/usr/local/dcmi \
+-v /usr/local/Ascend/driver/tools/hccn_tool:/usr/local/Ascend/driver/tools/hccn_tool \
 -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
 -v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
 -v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info \
@@ -223,9 +232,7 @@ docker run --rm \
 ```
 
 The default workdir is `/workspace`; vLLM and vLLM Ascend code are placed in `/vllm-workspace` and installed in [development mode](https://setuptools.pypa.io/en/latest/userguide/development_mode.html) (`pip install -e`) so that developers can pick up code changes immediately without reinstalling.
-::::
 
-:::::
 
 ## Extra information
 
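As a reading aid for the image table added above, the tag-suffix scheme can be sketched as a tiny helper (the suffix convention is taken from the table; the function name and the exact quay.io tag values are assumptions for illustration — check the repository's tags page for real ones):

```shell
# Sketch: compose a vllm-ascend image reference from the table's suffix scheme.
# Atlas A2 images carry no hardware suffix; openEuler images append "-openeuler".
image_ref() {
  tag="$1"; hardware="$2"; os="$3"
  suffix=""
  case "$hardware" in
    a3)   suffix="-a3" ;;
    310p) suffix="-310p" ;;
  esac
  [ "$os" = "openeuler" ] && suffix="$suffix-openeuler"
  echo "quay.io/ascend/vllm-ascend:${tag}${suffix}"
}

image_ref v0.11.0rc0 a3 openeuler   # prints quay.io/ascend/vllm-ascend:v0.11.0rc0-a3-openeuler
```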
docs/source/tutorials/DeepSeek-V3.2-Exp.md

Lines changed: 12 additions & 145 deletions
@@ -10,9 +10,9 @@ Only machines with AArch64 are supported currently. x86 will be supported soon.
 
 ## Supported Features
 
-Refer to [](../user_guide/support_matrix/supported_models.md) to get the model's detail.
+Refer to [supported models](../user_guide/support_matrix/supported_models.md) for model details.
 
-Refer to [](../user_guide/support_matrix/supported_features.md) to get the supported features.
+Refer to [supported features](../user_guide/support_matrix/supported_features.md) for the supported features.
 
 ## Environment
 

@@ -21,64 +21,25 @@ Refer to [](../user_guide/support_matrix/supported_features.md) to get the supported features.
 - `DeepSeek-V3.2-Exp`: require 2 Atlas 800 A3 (64G × 16) nodes or 4 Atlas 800 A2 (64G × 8). [Model weight link](https://modelers.cn/models/Modelers_Park/DeepSeek-V3.2-Exp-BF16)
 - `DeepSeek-V3.2-Exp-w8a8`: require 1 Atlas 800 A3 (64G × 16) node or 2 Atlas 800 A2 (64G × 8). [Model weight link](https://modelers.cn/models/Modelers_Park/DeepSeek-V3.2-Exp-w8a8)
 
+
+### Verify Multi-node Communication (Optional)
+
+If you want to deploy a multi-node environment, you need to verify multi-node communication according to [verify multi-node communication environment](../installation.md#verify-multi-node-communication-environment).
+
 ### Installation
 
-Currently, we provide the all-in-one images `quay.io/ascend/vllm-ascend:v0.11.0rc0-deepseek-v3.2-exp`(for Atlas 800 A2) and `quay.io/ascend/vllm-ascend:v0.11.0rc0-a3-deepseek-v3.2-exp`(for Atlas 800 A3). These images include CANN 8.2RC1 + [SparseFlashAttention/LightningIndexer](https://gitcode.com/cann/cann-recipes-infer/tree/master/ops/ascendc) + [MLAPO](https://github.com/vllm-project/vllm-ascend/pull/3226). You can also build your own image by referring to [link](https://github.com/vllm-project/vllm-ascend/issues/3278) and [](../installation.md).
+Currently, we provide the all-in-one images `quay.io/ascend/vllm-ascend:v0.11.0rc0-deepseek-v3.2-exp` (for Atlas 800 A2) and `quay.io/ascend/vllm-ascend:v0.11.0rc0-a3-deepseek-v3.2-exp` (for Atlas 800 A3). These images include CANN 8.2RC1 + [SparseFlashAttention/LightningIndexer](https://gitcode.com/cann/cann-recipes-infer/tree/master/ops/ascendc) + [MLAPO](https://github.com/vllm-project/vllm-ascend/pull/3226). You can also build your own image by referring to [this issue](https://github.com/vllm-project/vllm-ascend/issues/3278).
+
+Refer to [installation](../installation.md#set-up-using-docker) to set up the environment using Docker.
+
+If you want to deploy a multi-node environment, you need to set up the environment on each node.
 
 ## Deployment
 
 ### Single-node Deployment
 
 Only the quantized model `DeepSeek-V3.2-Exp-w8a8` can be deployed on 1 Atlas 800 A3.
 
-Run the following command to start the container in each node (You should download the weight to /root/.cache in advance):
-
-```{code-block} bash
-:substitutions:
-# Update the vllm-ascend image
-# openEuler:
-# export IMAGE=quay.io/ascend/vllm-ascend:v0.11.0rc0-a3-openeuler-deepseek-v3.2-exp
-# Ubuntu:
-# export IMAGE=quay.io/ascend/vllm-ascend:v0.11.0rc0-a3-deepseek-v3.2-exp
-export IMAGE=quay.nju.edu.cn/ascend/vllm-ascend:v0.11.0rc0-a3-deepseek-v3.2-exp
-export NAME=vllm-ascend
-
-# Run the container using the defined variables
-# Note if you are running bridge network with docker, Please expose available ports
-# for multiple nodes communication in advance
-docker run --rm \
---name $NAME \
---net=host \
---shm-size=1g \
---device /dev/davinci0 \
---device /dev/davinci1 \
---device /dev/davinci2 \
---device /dev/davinci3 \
---device /dev/davinci4 \
---device /dev/davinci5 \
---device /dev/davinci6 \
---device /dev/davinci7 \
---device /dev/davinci8 \
---device /dev/davinci9 \
---device /dev/davinci10 \
---device /dev/davinci11 \
---device /dev/davinci12 \
---device /dev/davinci13 \
---device /dev/davinci14 \
---device /dev/davinci15 \
---device /dev/davinci_manager \
---device /dev/devmm_svm \
---device /dev/hisi_hdc \
--v /usr/local/dcmi:/usr/local/dcmi \
--v /usr/local/Ascend/driver/tools/hccn_tool:/usr/local/Ascend/driver/tools/hccn_tool \
--v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
--v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
--v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info \
--v /etc/ascend_install.info:/etc/ascend_install.info \
--v /root/.cache:/root/.cache \
--it $IMAGE bash
-```
-
 Run the following script to execute online inference.
 
 ```shell
@@ -107,100 +68,6 @@ vllm serve vllm-ascend/DeepSeek-V3.2-Exp-W8A8 \
 - `DeepSeek-V3.2-Exp`: require 2 Atlas 800 A3 (64G × 16) nodes or 4 Atlas 800 A2 (64G × 8).
 - `DeepSeek-V3.2-Exp-w8a8`: require 2 Atlas 800 A2 (64G × 8).
 
-Firstly, verify multi-node communication environment. [verify multi-node communication environment](https://vllm-ascend.readthedocs.io/en/latest/installation.html#verify-multi-node-communication-environment)
-
-Then run the following command to start the container in each node (You should download the weight to /root/.cache in advance):
-
-:::::{tab-set}
-::::{tab-item} A2 series
-
-```{code-block} bash
-:substitutions:
-# Update the vllm-ascend image
-# export IMAGE=quay.io/ascend/vllm-ascend:v0.11.0rc0-deepseek-v3.2-exp
-export IMAGE=quay.nju.edu.cn/ascend/vllm-ascend:v0.11.0rc0-deepseek-v3.2-exp
-export NAME=vllm-ascend
-
-# Run the container using the defined variables
-# Note if you are running bridge network with docker, Please expose available ports
-# for multiple nodes communication in advance
-docker run --rm \
---name $NAME \
---net=host \
---shm-size=1g \
---device /dev/davinci0 \
---device /dev/davinci1 \
---device /dev/davinci2 \
---device /dev/davinci3 \
---device /dev/davinci4 \
---device /dev/davinci5 \
---device /dev/davinci6 \
---device /dev/davinci7 \
---device /dev/davinci_manager \
---device /dev/devmm_svm \
---device /dev/hisi_hdc \
--v /usr/local/dcmi:/usr/local/dcmi \
--v /usr/local/Ascend/driver/tools/hccn_tool:/usr/local/Ascend/driver/tools/hccn_tool \
--v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
--v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
--v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info \
--v /etc/ascend_install.info:/etc/ascend_install.info \
--v /root/.cache:/root/.cache \
--it $IMAGE bash
-```
-
-::::
-::::{tab-item} A3 series
-
-```{code-block} bash
-:substitutions:
-# Update the vllm-ascend image
-# openEuler:
-# export IMAGE=quay.io/ascend/vllm-ascend:v0.11.0rc0-a3-openeuler-deepseek-v3.2-exp
-# Ubuntu:
-# export IMAGE=quay.io/ascend/vllm-ascend:v0.11.0rc0-a3-deepseek-v3.2-exp
-export IMAGE=quay.nju.edu.cn/ascend/vllm-ascend:v0.11.0rc0-a3-deepseek-v3.2-exp
-export NAME=vllm-ascend
-
-# Run the container using the defined variables
-# Note if you are running bridge network with docker, Please expose available ports
-# for multiple nodes communication in advance
-docker run --rm \
---name $NAME \
---net=host \
---shm-size=1g \
---device /dev/davinci0 \
---device /dev/davinci1 \
---device /dev/davinci2 \
---device /dev/davinci3 \
---device /dev/davinci4 \
---device /dev/davinci5 \
---device /dev/davinci6 \
---device /dev/davinci7 \
---device /dev/davinci8 \
---device /dev/davinci9 \
---device /dev/davinci10 \
---device /dev/davinci11 \
---device /dev/davinci12 \
---device /dev/davinci13 \
---device /dev/davinci14 \
---device /dev/davinci15 \
---device /dev/davinci_manager \
---device /dev/devmm_svm \
---device /dev/hisi_hdc \
--v /usr/local/dcmi:/usr/local/dcmi \
--v /usr/local/Ascend/driver/tools/hccn_tool:/usr/local/Ascend/driver/tools/hccn_tool \
--v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
--v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
--v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info \
--v /etc/ascend_install.info:/etc/ascend_install.info \
--v /root/.cache:/root/.cache \
--it $IMAGE bash
-```
-
-::::
-:::::
-
 :::::{tab-set}
 ::::{tab-item} DeepSeek-V3.2-Exp A3 series
 
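Before a multi-node deployment, the NPU NICs on each node are typically checked with `hccn_tool` (the binary mounted into the containers above). A dry-run sketch that only prints the per-NPU checks to execute — the `hccn_tool` flags shown are assumptions to verify against your driver version, and `NPU_COUNT` is an assumption (8 for Atlas 800 A2, 16 for Atlas 800 A3):

```shell
# Dry-run sketch: emit the per-NPU link/IP checks for one node without running them.
NPU_COUNT=${NPU_COUNT:-8}   # assumption: 8 NPUs (A2); use 16 for A3
i=0
while [ "$i" -lt "$NPU_COUNT" ]; do
  echo "hccn_tool -i $i -link -g"   # query link status of NPU $i's NIC (flag assumed)
  echo "hccn_tool -i $i -ip -g"     # query IP configured on NPU $i's NIC (flag assumed)
  i=$((i + 1))
done
```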
