Commit 0bbdca0

[Doc] Refactor the DeepSeek-V3.2-Exp tutorial.

Signed-off-by: menogrey <1299267905@qq.com>
1 parent 6d95489 commit 0bbdca0

File tree

3 files changed: +41, -164 lines


docs/source/conf.py

Lines changed: 3 additions & 0 deletions
@@ -80,6 +80,9 @@
     'ci_vllm_version': 'v0.11.0rc3',
 }
 
+# For cross-file header anchors
+myst_heading_anchors = 5
+
 # Add any paths that contain templates here, relative to this directory.
 templates_path = ['_templates']
 
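For context, `myst_heading_anchors = 5` makes MyST auto-generate slug-style anchors for headings up to level 5, which is what lets other pages link to anchors such as `#set-up-using-docker`. A rough sketch of the slug rule (an approximation for illustration, not MyST's exact implementation):

```python
import re

def approx_heading_anchor(heading: str) -> str:
    """Approximate GitHub-style heading slug:
    lowercase, drop punctuation, collapse whitespace to hyphens."""
    slug = heading.strip().lower()
    slug = re.sub(r"[^\w\s-]", "", slug)  # drop punctuation such as parentheses
    slug = re.sub(r"\s+", "-", slug)      # whitespace runs become single hyphens
    return slug

print(approx_heading_anchor("Set up using Docker"))  # -> set-up-using-docker
```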

docs/source/installation.md

Lines changed: 26 additions & 19 deletions
@@ -20,7 +20,7 @@ There are two installation methods:
 - **Using pip**: first prepare env manually or via CANN image, then install `vllm-ascend` using pip.
 - **Using docker**: use the `vllm-ascend` pre-built docker image directly.
 
-## Configure a new environment
+## Configure an Ascend CANN environment
 
 Before installation, you need to make sure firmware/driver and CANN are installed correctly, refer to [Ascend Environment Setup Guide](https://ascend.github.io/docs/sources/ascend/quick_install.html) for more details.
 
@@ -109,14 +109,7 @@ No more extra step if you are using `vllm-ascend` prebuilt Docker image.
 
 Once it is done, you can start to set up `vllm` and `vllm-ascend`.
 
-## Setup vllm and vllm-ascend
-
-:::::{tab-set}
-:sync-group: install
-
-::::{tab-item} Using pip
-:selected:
-:sync: pip
+## Set up using Python
 
 First install system dependencies and configure pip mirror:
 

@@ -181,12 +174,19 @@ To build custom operators, gcc/g++ higher than 8 and c++ 17 or higher is required
 If you encounter other problems during compiling, it is probably because unexpected compiler is being used, you may export `CXX_COMPILER` and `C_COMPILER` in environment to specify your g++ and gcc locations before compiling.
 ```
 
-::::
+## Set up using Docker
 
-::::{tab-item} Using docker
-:sync: docker
+`vllm-ascend` offers Docker images for deployment. You can just pull the **prebuilt image** from the image repository [ascend/vllm-ascend](https://quay.io/repository/ascend/vllm-ascend?tab=tags) and run it with bash.
 
-You can just pull the **prebuilt image** and run it with bash.
+The supported images are as follows:
+| Image name | Hardware | OS |
+|-|-|-|
+| image-tag | Atlas A2 | Ubuntu |
+| image-tag-openeuler | Atlas A2 | openEuler |
+| image-tag-a3 | Atlas A3 | Ubuntu |
+| image-tag-a3-openeuler | Atlas A3 | openEuler |
+| image-tag-310p | Atlas 300I | Ubuntu |
+| image-tag-310p-openeuler | Atlas 300I | openEuler |
 
 :::{dropdown} Click here to see "Build from Dockerfile"
 or build IMAGE from **source code**:
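For illustration, the compiler-override tip in the hunk above could look like the following in practice (the `gcc-10`/`g++-10` paths are assumptions, not taken from the commit; point them at any gcc/g++ newer than 8 on your system):

```shell
# Hypothetical example: pin the toolchain used to build vllm-ascend's custom ops.
# Paths below are illustrative; substitute the locations of your own gcc/g++ (>= 8).
export C_COMPILER=/usr/bin/gcc-10
export CXX_COMPILER=/usr/bin/g++-10
echo "custom ops will build with C=$C_COMPILER, CXX=$CXX_COMPILER"
```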
@@ -202,18 +202,27 @@ docker build -t vllm-ascend-dev-image:latest -f ./Dockerfile .
 ```{code-block} bash
 :substitutions:
 
-# Update DEVICE according to your device (/dev/davinci[0-7])
-export DEVICE=/dev/davinci7
-# Update the vllm-ascend image
+# Update --device according to your device (Atlas A2: /dev/davinci[0-7]; Atlas A3: /dev/davinci[0-15]).
+# Update the vllm-ascend image according to your environment.
+# Note: you should download the model weights to /root/.cache in advance.
 export IMAGE=quay.io/ascend/vllm-ascend:|vllm_ascend_version|
 docker run --rm \
 --name vllm-ascend-env \
 --shm-size=1g \
---device $DEVICE \
+--net=host \
+--device /dev/davinci0 \
+--device /dev/davinci1 \
+--device /dev/davinci2 \
+--device /dev/davinci3 \
+--device /dev/davinci4 \
+--device /dev/davinci5 \
+--device /dev/davinci6 \
+--device /dev/davinci7 \
 --device /dev/davinci_manager \
 --device /dev/devmm_svm \
 --device /dev/hisi_hdc \
 -v /usr/local/dcmi:/usr/local/dcmi \
+-v /usr/local/Ascend/driver/tools/hccn_tool:/usr/local/Ascend/driver/tools/hccn_tool \
 -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
 -v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
 -v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info \
@@ -223,9 +232,7 @@ docker run --rm \
 ```
 
 The default workdir is `/workspace`; vLLM and vLLM Ascend code are placed in `/vllm-workspace` and installed in [development mode](https://setuptools.pypa.io/en/latest/userguide/development_mode.html) (`pip install -e`) so that developers can pick up code changes immediately without reinstalling.
-::::
 
-:::::
 
 ## Extra information
 
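As a reading aid for the image table added above, the tag-suffix scheme can be sketched as a tiny helper (the suffix convention is taken from the table; the function name and the exact quay.io tag values are assumptions for illustration — check the repository's tags page for real ones):

```shell
# Sketch: compose a vllm-ascend image reference from the table's suffix scheme.
# Atlas A2 images carry no hardware suffix; openEuler images append "-openeuler".
image_ref() {
  tag="$1"; hardware="$2"; os="$3"
  suffix=""
  case "$hardware" in
    a3)   suffix="-a3" ;;
    310p) suffix="-310p" ;;
  esac
  [ "$os" = "openeuler" ] && suffix="$suffix-openeuler"
  echo "quay.io/ascend/vllm-ascend:${tag}${suffix}"
}

image_ref v0.11.0rc0 a3 openeuler   # prints quay.io/ascend/vllm-ascend:v0.11.0rc0-a3-openeuler
```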
docs/source/tutorials/DeepSeek-V3.2-Exp.md

Lines changed: 12 additions & 145 deletions
@@ -10,9 +10,9 @@ Only machines with AArch64 are supported currently. x86 will be supported soon.
 
 ## Supported Features
 
-Refer to [](../user_guide/support_matrix/supported_models.md) to get the model's detail.
+Refer to [supported models](../user_guide/support_matrix/supported_models.md) for model details.
 
-Refer to [](../user_guide/support_matrix/supported_features.md) to get the supported features.
+Refer to [supported features](../user_guide/support_matrix/supported_features.md) for the supported features.
 
 ## Environment
 

@@ -21,64 +21,25 @@ Refer to [](../user_guide/support_matrix/supported_features.md) to get the supported features.
 - `DeepSeek-V3.2-Exp`: require 2 Atlas 800 A3 (64G × 16) nodes or 4 Atlas 800 A2 (64G × 8). [Model weight link](https://modelers.cn/models/Modelers_Park/DeepSeek-V3.2-Exp-BF16)
 - `DeepSeek-V3.2-Exp-w8a8`: require 1 Atlas 800 A3 (64G × 16) node or 2 Atlas 800 A2 (64G × 8). [Model weight link](https://modelers.cn/models/Modelers_Park/DeepSeek-V3.2-Exp-w8a8)
 
+
+### Verify Multi-node Communication (Optional)
+
+If you want to deploy a multi-node environment, you need to verify multi-node communication according to [verify multi-node communication environment](../installation.md#verify-multi-node-communication-environment).
+
 ### Installation
 
-Currently, we provide the all-in-one images `quay.io/ascend/vllm-ascend:v0.11.0rc0-deepseek-v3.2-exp`(for Atlas 800 A2) and `quay.io/ascend/vllm-ascend:v0.11.0rc0-a3-deepseek-v3.2-exp`(for Atlas 800 A3). These images include CANN 8.2RC1 + [SparseFlashAttention/LightningIndexer](https://gitcode.com/cann/cann-recipes-infer/tree/master/ops/ascendc) + [MLAPO](https://github.com/vllm-project/vllm-ascend/pull/3226). You can also build your own image by referring to [link](https://github.com/vllm-project/vllm-ascend/issues/3278) and [](../installation.md).
+Currently, we provide the all-in-one images `quay.io/ascend/vllm-ascend:v0.11.0rc0-deepseek-v3.2-exp` (for Atlas 800 A2) and `quay.io/ascend/vllm-ascend:v0.11.0rc0-a3-deepseek-v3.2-exp` (for Atlas 800 A3). These images include CANN 8.2RC1 + [SparseFlashAttention/LightningIndexer](https://gitcode.com/cann/cann-recipes-infer/tree/master/ops/ascendc) + [MLAPO](https://github.com/vllm-project/vllm-ascend/pull/3226). You can also build your own image by referring to [this issue](https://github.com/vllm-project/vllm-ascend/issues/3278).
+
+Refer to [installation](../installation.md#set-up-using-docker) to set up the environment using Docker.
+
+If you want to deploy a multi-node environment, you need to set up the environment on each node.
 
 ## Deployment
 
 ### Single-node Deployment
 
 Only the quantized model `DeepSeek-V3.2-Exp-w8a8` can be deployed on 1 Atlas 800 A3.
 
-Run the following command to start the container in each node (You should download the weight to /root/.cache in advance):
-
-```{code-block} bash
-:substitutions:
-# Update the vllm-ascend image
-# openEuler:
-# export IMAGE=quay.io/ascend/vllm-ascend:v0.11.0rc0-a3-openeuler-deepseek-v3.2-exp
-# Ubuntu:
-# export IMAGE=quay.io/ascend/vllm-ascend:v0.11.0rc0-a3-deepseek-v3.2-exp
-export IMAGE=quay.nju.edu.cn/ascend/vllm-ascend:v0.11.0rc0-a3-deepseek-v3.2-exp
-export NAME=vllm-ascend
-
-# Run the container using the defined variables
-# Note if you are running bridge network with docker, Please expose available ports
-# for multiple nodes communication in advance
-docker run --rm \
---name $NAME \
---net=host \
---shm-size=1g \
---device /dev/davinci0 \
---device /dev/davinci1 \
---device /dev/davinci2 \
---device /dev/davinci3 \
---device /dev/davinci4 \
---device /dev/davinci5 \
---device /dev/davinci6 \
---device /dev/davinci7 \
---device /dev/davinci8 \
---device /dev/davinci9 \
---device /dev/davinci10 \
---device /dev/davinci11 \
---device /dev/davinci12 \
---device /dev/davinci13 \
---device /dev/davinci14 \
---device /dev/davinci15 \
---device /dev/davinci_manager \
---device /dev/devmm_svm \
---device /dev/hisi_hdc \
--v /usr/local/dcmi:/usr/local/dcmi \
--v /usr/local/Ascend/driver/tools/hccn_tool:/usr/local/Ascend/driver/tools/hccn_tool \
--v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
--v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
--v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info \
--v /etc/ascend_install.info:/etc/ascend_install.info \
--v /root/.cache:/root/.cache \
--it $IMAGE bash
-```
-
 Run the following script to execute online inference.
 
 ```shell
@@ -107,100 +68,6 @@ vllm serve vllm-ascend/DeepSeek-V3.2-Exp-W8A8 \
 - `DeepSeek-V3.2-Exp`: require 2 Atlas 800 A3 (64G × 16) nodes or 4 Atlas 800 A2 (64G × 8).
 - `DeepSeek-V3.2-Exp-w8a8`: require 2 Atlas 800 A2 (64G × 8).
 
-Firstly, verify multi-node communication environment. [verify multi-node communication environment](https://vllm-ascend.readthedocs.io/en/latest/installation.html#verify-multi-node-communication-environment)
-
-Then run the following command to start the container in each node (You should download the weight to /root/.cache in advance):
-
-:::::{tab-set}
-::::{tab-item} A2 series
-
-```{code-block} bash
-:substitutions:
-# Update the vllm-ascend image
-# export IMAGE=quay.io/ascend/vllm-ascend:v0.11.0rc0-deepseek-v3.2-exp
-export IMAGE=quay.nju.edu.cn/ascend/vllm-ascend:v0.11.0rc0-deepseek-v3.2-exp
-export NAME=vllm-ascend
-
-# Run the container using the defined variables
-# Note if you are running bridge network with docker, Please expose available ports
-# for multiple nodes communication in advance
-docker run --rm \
---name $NAME \
---net=host \
---shm-size=1g \
---device /dev/davinci0 \
---device /dev/davinci1 \
---device /dev/davinci2 \
---device /dev/davinci3 \
---device /dev/davinci4 \
---device /dev/davinci5 \
---device /dev/davinci6 \
---device /dev/davinci7 \
---device /dev/davinci_manager \
---device /dev/devmm_svm \
---device /dev/hisi_hdc \
--v /usr/local/dcmi:/usr/local/dcmi \
--v /usr/local/Ascend/driver/tools/hccn_tool:/usr/local/Ascend/driver/tools/hccn_tool \
--v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
--v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
--v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info \
--v /etc/ascend_install.info:/etc/ascend_install.info \
--v /root/.cache:/root/.cache \
--it $IMAGE bash
-```
-
-::::
-::::{tab-item} A3 series
-
-```{code-block} bash
-:substitutions:
-# Update the vllm-ascend image
-# openEuler:
-# export IMAGE=quay.io/ascend/vllm-ascend:v0.11.0rc0-a3-openeuler-deepseek-v3.2-exp
-# Ubuntu:
-# export IMAGE=quay.io/ascend/vllm-ascend:v0.11.0rc0-a3-deepseek-v3.2-exp
-export IMAGE=quay.nju.edu.cn/ascend/vllm-ascend:v0.11.0rc0-a3-deepseek-v3.2-exp
-export NAME=vllm-ascend
-
-# Run the container using the defined variables
-# Note if you are running bridge network with docker, Please expose available ports
-# for multiple nodes communication in advance
-docker run --rm \
---name $NAME \
---net=host \
---shm-size=1g \
---device /dev/davinci0 \
---device /dev/davinci1 \
---device /dev/davinci2 \
---device /dev/davinci3 \
---device /dev/davinci4 \
---device /dev/davinci5 \
---device /dev/davinci6 \
---device /dev/davinci7 \
---device /dev/davinci8 \
---device /dev/davinci9 \
---device /dev/davinci10 \
---device /dev/davinci11 \
---device /dev/davinci12 \
---device /dev/davinci13 \
---device /dev/davinci14 \
---device /dev/davinci15 \
---device /dev/davinci_manager \
---device /dev/devmm_svm \
---device /dev/hisi_hdc \
--v /usr/local/dcmi:/usr/local/dcmi \
--v /usr/local/Ascend/driver/tools/hccn_tool:/usr/local/Ascend/driver/tools/hccn_tool \
--v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
--v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
--v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info \
--v /etc/ascend_install.info:/etc/ascend_install.info \
--v /root/.cache:/root/.cache \
--it $IMAGE bash
-```
-
-::::
-:::::
-
 :::::{tab-set}
 ::::{tab-item} DeepSeek-V3.2-Exp A3 series
 
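Before a multi-node deployment, the NPU NICs on each node are typically checked with `hccn_tool` (the binary mounted into the containers above). A dry-run sketch that only prints the per-NPU checks to execute — the `hccn_tool` flags shown are assumptions to verify against your driver version, and `NPU_COUNT` is an assumption (8 for Atlas 800 A2, 16 for Atlas 800 A3):

```shell
# Dry-run sketch: emit the per-NPU link/IP checks for one node without running them.
NPU_COUNT=${NPU_COUNT:-8}   # assumption: 8 NPUs (A2); use 16 for A3
i=0
while [ "$i" -lt "$NPU_COUNT" ]; do
  echo "hccn_tool -i $i -link -g"   # query link status of NPU $i's NIC (flag assumed)
  echo "hccn_tool -i $i -ip -g"     # query IP configured on NPU $i's NIC (flag assumed)
  i=$((i + 1))
done
```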
