docs/source/installation.md
26 additions & 19 deletions
@@ -20,7 +20,7 @@ There are two installation methods:
- **Using pip**: first prepare the environment manually or via a CANN image, then install `vllm-ascend` using pip.
- **Using docker**: use the `vllm-ascend` pre-built docker image directly.
-## Configure a new environment
+## Configure an Ascend CANN environment
Before installation, make sure the firmware/driver and CANN are installed correctly; refer to the [Ascend Environment Setup Guide](https://ascend.github.io/docs/sources/ascend/quick_install.html) for more details.
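As a quick sanity check before proceeding, you can confirm the driver and toolkit are visible. This sketch assumes the default CANN install path `/usr/local/Ascend/ascend-toolkit`; adjust if yours differs:

```shell
# npu-smi ships with the Ascend driver; it should list your NPUs.
npu-smi info

# Load the CANN environment variables (default toolkit location).
source /usr/local/Ascend/ascend-toolkit/set_env.sh
```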
@@ -109,14 +109,7 @@ No more extra step if you are using `vllm-ascend` prebuilt Docker image.
Once it is done, you can start to set up `vllm` and `vllm-ascend`.
-## Setup vllm and vllm-ascend
-
-:::::{tab-set}
-:sync-group: install
-
-::::{tab-item} Using pip
-:selected:
-:sync: pip
+## Set up using Python
First install system dependencies and configure pip mirror:
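The exact commands depend on your base system. A minimal sketch for Ubuntu, with an illustrative mirror URL and a non-exhaustive package list (both are assumptions, not the documented set):

```shell
# Illustrative system dependencies for building Python packages with native code.
apt-get update -y
apt-get install -y gcc g++ cmake libnuma-dev

# Point pip at a mirror (example URL; use whichever mirror is fastest for you).
pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
```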
@@ -181,12 +174,19 @@ To build custom operators, gcc/g++ higher than 8 and c++ 17 or higher is require
If you encounter other problems during compilation, it is probably because an unexpected compiler is being used; you may export `CXX_COMPILER` and `C_COMPILER` in the environment to specify your g++ and gcc locations before compiling.
```
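A minimal sketch of the workaround described above. The paths are discovered with `command -v`; substitute explicit paths if your gcc/g++ 8 or newer live elsewhere:

```shell
# Point the vllm-ascend build at a specific toolchain before compiling.
export C_COMPILER=$(command -v gcc)
export CXX_COMPILER=$(command -v g++)

# Confirm the selected compilers meet the version requirement (>= 8).
"$C_COMPILER" --version
"$CXX_COMPILER" --version
```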
-::::
+## Set up using Docker

-::::{tab-item} Using docker
-:sync: docker
+`vllm-ascend` offers Docker images for deployment. You can just pull the **prebuilt image** from the image repository [ascend/vllm-ascend](https://quay.io/repository/ascend/vllm-ascend?tab=tags) and run it with bash.
-You can just pull the **prebuilt image** and run it with bash.
The default workdir is `/workspace`; vLLM and vLLM Ascend code are placed in `/vllm-workspace` and installed in [development mode](https://setuptools.pypa.io/en/latest/userguide/development_mode.html) (`pip install -e`), so developers can immediately pick up code changes without reinstalling.
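A hedged sketch of pulling and starting the prebuilt image. The tag and device list below are illustrative: pick a real tag from the quay.io repository, and map the NPU devices actually present on your host:

```shell
# Pull a prebuilt image (tag is an example; check quay.io for current tags).
docker pull quay.io/ascend/vllm-ascend:v0.11.0rc0

# Start it with the Ascend devices and driver directories mapped in.
docker run -it --rm \
    --device /dev/davinci0 \
    --device /dev/davinci_manager \
    --device /dev/devmm_svm \
    --device /dev/hisi_hdc \
    -v /usr/local/dcmi:/usr/local/dcmi \
    -v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
    -v /root/.cache:/root/.cache \
    quay.io/ascend/vllm-ascend:v0.11.0rc0 bash
```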
If you want to deploy a multi-node environment, you need to verify multi-node communication according to [verify multi-node communication environment](../installation.md#verify-multi-node-communication-environment).
### Installation
-Currently, we provide the all-in-one images `quay.io/ascend/vllm-ascend:v0.11.0rc0-deepseek-v3.2-exp` (for Atlas 800 A2) and `quay.io/ascend/vllm-ascend:v0.11.0rc0-a3-deepseek-v3.2-exp` (for Atlas 800 A3). These images include CANN 8.2RC1 + [SparseFlashAttention/LightningIndexer](https://gitcode.com/cann/cann-recipes-infer/tree/master/ops/ascendc) + [MLAPO](https://github.com/vllm-project/vllm-ascend/pull/3226). You can also build your own image by referring to [link](https://github.com/vllm-project/vllm-ascend/issues/3278) and [](../installation.md).
+Currently, we provide the all-in-one images `quay.io/ascend/vllm-ascend:v0.11.0rc0-deepseek-v3.2-exp` (for Atlas 800 A2) and `quay.io/ascend/vllm-ascend:v0.11.0rc0-a3-deepseek-v3.2-exp` (for Atlas 800 A3). These images include CANN 8.2RC1 + [SparseFlashAttention/LightningIndexer](https://gitcode.com/cann/cann-recipes-infer/tree/master/ops/ascendc) + [MLAPO](https://github.com/vllm-project/vllm-ascend/pull/3226). You can also build your own image by referring to [link](https://github.com/vllm-project/vllm-ascend/issues/3278).
+Refer to [installation](../installation.md#set-up-using-docker) to set up the environment using Docker.
+
+If you want to deploy a multi-node environment, you need to set up the environment on each node.
## Deployment
### Single-node Deployment
Only the quantized model `DeepSeek-V3.2-Exp-w8a8` can be deployed on a single Atlas 800 A3.
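The documented launch command is not shown in this excerpt. As a hypothetical sketch only, every flag, the model path, and the parallelism degree below are assumptions (an Atlas 800 A3 is assumed to expose 16 NPUs):

```shell
# Hypothetical single-node launch of the w8a8 quantized model with vLLM's
# OpenAI-compatible server; consult the tutorial for the actual command.
vllm serve /root/.cache/DeepSeek-V3.2-Exp-w8a8 \
    --tensor-parallel-size 16 \
    --served-model-name deepseek-v3.2
```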
-Run the following command to start the container in each node (you should download the weights to /root/.cache in advance):
+Firstly, verify the multi-node communication environment; see [verify multi-node communication environment](https://vllm-ascend.readthedocs.io/en/latest/installation.html#verify-multi-node-communication-environment).
+
+Then run the following command to start the container in each node (you should download the weights to /root/.cache in advance):
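The exact command appears later in the full document. As a hypothetical stand-in (container name, shared-memory size, and mounts are assumptions; the image tag is the A3 all-in-one image named above):

```shell
# Run on every node. --net=host lets the nodes reach each other directly,
# and mounting /root/.cache makes the pre-downloaded weights visible inside.
docker run -itd --net=host --shm-size=500g \
    --device /dev/davinci_manager \
    --device /dev/devmm_svm \
    --device /dev/hisi_hdc \
    -v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
    -v /root/.cache:/root/.cache \
    --name vllm-ascend-node \
    quay.io/ascend/vllm-ascend:v0.11.0rc0-a3-deepseek-v3.2-exp bash
```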