Commit 6d95489

[Doc] Refactor the DeepSeek-V3.2-Exp tutorial.
Signed-off-by: menogrey <1299267905@qq.com>
1 parent 789ba4c commit 6d95489

4 files changed: +286 -89 lines changed

docs/source/installation.md

Lines changed: 178 additions & 0 deletions
@@ -287,3 +287,181 @@ Prompt: 'The president of the United States is', Generated text: ' a very import
Prompt: 'The capital of France is', Generated text: ' Paris. The oldest part of the city is Saint-Germain-des-Pr'
Prompt: 'The future of AI is', Generated text: ' not bright\n\nThere is no doubt that the evolution of AI will have a huge'
```

## Multi-node Deployment
### Verify Multi-Node Communication Environment

#### Physical Layer Requirements:

- The physical machines must be on the same LAN and able to reach each other over the network.
- All NPUs must be connected through optical modules, and the connection status must be normal.

#### Verification Process:

Run the following commands on each node in sequence. Every result must be `success` and every link status must be `UP`:

:::::{tab-set}
::::{tab-item} A2 series

```bash
# Check the remote switch ports
for i in {0..7}; do hccn_tool -i $i -lldp -g | grep Ifname; done
# Get the link status of the Ethernet ports (UP or DOWN)
for i in {0..7}; do hccn_tool -i $i -link -g ; done
# Check the network health status
for i in {0..7}; do hccn_tool -i $i -net_health -g ; done
# View the network detected IP configuration
for i in {0..7}; do hccn_tool -i $i -netdetect -g ; done
# View gateway configuration
for i in {0..7}; do hccn_tool -i $i -gateway -g ; done
# View NPU network configuration
cat /etc/hccn.conf
```

::::
::::{tab-item} A3 series

```bash
# Check the remote switch ports
for i in {0..15}; do hccn_tool -i $i -lldp -g | grep Ifname; done
# Get the link status of the Ethernet ports (UP or DOWN)
for i in {0..15}; do hccn_tool -i $i -link -g ; done
# Check the network health status
for i in {0..15}; do hccn_tool -i $i -net_health -g ; done
# View the network detected IP configuration
for i in {0..15}; do hccn_tool -i $i -netdetect -g ; done
# View gateway configuration
for i in {0..15}; do hccn_tool -i $i -gateway -g ; done
# View NPU network configuration
cat /etc/hccn.conf
```

::::
:::::
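
The link-status check above can also be scripted instead of read by eye. The following is a minimal sketch, not part of vllm-ascend: `all_links_up` is a hypothetical helper, and it assumes that each `hccn_tool -i <n> -link -g` call prints one line of the form `link status: UP`. Adjust the pattern if your driver version prints something different.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with hccn_tool or vllm-ascend).
# Reads the combined output of `hccn_tool -i <n> -link -g` from stdin.
# ASSUMPTION: every port reports a line such as "link status: UP".
# Succeeds only if at least one status line was seen and none was non-UP.
all_links_up() {
  local line seen=0 down=0
  while IFS= read -r line; do
    case "$line" in
      *"link status"*UP*)
        seen=$((seen + 1))        # healthy port
        ;;
      *"link status"*)
        seen=$((seen + 1))        # DOWN or an unexpected value
        down=$((down + 1))
        ;;
    esac
  done
  [ "$seen" -gt 0 ] && [ "$down" -eq 0 ]
}
```

On an A2 node this could be used as `for i in {0..7}; do hccn_tool -i $i -link -g; done | all_links_up && echo "all links UP"`. On A3, the loop range would be `{0..15}`.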

#### NPU Interconnect Verification:

##### 1. Get NPU IP Addresses

:::::{tab-set}
::::{tab-item} A2 series

```bash
for i in {0..7}; do hccn_tool -i $i -ip -g | grep ipaddr; done
```

::::
::::{tab-item} A3 series

```bash
for i in {0..15}; do hccn_tool -i $i -ip -g | grep ipaddr; done
```

::::
:::::

##### 2. Cross-Node PING Test

```bash
# Execute on the target node (replace x.x.x.x with the actual NPU IP address)
hccn_tool -i 0 -ping -g address x.x.x.x
```
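
The single ping above covers only NPU 0 and one address. As a rough sketch of how the test could be extended to every local NPU and several remote addresses, the hypothetical helper `ping_matrix` below only prints the commands, so they can be reviewed before being run. The helper name and the example addresses are assumptions. The `hccn_tool ... -ping -g address` form is the one shown above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print one ping command per (local NPU, remote IP) pair.
# Usage: ping_matrix <npu_count> <remote_ip>...
# It prints commands instead of executing them, so nothing is sent
# until the output is deliberately piped to a shell.
ping_matrix() {
  local count="$1"
  shift
  local i ip
  for ((i = 0; i < count; i++)); do
    for ip in "$@"; do
      printf 'hccn_tool -i %d -ping -g address %s\n' "$i" "$ip"
    done
  done
}
```

For example, `ping_matrix 8 192.0.2.10 192.0.2.11` lists 16 commands for an A2 node. The addresses here are documentation placeholders and should be replaced with the NPU IP addresses collected in step 1. Appending `| bash` would run the commands.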

### Run Container In Each Node

Using the official vllm-ascend container image is a more efficient way to set up a multi-node environment.

Run the following command to start the container on each node (download the model weights to `/root/.cache` in advance):

:::::{tab-set}
::::{tab-item} A2 series

```{code-block} bash
:substitutions:
# Update the vllm-ascend image
# openEuler:
# export IMAGE=quay.io/ascend/vllm-ascend:|vllm_ascend_version|-openeuler
# Ubuntu:
# export IMAGE=quay.io/ascend/vllm-ascend:|vllm_ascend_version|
export IMAGE=quay.io/ascend/vllm-ascend:|vllm_ascend_version|

# Run the container using the defined variables
# Note: if Docker uses a bridge network, expose the ports needed
# for multi-node communication in advance
docker run --rm \
--name vllm-ascend \
--net=host \
--shm-size=1g \
--device /dev/davinci0 \
--device /dev/davinci1 \
--device /dev/davinci2 \
--device /dev/davinci3 \
--device /dev/davinci4 \
--device /dev/davinci5 \
--device /dev/davinci6 \
--device /dev/davinci7 \
--device /dev/davinci_manager \
--device /dev/devmm_svm \
--device /dev/hisi_hdc \
-v /usr/local/dcmi:/usr/local/dcmi \
-v /usr/local/Ascend/driver/tools/hccn_tool:/usr/local/Ascend/driver/tools/hccn_tool \
-v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
-v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
-v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info \
-v /etc/ascend_install.info:/etc/ascend_install.info \
-v /root/.cache:/root/.cache \
-it $IMAGE bash
```

::::
::::{tab-item} A3 series

```{code-block} bash
:substitutions:
# Update the vllm-ascend image
# openEuler:
# export IMAGE=quay.io/ascend/vllm-ascend:|vllm_ascend_version|-a3-openeuler
# Ubuntu:
# export IMAGE=quay.io/ascend/vllm-ascend:|vllm_ascend_version|-a3
export IMAGE=quay.io/ascend/vllm-ascend:|vllm_ascend_version|-a3

# Run the container using the defined variables
# Note: if Docker uses a bridge network, expose the ports needed
# for multi-node communication in advance
docker run --rm \
--name vllm-ascend \
--net=host \
--shm-size=1g \
--device /dev/davinci0 \
--device /dev/davinci1 \
--device /dev/davinci2 \
--device /dev/davinci3 \
--device /dev/davinci4 \
--device /dev/davinci5 \
--device /dev/davinci6 \
--device /dev/davinci7 \
--device /dev/davinci8 \
--device /dev/davinci9 \
--device /dev/davinci10 \
--device /dev/davinci11 \
--device /dev/davinci12 \
--device /dev/davinci13 \
--device /dev/davinci14 \
--device /dev/davinci15 \
--device /dev/davinci_manager \
--device /dev/devmm_svm \
--device /dev/hisi_hdc \
-v /usr/local/dcmi:/usr/local/dcmi \
-v /usr/local/Ascend/driver/tools/hccn_tool:/usr/local/Ascend/driver/tools/hccn_tool \
-v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
-v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
-v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info \
-v /etc/ascend_install.info:/etc/ascend_install.info \
-v /root/.cache:/root/.cache \
-it $IMAGE bash
```

::::
:::::
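
The A2 and A3 commands above differ only in the image tag suffix and in the number of `/dev/davinci*` devices (8 versus 16). The sketch below shows one way those two pieces could be generated rather than typed out. `image_tag` and `device_flags` are hypothetical helper names, and `X.Y.Z` in the usage note stands in for the real release version. The tag suffixes mirror the ones listed in the comments above.

```bash
#!/usr/bin/env bash
# Hypothetical helpers, not part of vllm-ascend.

# image_tag <version> <series: a2|a3> <os: ubuntu|openeuler>
# Builds the image reference using the suffix scheme shown above:
# no suffix for A2/Ubuntu, "-a3" for A3, "-openeuler" for openEuler.
image_tag() {
  local version="$1" series="$2" os="$3"
  local tag="quay.io/ascend/vllm-ascend:${version}"
  if [ "$series" = "a3" ]; then
    tag+="-a3"
  fi
  if [ "$os" = "openeuler" ]; then
    tag+="-openeuler"
  fi
  printf '%s\n' "$tag"
}

# device_flags <npu_count>
# Prints "--device /dev/davinci0 ... --device /dev/davinci<N-1>" on one line.
device_flags() {
  local count="$1" i flags=""
  for ((i = 0; i < count; i++)); do
    flags+="--device /dev/davinci${i} "
  done
  printf '%s\n' "${flags% }"   # drop the trailing space
}
```

With these, `image_tag X.Y.Z a3 openeuler` yields the A3 openEuler tag and `$(device_flags 16)` expands to the sixteen per-NPU flags. The remaining `--device` and `-v` options still need to be passed exactly as in the commands above.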

### Verify installation

TODO

docs/source/quick_start.md

Lines changed: 8 additions & 0 deletions
@@ -20,6 +20,10 @@
 # Update DEVICE according to your device (/dev/davinci[0-7])
 export DEVICE=/dev/davinci0
 # Update the vllm-ascend image
+# Atlas A2:
+# export IMAGE=quay.io/ascend/vllm-ascend:|vllm_ascend_version|
+# Atlas A3:
+# export IMAGE=quay.io/ascend/vllm-ascend:|vllm_ascend_version|-a3
 export IMAGE=quay.io/ascend/vllm-ascend:|vllm_ascend_version|
 docker run --rm \
 --name vllm-ascend \
@@ -50,6 +54,10 @@ apt-get update -y && apt-get install -y curl
 # Update DEVICE according to your device (/dev/davinci[0-7])
 export DEVICE=/dev/davinci0
 # Update the vllm-ascend image
+# Atlas A2:
+# export IMAGE=quay.io/ascend/vllm-ascend:|vllm_ascend_version|-openeuler
+# Atlas A3:
+# export IMAGE=quay.io/ascend/vllm-ascend:|vllm_ascend_version|-a3-openeuler
 export IMAGE=quay.io/ascend/vllm-ascend:|vllm_ascend_version|-openeuler
 docker run --rm \
 --name vllm-ascend \
