Conversation

@Gongdayao Gongdayao commented Nov 29, 2025

What this PR does / why we need it?

Does this PR introduce any user-facing change?

How was this patch tested?

Signed-off-by: Gongdayao <gongdayao@foxmail.com>
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request adds a new tutorial for deploying the DeepSeek-R1 model. The tutorial is comprehensive, covering environment setup, deployment on A2 and A3 series hardware, functional verification, and performance/accuracy evaluation. However, I've found a few critical issues in the documentation that could prevent users from successfully following the steps. These include a typo in a command-line argument, incomplete installation instructions, incorrect markdown syntax, and inconsistent model naming. Addressing these issues will significantly improve the quality and usability of the tutorial.

--host 0.0.0.0 \
--port 8000 \
--data-parallel-size 4 \
--data-parallel-size_local 2 \

critical

There is a typo in the command-line argument --data-parallel-size_local. It should be --data-parallel-size-local (with hyphens instead of an underscore). This typo will cause the vllm serve command to fail.

Suggested change
--data-parallel-size_local 2 \
--data-parallel-size-local 2 \
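With the suggestion applied, the surrounding serve invocation would look roughly like the sketch below. The model path and the parallelism values are placeholders carried over from the quoted excerpt, not recommendations, and should be adapted to the actual deployment.

```shell
# Rough sketch of the corrected command: all flags hyphenated.
# /path/to/DeepSeek-R1-W8A8 is a placeholder for the downloaded model dir.
vllm serve /path/to/DeepSeek-R1-W8A8 \
  --host 0.0.0.0 \
  --port 8000 \
  --data-parallel-size 4 \
  --data-parallel-size-local 2
```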

## Introduction

DeepSeek-R1 is a high-performance Mixture-of-Experts (MoE) large language model developed by DeepSeek Company. It excels in complex logical reasoning, mathematical problem-solving, and code generation. By dynamically activating its expert networks, it delivers exceptional performance while maintaining computational efficiency. Building upon R1, DeepSeek-R1-W8A8 is a fully quantized version of the model. It employs 8-bit integer (INT8) quantization for both weights and activations, which significantly reduces the model's memory footprint and computational requirements, enabling more efficient deployment and application in resource-constrained environments.
This article takes the deepseek- R1-w8a8 version as an example to introduce the deployment of the R1 series models.

high

The model name deepseek- R1-w8a8 is used here (with a space and lowercase w). However, the vllm serve commands (e.g., line 88) and the official model download link use DeepSeek-R1-W8A8 with an uppercase W. This inconsistency is present throughout the document and can lead to 'file not found' errors on case-sensitive filesystems. Please use a consistent naming convention, preferably DeepSeek-R1-W8A8.
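As background on the W8A8 terminology in the quoted introduction, the core idea of symmetric INT8 quantization can be sketched in a few lines of Python. This is a toy, per-tensor illustration only; DeepSeek-R1-W8A8's actual recipe (calibration, per-channel scales, activation handling) is more involved.

```python
# Toy sketch of symmetric per-tensor INT8 quantization, the general idea
# behind "W8A8" (8-bit weights and activations). NOT DeepSeek's actual
# quantization scheme; real deployments use calibrated, per-channel scales.

def quantize_int8(values):
    """Map floats to int8 range [-127, 127] with one symmetric scale."""
    scale = max(abs(v) for v in values) / 127.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float values from the int8 codes."""
    return [v * scale for v in q]

weights = [0.5, -1.2, 0.03, 0.9]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
```

Each stored value shrinks from 4 bytes (float32) to 1 byte, which is where the memory-footprint reduction mentioned in the introduction comes from.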


- Install `vllm-ascend` from source, refer to [installation](../installation.md).

- Install extra operator for supporting `DeepSeek-R1-w8a8`, refer to the above tab.

high

The instruction to 'refer to the above tab' for installing the extra operator is unclear as there is no other tab at this level providing these instructions. This leaves the user without a way to proceed with the source installation. Please provide the correct instructions or link to them.
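For context, a from-source install of `vllm-ascend` typically looks roughly like the following. The exact branch, build flags, and CANN/torch-npu prerequisites are assumptions here, so the tutorial's linked installation guide should be treated as authoritative.

```shell
# Rough sketch of a from-source install; see the project's installation
# guide for the authoritative steps and hardware prerequisites.
git clone https://github.com/vllm-project/vllm-ascend.git
cd vllm-ascend
pip install -e .
```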

@github-actions

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write the commit message by filling in the PR description, to help reviewers and future developers understand the change.

If CI fails, you can run linting and testing checks locally according to the Contributing and Testing guides.

@github-actions github-actions bot added the documentation Improvements or additions to documentation label Nov 29, 2025