From a technical point of view, vllm-ascend support would be possible if torch-npu supported it. Otherwise, we would have to implement it using custom ops. You are also welcome to join us and improve this together.
### 2. How to get our docker containers?
You can get our containers at `Quay.io`, e.g., [<u>vllm-ascend</u>](https://quay.io/repository/ascend/vllm-ascend?tab=tags) and [<u>cann</u>](https://quay.io/repository/ascend/cann?tab=tags).
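As a quick sketch (the tag `v0.7.3rc2` is only an illustration; check the Quay.io tags pages linked above for current releases), pulling a release image looks like:

```shell
# Build the image reference for a vllm-ascend release hosted on Quay.io.
# v0.7.3rc2 is an illustrative tag, not necessarily the latest release.
TAG=v0.7.3rc2
IMAGE="quay.io/ascend/vllm-ascend:${TAG}"
echo "${IMAGE}"
# On a machine with docker installed, pull it with:
# docker pull "${IMAGE}"
```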
If you are in China, you can use the `daocloud` mirror to accelerate your download:
```bash
# TAG is the vllm-ascend release tag you want to pull (e.g. v0.7.3rc2)
TAG=v0.7.3rc2
# Assumed mirror form; DaoCloud's public mirror prefixes the upstream registry:
docker pull m.daocloud.io/quay.io/ascend/vllm-ascend:${TAG}
```
Find more details [<u>here</u>](https://vllm-ascend.readthedocs.io/en/latest/user_guide/supported_models.html).
### 4. How to get in touch with our community?
### 9. Does vllm-ascend support the Prefill Disaggregation feature?
Currently, only 1P1D is supported by vllm. For vllm-ascend, it will be enabled by [this PR](https://github.com/vllm-project/vllm-ascend/pull/432). NPND is not yet stable or fully supported in vllm; we will make it stable and supported in vllm-ascend in the future.
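For orientation, vllm's disaggregated-prefill interface takes a JSON `--kv-transfer-config`; the sketch below composes one for a 1P1D pair. The connector name and field values are assumptions based on vllm's examples, not a stable vllm-ascend API:

```python
import json

def kv_transfer_config(role: str, rank: int) -> str:
    """Compose the JSON for vllm's --kv-transfer-config flag (assumed schema)."""
    assert role in ("kv_producer", "kv_consumer")
    return json.dumps({
        "kv_connector": "PyNcclConnector",  # connector name from vllm's examples
        "kv_role": role,                    # prefill produces KV, decode consumes it
        "kv_rank": rank,                    # 0 = prefill instance, 1 = decode instance
        "kv_parallel_size": 2,              # one prefill + one decode instance (1P1D)
    })

# The prefill server would be started with the producer config and the
# decode server with the consumer config.
prefill_cfg = kv_transfer_config("kv_producer", 0)
decode_cfg = kv_transfer_config("kv_consumer", 1)
```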
### 10. Does vllm-ascend support quantization methods?
Currently, w8a8 quantization is supported natively by vllm-ascend on v0.8.4rc2 or higher. If you're using vllm v0.7.3, w8a8 quantization is supported through the integration of vllm-ascend and mindie-turbo; please use `pip install vllm-ascend[mindie-turbo]`.
### 11. How to run the w8a8 DeepSeek model?
Currently, w8a8 DeepSeek is a work in progress: [support AscendW8A8 quantization](https://github.com/vllm-project/vllm-ascend/pull/511).

Please run DeepSeek with BF16 for now, following the [Multi-Node DeepSeek inferencing tutorial](https://vllm-ascend.readthedocs.io/en/main/tutorials/multi_node.html).
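As a minimal sketch (the model name and parallel size are placeholders, not a tested configuration; the real multi-node launch follows the tutorial linked above), the BF16 serve command has this shape:

```shell
# Compose a vllm serve command that forces BF16; values are placeholders.
MODEL=deepseek-ai/DeepSeek-V3
CMD="vllm serve ${MODEL} --dtype bfloat16 --tensor-parallel-size 8"
echo "${CMD}"
# Run the command on each node as described in the multi-node tutorial.
```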
### 12. There is no output in the log when loading models using vllm-ascend. How to solve it?
If you're using vllm v0.7.3, this is a known progress-bar display issue in vLLM, which has been resolved in [this PR](https://github.com/vllm-project/vllm/pull/12428); please cherry-pick it locally by yourself. Otherwise, please file an issue.
### 13. How vllm-ascend is tested
vllm-ascend is tested with functional tests, performance tests and accuracy tests.
- **Functional test**: we added CI that includes a portion of vllm's native unit tests plus vllm-ascend's own unit tests. In vllm-ascend's tests, we cover basic functionality, popular model availability and [supported features](https://vllm-ascend.readthedocs.io/en/latest/user_guide/suppoted_features.html) via e2e tests.
- **Performance test**: we provide [benchmark](https://github.com/vllm-project/vllm-ascend/tree/main/benchmarks) tools for end-to-end performance benchmarking, which can easily be re-run locally. We'll publish a perf website, as [vllm](https://simon-mo-workspace.observablehq.cloud/vllm-dashboard-v0/perf) does, to show the performance test results for each pull request.
- **Accuracy test**: we're working on adding accuracy tests to CI as well.

Finally, for each release, we'll publish performance test and accuracy test reports in the future.
### 14. How to fix the error "InvalidVersion" when using vllm-ascend?
It's usually because you have installed a dev/editable version of the vLLM package. In this case, we provide the environment variable `VLLM_VERSION` so that users can specify which vLLM version to use. Please set `VLLM_VERSION` to the version of the vLLM package you have installed, in the format `X.Y.Z`.
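A minimal sketch (0.7.3 is just an example value; use the version you actually installed):

```shell
# Tell vllm-ascend which vLLM version to assume; must look like X.Y.Z.
export VLLM_VERSION=0.7.3
echo "VLLM_VERSION is ${VLLM_VERSION}"
```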