[Bugfix] fix env variable in dbo #1284
Conversation
It would be good to add the e2e test as well: https://github.com/vllm-project/vllm-ascend/blob/main/tests/e2e/multicard/test_torchair_graph_mode.py, with the env variable enabled.
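As a point of reference, here is a minimal, self-contained sketch of how such an e2e test typically enables a feature flag through the environment for the duration of one test. The variable name VLLM_ASCEND_ENABLE_DBO is an assumption inferred from context, not confirmed in this thread, and a trivial stub stands in for the real model run:

```python
import os
from unittest.mock import patch

# Hypothetical flag name; the real vllm-ascend variable may differ.
DBO_FLAG = "VLLM_ASCEND_ENABLE_DBO"


def dbo_enabled() -> bool:
    # Stand-in for the real inference path, which would read the flag
    # when deciding whether to use dual-batch overlap.
    return os.environ.get(DBO_FLAG) == "1"


def test_dbo_env_enabled() -> None:
    # patch.dict restores the environment when the block exits, so the
    # flag does not leak into other tests in the same process.
    with patch.dict(os.environ, {DBO_FLAG: "1"}):
        assert dbo_enabled()
    assert not dbo_enabled()
```

The real test would launch the model inside the `patch.dict` block (or set the variable in the CI workflow) instead of calling a stub.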
| "DeepseekV2ForCausalLM", | ||
| "vllm_ascend.models.deepseek_dbo:CustomDeepseekDBOForCausalLM") | ||
|
|
||
| ModelRegistry.register_model( |
Please add a UT here to check that DeepseekV3ForCausalLM is registered.
Thanks for your review!
Is it appropriate to add a check for the module_name and class_name of the registered dbo model here?
Thanks for your advice! We have added an e2e test for using dbo with the DeepSeek-V3 model here.
Update https://github.com/vllm-project/vllm-ascend/blob/main/.github/workflows/vllm_ascend_test.yaml#L334-L338 so that the test actually runs in CI.
    dtype = "half"
    sampling_params = SamplingParams(max_tokens=100, temperature=0.0)
    with VllmRunner(
            "deepseek-ai/DeepSeek-V3-Lite-base-latest-w8a8-dynamic",
What's this model? I can't find any public weight with this name. I think you can use vllm-ascend/DeepSeek-V3-Pruning instead.
It seems that the DeepSeek-V3-Pruning model has the following config: first_k_dense_replace = 3, num_hidden_layers = 2.
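If first_k_dense_replace has its usual DeepSeek meaning (the first k layers are dense and MoE starts at layer index k — an assumption here, not stated in the thread), that config would leave the pruned model with no MoE layers at all, so a dbo path that overlaps MoE work would never be exercised. A quick sanity check:

```python
# Config values quoted in the comment above (DeepSeek-V3-Pruning).
first_k_dense_replace = 3
num_hidden_layers = 2

# Assumed semantics: layer i uses MoE iff i >= first_k_dense_replace.
moe_layers = [i for i in range(num_hidden_layers)
              if i >= first_k_dense_replace]
print(moe_layers)  # empty list: no MoE layers, nothing for dbo to overlap
```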
vllm-project/vllm@799397e#diff-52422ec3f789d0cddb8ac608c8ab33c74839612d5b05fbc64fb76ebffe0889d2 (pooling_params)
Please rebase asap to make CI happy.
Signed-off-by: zhuohuan <zxdu1997@gmail.com>
This PR has been merged to 0.9.1-dev already. Let's merge it to main as well.
What this PR does / why we need it?
Fix the env variable in dbo to enable dbo in the DeepSeek-V3 model. Besides, we have fixed a known issue in deepseek-dbo.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
This patch can be tested with the newly added e2e tests: tests/multicard/test_offline_inference_distributed.py.
It can be verified with pytest.