[CI/UT][Refactor] move e2e spec decode and deepseek acc test to per pr #1136
| vllm_model.generate_greedy(example_prompts, max_tokens) |
| def test_models_distributed_DeepSeek(): |
I think there is no need to run the e2e functional UT, as we already have the accuracy UT.
No, the acc test belongs to long-term-test; it is not run on every commit.
This won't take long, as the gsm8k dataset is small. WDYT? cc @Yikun
|
This pull request has conflicts, please resolve those before we can evaluate the pull request. |
| --ignore=tests/e2e/singlecard/long_term/spec_decode/e2e/test_v1_mtp_correctness.py |
| # ------------ spec decode e2e test on v1 ------------ # |
| VLLM_USE_MODELSCOPE=True pytest -sv tests/e2e/singlecard/long_term/spec_decode/e2e/test_v1_mtp_correctness.py |
| # TODO: revert me when test_v1_spec_decode.py::test_ngram_correctness is fixed |
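To illustrate the split in the hunk above, here is a minimal sketch of how the two pytest invocations could be assembled: one run that ignores the promoted file and one that runs only that file. The helper names `build_pytest_cmd` and `build_env` are hypothetical, and the parent directory used as the target of the ignoring run is an assumption, since the excerpt does not show which command the `--ignore` flag belongs to.

```python
import os
import shlex

# Path copied from the diff excerpt; everything else here is illustrative.
MTP_TEST = "tests/e2e/singlecard/long_term/spec_decode/e2e/test_v1_mtp_correctness.py"
# Assumed target of the run that carries --ignore (not shown in the excerpt).
LONG_TERM_DIR = "tests/e2e/singlecard/long_term"


def build_pytest_cmd(target, ignored=()):
    """Return the argv list for a pytest run that skips the given paths."""
    cmd = ["pytest", "-sv", target]
    cmd += [f"--ignore={path}" for path in ignored]
    return cmd


def build_env(base=None):
    """Copy an environment mapping and set VLLM_USE_MODELSCOPE as in the diff."""
    env = dict(os.environ if base is None else base)
    env["VLLM_USE_MODELSCOPE"] = "True"
    return env


# Run over the directory, excluding the file that now has its own step.
long_term_cmd = build_pytest_cmd(LONG_TERM_DIR, ignored=[MTP_TEST])
# Dedicated step: only the promoted file.
per_pr_cmd = build_pytest_cmd(MTP_TEST)

print(shlex.join(long_term_cmd))
print(shlex.join(per_pr_cmd))
```

Either list can be passed to `subprocess.run(cmd, env=build_env())`; keeping the ignored path in a single constant avoids the two steps drifting apart when the file is moved.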
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #1136 +/- ##
===========================================
+ Coverage 27.39% 52.36% +24.96%
===========================================
Files 56 78 +22
Lines 6191 9631 +3440
===========================================
+ Hits 1696 5043 +3347
- Misses 4495 4588 +93
|
The generated results with and without dbo cannot be aligned. |
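When two configurations produce different generations, it can help to report where each output first diverges rather than only that they differ. The following is a minimal sketch under that assumption; the function names and the token ids are made up for illustration and are not taken from the test suite.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel so a length mismatch counts as a difference


def first_divergence(tokens_a, tokens_b):
    """Return the index of the first differing token, or None if identical."""
    pairs = zip_longest(tokens_a, tokens_b, fillvalue=_MISSING)
    for i, (a, b) in enumerate(pairs):
        if a != b:
            return i
    return None


def compare_runs(outputs_a, outputs_b):
    """Map prompt index to the first diverging token position, mismatches only."""
    mismatches = {}
    for idx, (a, b) in enumerate(zip(outputs_a, outputs_b)):
        pos = first_divergence(a, b)
        if pos is not None:
            mismatches[idx] = pos
    return mismatches


# Made-up token ids standing in for the two runs' greedy outputs.
with_dbo = [[1, 2, 3, 4], [5, 6, 7]]
without_dbo = [[1, 2, 3, 4], [5, 6, 8]]
print(compare_runs(with_dbo, without_dbo))  # {1: 2}
```

A report like this shows whether the outputs diverge immediately or only after many tokens, which is useful context when deciding whether exact alignment is a realistic test target.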
|
This pr is ready for review cc @Yikun @wangxiyuan |
Force-pushed from eb46671 to 20bb139
Force-pushed from d659d4c to 410650c
### What this PR does / why we need it?
MLA attention still uses the `swap_states` attribute of gpu_input_batch, which leads to the error `AttributeError: 'InputBatch' object has no attribute 'swap_states'`. This PR fixes the MLA input batch error.

### How was this patch tested?
Will be tested by #1136

---------

Signed-off-by: wangli <wangli858794774@gmail.com>
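The failure mode described above can be reproduced in isolation. The sketch below uses a hypothetical stand-in class (`MiniInputBatch`, not the vLLM `InputBatch`) and a hypothetical helper `swap_requests` to show how calling a removed method raises `AttributeError`, and one defensive way a caller could cope. It is not the fix applied in the referenced PR, which the text here does not show.

```python
class MiniInputBatch:
    """Hypothetical stand-in for an input batch without a swap_states method."""

    def __init__(self, req_ids):
        self.req_ids = list(req_ids)


def swap_requests(batch, i, j):
    """Swap two request slots, using the batch's own swap_states if it has one."""
    swap = getattr(batch, "swap_states", None)
    if callable(swap):
        swap(i, j)
    else:
        # Fallback for batches that no longer expose swap_states.
        batch.req_ids[i], batch.req_ids[j] = batch.req_ids[j], batch.req_ids[i]


batch = MiniInputBatch(["a", "b", "c"])

try:
    batch.swap_states(0, 2)  # mirrors the call that fails in the report
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)

swap_requests(batch, 0, 2)
print(batch.req_ids)  # ['c', 'b', 'a']
```

In a real code base the cleaner fix is usually to call whatever API the batch class currently provides; the `getattr` fallback only illustrates why the error appears once the attribute is gone.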
### What this PR does / why we need it?
MLA attention still uses the `swap_states` attribute of gpu_input_batch, which leads to the error `AttributeError: 'InputBatch' object has no attribute 'swap_states'`. This PR fixes the MLA input batch error.

### How was this patch tested?
Will be tested by vllm-project#1136

---------

Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: ZhengWG <zwg0606@gmail.com>
* move e2e spec decode and deepseek acc test to per pr
* move test_fused_moe_allgather_ep.py to e2e/multicard
* remove e2e test on deepseek-v2-lite, since its accuracy is already tested

Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: MengqingCao <cmq0113@163.com>
vllm-project#1136)

### What this PR does / why we need it?
1. Run the DeepSeek accuracy UT per PR. Multicard CI time increases by 9 min.
2. Run the spec decode e2e test on V1 per PR. Singlecard CI time increases by 3 min (part of the test is disabled because it does not work yet).
~~3. Align the outputs with dbo enabled and disabled.~~ The generated results with and without dbo cannot be aligned. https://github.com/vllm-project/vllm-ascend/actions/runs/15822900528/job/44600029405?pr=1136
4. Skip the V0 MTP test due to the failure in https://github.com/vllm-project/vllm-ascend/actions/runs/16012172833/job/45171988816
5. Fix some version conflicts.

### How was this patch tested?
CI passed with the newly added tests.

---------

Signed-off-by: MengqingCao <cmq0113@163.com>