-
Notifications
You must be signed in to change notification settings - Fork 528
[4/N][CI/UT] Add Qwen2.5-VL and DeepSeek-V2-Lite test #499
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
overall lgtm, thanks! |
Ok, and I'm downloading deepseek-v2-lite and qwen2.5vl-3b weight to the new storage, actions won't success until weights are ready |
5e76374 to
dd0a228
Compare
|
skip qwen2.5_vl test on v1 until issue solved |
|
Please add multicard test as well. such as qwen2.5 vl 32B |
59d86e2 to
16db37f
Compare
d77fa4f to
bfdc845
Compare
| - name: Run vllm-project/vllm-ascend test on V0 engine | ||
| env: | ||
| VLLM_USE_V1: 0 | ||
| VLLM_WORKER_MULTIPROC_METHOD: spawn |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why V0 need this env?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I test distribute with Qwen2.5-VL-32B on v0, encountering errors RuntimeError: Cannot re-initialize NPU in forked subprocess. To use NPU with multiprocessing, you must use the 'spawn' start method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MengqingCao I don't think v0 need this change, please take a look. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I think there are some multiprocessing problems on destributed inference tests. I think we can use spawn to avoid this issue, and I'm looking into this now, will fix it when finding a better resolution
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After debugging the distributed case, I found that the mainly re-initialize conflict is caused by the child process forked by the main process process0.
- For example,
npu:1will be initialized in child processprocess1, and this process will turn tosleepingstatus when tearing down the first ut. - While the second distributed ut setting up,
process2, a new child process, also try to initializenpu:1, then the re-initialize error is raised. - Moreover,
process1cannot be killed clearly by hand. Thus I reconmend to usespawnhere to solve this issue.
@Potabk plz add this analysis as comment in this pr, thx!
Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: wangli <wangli858794774@gmail.com>
### What this PR does / why we need it? Part of vllm-project#499 Add qwen2.5-vl test on single npu, v1 engine is excluded because qwen2.5-vl has some problems with v1 now, at the same time, this test can also make vllm-project#639 more credible Signed-off-by: wangli <wangli858794774@gmail.com>
### What this PR does / why we need it? Add deepseek-v2-lite test, part of #499 --------- Signed-off-by: wangli <wangli858794774@gmail.com>
|
Since all the sub pr has landed, we don't need this pr any more |
### What this PR does / why we need it? Add deepseek-v2-lite test, part of vllm-project#499 --------- Signed-off-by: wangli <wangli858794774@gmail.com>
### What this PR does / why we need it? Part of vllm-project#499 Add qwen2.5-vl test on single npu, v1 engine is excluded because qwen2.5-vl has some problems with v1 now, at the same time, this test can also make vllm-project#639 more credible Signed-off-by: wangli <wangli858794774@gmail.com>
### What this PR does / why we need it? Add deepseek-v2-lite test, part of vllm-project#499 --------- Signed-off-by: wangli <wangli858794774@gmail.com>
What this PR does / why we need it?
This is one step of CI for key features
Does this PR introduce any user-facing change?
How was this patch tested?