You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This issue has been automatically marked as inactive due to lack of recent activity. Should you believe it remains unresolved and warrants attention, kindly leave a comment on this thread.
Model Series
Qwen2.5
What are the models used?
qwen2.5-72B
What is the scenario where the problem happened?
抢占式实例部署qwen2.5-72B成功,调用失败
Is this badcase known and can it be solved using avaiable techniques?
Information about environment
部署指令:vllm serve /home/Qwen2.5/Qwen2.5-72B-Instruct --port 6666 --host 0.0.0.0 --tensor-parallel-size 4 --served-model-name Qwen2.5-72B --enforce-eager
部署成功但是调用失败截图
应该是和MQLLMEngine交互数据超时了,但是不知道解决办法
Description
Steps to reproduce
This happens to Qwen2.5-xB-Instruct-xxx and xxx.
The badcase can be reproduced with the following steps:
The following example input & output can be used:
Expected results
The results are expected to be ...
Attempts to fix
I have tried several ways to fix this, including:
Anything else helpful for investigation
I find that this problem also happens to ...
The text was updated successfully, but these errors were encountered: