[Bug]: v0.4.3 AsyncEngineDeadError #5443
Comments
Update: this error message also appears with vLLM v0.5.0:
How did you solve this problem? I have the same issue.
Ref. #5732
I'm looking into the AsyncEngineDeadError issues under #5901. Could you please share the following so that I can reproduce the issue:
- The server launch command used
- A sample request that causes the error
I launched the vLLM inference server on Kubernetes; here is my launch command:
```
containers:
  - name: qw2-vllm                                        # container name (customizable)
    image: harbor.myharbor.cn/myrepo/vllm-openai:v0.5.0   # inference image
    ports:
      - containerPort: 8000
    # securityContext:
    #   privileged: true
    # command: ["/bin/sh", "-c", "sleep 1d"]              # debug: keep the container alive instead of starting inference
    command: ["/bin/bash", "-c"]                          # inference launch script
    args: [
      # "sudo sed -i '175,+2s/\"dns.google\"/\"8.8.8.8\"/g' /workspace/vllm/utils.py && \
      "nvidia-smi; python3 -m vllm.entrypoints.openai.api_server \
        --host 0.0.0.0 \
        --model /fl/nlp/common/plms/qwen2/Qwen2-72B-Instruct \
        --trust-remote-code \
        --enforce-eager \
        --max-model-len 32768 \
        --gpu-memory-utilization 0.9 \
        --served-model-name qwen2-72bc \
        --tensor-parallel-size 8"
    ]
    resources:
      limits:
        nvidia.com/gpu: 8                                  # required: number of GPUs used on each node
```
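As for the "sample request" asked for above, here is a minimal sketch of a chat-completion call against this deployment; `localhost:8000` is a placeholder for the actual Service or pod address, and this particular request is not one known to trigger the error:
```
# Minimal request to the vLLM OpenAI-compatible server launched above.
# "localhost:8000" is a placeholder; substitute the Service/pod address.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "qwen2-72bc",
        "messages": [{"role": "user", "content": "Hello"}],
        "max_tokens": 64
      }'
```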
Since this bug does not seem to be triggered by any specific request, I am unable to provide a reproducing case, sorry.
This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!
Your current environment
🐛 Describe the bug
I use vllm/vllm-openai:v0.4.3 on k8s to deploy a 14B model with the following config:
After a short period of normal operation, this happened:
The exception message says "This should never happen", but it did happen.
Is anyone else seeing the same issue? Is there a solution for this?