Commit ad12e8b

dongbo910220 and claude authored and committed
feat(api): Return 503 on /health when engine is dead (vllm-project#24897)
Signed-off-by: dongbo910220 <1275604947@qq.com>
Co-authored-by: Claude <noreply@anthropic.com>
Signed-off-by: charlifu <charlifu@amd.com>
1 parent af8824a commit ad12e8b

1 file changed, +6 -2 lines changed

vllm/entrypoints/openai/api_server.py

Lines changed: 6 additions & 2 deletions
@@ -103,6 +103,7 @@
 from vllm.usage.usage_lib import UsageContext
 from vllm.utils import (Device, FlexibleArgumentParser, decorate_logs,
                         is_valid_ipv6_address, set_ulimit)
+from vllm.v1.engine.exceptions import EngineDeadError
 from vllm.v1.metrics.prometheus import get_prometheus_registry
 from vllm.version import __version__ as VLLM_VERSION

@@ -351,8 +352,11 @@ def engine_client(request: Request) -> EngineClient:
 @router.get("/health", response_class=Response)
 async def health(raw_request: Request) -> Response:
     """Health check."""
-    await engine_client(raw_request).check_health()
-    return Response(status_code=200)
+    try:
+        await engine_client(raw_request).check_health()
+        return Response(status_code=200)
+    except EngineDeadError:
+        return Response(status_code=503)


 @router.get("/load")
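
The control flow this change introduces can be exercised without a running server. The following is a minimal sketch using only the standard library, not the vLLM implementation: `EngineDeadError` is re-declared here as a local stub, and `FakeEngineClient` and `health_status` are hypothetical names invented for illustration. It assumes, as the diff implies, that `check_health()` raises `EngineDeadError` once the engine has died.

```python
import asyncio


class EngineDeadError(Exception):
    """Local stub standing in for vllm.v1.engine.exceptions.EngineDeadError."""


class FakeEngineClient:
    """Hypothetical client: check_health() raises when the engine is dead."""

    def __init__(self, dead: bool) -> None:
        self.dead = dead

    async def check_health(self) -> None:
        if self.dead:
            raise EngineDeadError("engine is dead")


async def health_status(client: FakeEngineClient) -> int:
    """Mirror the handler's try/except, returning only the HTTP status code."""
    try:
        await client.check_health()
        return 200
    except EngineDeadError:
        # A dead engine is reported as "service unavailable" instead of
        # letting the exception propagate out of the handler.
        return 503


healthy_status = asyncio.run(health_status(FakeEngineClient(dead=False)))
dead_status = asyncio.run(health_status(FakeEngineClient(dead=True)))
print(healthy_status, dead_status)  # prints "200 503"
```

Only `EngineDeadError` is caught, so any other exception raised by `check_health()` would still propagate as before; the sketch keeps that behavior.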
