🚀 The feature
In production, the ping endpoint is more useful if it reflects each model's health status. Upgrade the default ping behavior as follows:
Add a parameter "maxRetryTimeoutInSec" (default: 5 minutes) to the model-level config: the maximum time window for recovering a dead backend worker.
A healthy worker can be in the state WORKER_STARTED, WORKER_MODEL_LOADED, or WORKER_STOPPED within the maxRetryTimeoutInSec window.
Return 200 with message "healthy": for every model, the number of active workers is equal to or larger than the configured minWorkers.
Return 500 with message "unhealthy": for any model, the number of active workers is less than the configured minWorkers.
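A minimal sketch of the proposed aggregation logic; the names `ping_status`, `worker_states`, and `min_workers` are hypothetical illustrations, not actual TorchServe frontend code:

```python
# Worker states counted as healthy within the maxRetryTimeoutInSec window.
HEALTHY_STATES = {"WORKER_STARTED", "WORKER_MODEL_LOADED", "WORKER_STOPPED"}

def ping_status(models):
    """Return (http_code, message) for the upgraded ping endpoint.

    `models` is an iterable of objects with `worker_states` (worker states
    observed within the maxRetryTimeoutInSec window) and `min_workers`.
    """
    for model in models:
        active = sum(1 for s in model.worker_states if s in HEALTHY_STATES)
        # A single model below its configured minWorkers makes the whole
        # server report unhealthy, so the load balancer can drain it.
        if active < model.min_workers:
            return 500, "unhealthy"
    return 200, "healthy"
```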
Motivation, pitch
The existing ping endpoint only reflects the server heartbeat. It always returns 200 with a message such as "Healthy", "Partial Healthy", or "Unhealthy" (see code). Here, "Partial Healthy" can be one of the following scenarios:
Case 1: one model has n (> 1) workers, and m (< n) of them die.
Case 2: n models are registered on a server, and m (< n) of them have partially or completely dead workers.
If the load balancer bases routing on the ping endpoint returning code 200, an inference request can be routed to a "Partial Healthy" or "Unhealthy" server, and the request will fail.
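For illustration, a probe of the current endpoint only sees the server heartbeat, never per-model worker health (a minimal sketch, assuming a local TorchServe on the default inference port 8080; the exact response body may vary by version):

```python
import requests

# Today /ping returns 200 even when some model workers are dead, so a
# load-balancer health check keyed on the 200 status code cannot detect
# a "Partial Healthy" server and keeps routing traffic to it.
resp = requests.get("http://localhost:8080/ping")
print(resp.status_code, resp.json())  # e.g. 200 {"status": "Healthy"}
```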
Alternatives
No response
Additional context
No response
Verify that the custom model archive you're trying to load is valid and correctly formatted. You can use the PyTorch Hub package to verify the model archive by running the following command: python -m torch.hub.checkout_hash <MODEL_NAME> <MODEL_VERSION>
Replace <MODEL_NAME> and <MODEL_VERSION> with the name and version of your custom model. If the model archive is valid, the command should return a valid Git hash.
If the model archive is valid, check that the metadata and signature files are correctly formatted. You can use the torchserve --show-config command to view the configuration of your TorchServe instance, including the locations of the metadata and signature files. Ensure that these files exist and are correctly formatted.
If the metadata and signature files are correct, try updating to the latest version of TorchServe to see if the issue has already been resolved.