Summary
I have seen several issues and questions (e.g. on StackOverflow) regarding "graceful shutdown" in FastAPI/uvicorn, and I had the same issue. While we have `timeout_graceful_shutdown`, there is a problem in e.g. Kubernetes. When we scale the number of pods/instances down, a `SIGTERM` is sent to the pod, but the load balancer keeps sending requests until `health/ready` no longer responds with status `200`. Say the load balancer pings `health/ready` every 5 seconds until it gets a non-200 status; the app could then still receive requests for up to 5 seconds after the `SIGTERM` has been received, meaning that all requests in that 5-second interval would fail.

By adding a delayed shutdown we simply delay the shutdown of the app by a given number of seconds, while setting a flag in the serving app (`uvicorn_shutdown_triggered`). The app can then use this property to decide whether a health endpoint should return 200 or something else, notifying the load balancer that the app is shutting down while it still accepts requests. After the `delayed_shutdown` time has passed, the app shuts down as it normally would.

Checklist