Description
Is there an existing issue for this?
- I have searched the existing issues
Current behavior
This implementation is incorrect: #2421 (to read first) & #2422
- On receiving the SIGTERM signal, set the readiness probe to fail with 503, to tell the orchestrator to stop sending requests.
- Wait X seconds to be sure Kubernetes has stopped forwarding traffic to the app (X should match the interval of the readiness probe plus a few seconds, so the orchestrator is aware the pod should stop receiving traffic).
- Proceed to close the webserver (finish processing any long-running requests that are still in flight).
- Proceed to close database connections and other connections, then shut down the app.
Minimum reproduction code
Load test your NestJS app running in a Kubernetes environment, and trigger a new deployment during this load test. You should notice a few failed requests.
Here is a simple load test you can run with k6:
cat << 'EOF' | k6 run -
import http from 'k6/http';
import { sleep } from 'k6';

export const options = {
  scenarios: {
    constant_request_rate: {
      executor: 'constant-arrival-rate',
      rate: 5, // 5 iterations per second
      timeUnit: '1s', // 1 second
      duration: '2m', // 2 minutes
      preAllocatedVUs: 5, // Number of VUs to pre-allocate
      maxVUs: 10, // Maximum number of VUs to allow if needed
    },
  },
};

export default function () {
  http.get('https://your-endpoint.com/livez');
  sleep(1);
}
EOF
Steps to reproduce
No response
Expected behavior
The expected graceful shutdown behaviour of a production-ready NestJS app should be:
- On receiving the SIGTERM signal, set the readiness probe to fail with 503, to tell the orchestrator to stop sending requests.
- Wait X seconds to be sure traffic stops being forwarded to the app by Kubernetes.
- Set the readiness probe to fail with 503, to tell the orchestrator to stop sending requests.
- Proceed to close the webserver (process last requests if there are still some long ones running).
- Proceed to close database connections and other connections, then shut down the app.
Therefore, if the load balancer still sends a request before it is aware that the endpoint has been removed, the request won't be seen as failed with a 502; it will still be processed, so a rolling update causes no downtime.
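To illustrate the steps listed above, here is a minimal sketch built on a plain Node.js `http` server rather than on Terminus itself. The names `createGracefulServer`, `readinessStatus`, `drainDelayMs` and the `/readyz` path are hypothetical and exist only for this example. The 10 second default delay is an assumption and would need tuning to the readiness probe interval.

```javascript
const http = require('node:http');

// Hypothetical helper (not part of NestJS or Terminus): wraps an http server
// with a readiness flag and a staged shutdown routine.
function createGracefulServer({ drainDelayMs = 10000 } = {}) {
  let shuttingDown = false;

  // 200 while serving, 503 once shutdown has started.
  const readinessStatus = () => (shuttingDown ? 503 : 200);

  const server = http.createServer((req, res) => {
    if (req.url === '/readyz') {
      res.writeHead(readinessStatus());
      res.end();
      return;
    }
    // Regular traffic is still served during the drain delay.
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end('ok');
  });

  async function shutdown(closeResources = async () => {}) {
    if (shuttingDown) return;
    // 1. Fail the readiness probe so the orchestrator stops routing here.
    shuttingDown = true;
    // 2. Keep serving while the endpoint removal propagates.
    await new Promise((resolve) => setTimeout(resolve, drainDelayMs));
    // 3. Stop accepting connections; in-flight requests are allowed to finish.
    await new Promise((resolve, reject) => {
      server.close((err) => (err ? reject(err) : resolve()));
      // Idle keep-alive sockets can otherwise delay close() on older Node versions.
      if (typeof server.closeIdleConnections === 'function') {
        server.closeIdleConnections();
      }
    });
    // 4. Release database and other connections.
    await closeResources();
  }

  return { server, shutdown, readinessStatus };
}

// Example wiring (commented out so the sketch has no side effects when loaded):
// const { server, shutdown } = createGracefulServer({ drainDelayMs: 10000 });
// server.listen(3000);
// process.on('SIGTERM', () => shutdown().then(() => process.exit(0)));
```

The point of step 2 is that the server keeps answering normal requests while it already reports 503 on the readiness path, so requests routed before the endpoint removal has propagated are still processed.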
Package version
latest
NestJS version
latest
Node.js version
latest
In which operating systems have you tested?
- macOS
- Windows
- Linux
Other
Resources that explain why the few seconds of sleep are necessary:
https://learnk8s.io/graceful-shutdown
In the meantime, setting the sleep to 0s in Terminus and adding a lifecycle preStop hook that sleeps X seconds is enough to fix the behaviour.
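The Kubernetes half of that workaround could look like the sketch below. It is written as a JavaScript object that serializes to a JSON manifest fragment, since Kubernetes accepts JSON as well as YAML. `buildPodSpecFragment`, the container name `app`, the image name and the second counts are all hypothetical, and the `exec` form assumes the container image ships a `sleep` binary.

```javascript
// Hypothetical helper: returns the part of a pod spec relevant to the workaround.
function buildPodSpecFragment({ preStopSleepSeconds, shutdownBudgetSeconds, image }) {
  return {
    // The preStop sleep counts against the grace period, so leave room
    // for the app's own shutdown after the sleep has finished.
    terminationGracePeriodSeconds: preStopSleepSeconds + shutdownBudgetSeconds,
    containers: [
      {
        name: 'app', // assumed container name
        image,
        lifecycle: {
          preStop: {
            // Runs before SIGTERM is delivered to the container.
            exec: { command: ['sleep', String(preStopSleepSeconds)] },
          },
        },
      },
    ],
  };
}

// Print the fragment so it can be merged into a Deployment's pod template.
console.log(
  JSON.stringify(
    buildPodSpecFragment({
      preStopSleepSeconds: 15, // assumed value for "X"
      shutdownBudgetSeconds: 30, // assumed time for the app to finish shutting down
      image: 'example/app:latest', // placeholder image
    }),
    null,
    2,
  ),
);
```

Because the preStop hook completes before the container receives SIGTERM, the app keeps serving traffic during the sleep and only starts its own shutdown afterwards.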

