Description
Startup/cleanup handlers are not resource-related, so they cannot rely on the Kubernetes watch-streams to be triggered. In addition, they should happen before & after the reactor actually starts/stops working with the resources.
This PR adds the @kopf.on.startup & @kopf.on.cleanup handlers.
The @kopf.on.startup handlers are executed on the operator startup. If one of the handlers fails, the operator does not start.
The @kopf.on.cleanup handlers are executed at the end, after a stop signal is received or a stop-flag is set. The cleanup handlers are not guaranteed to be executed at all -- e.g. if the process is SIGKILL'ed or crashes. In all normal shutdown sequences, they are attempted in full.
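A minimal sketch of how such handlers might look (the exact handler kwargs are an assumption, following the conventions of the existing resource handlers):

```python
import kopf

@kopf.on.startup()
def prepare(logger, **kwargs):
    # Runs before the watch-streams are started; if it raises, the operator does not start.
    logger.info("Preparing external connections before watching resources.")

@kopf.on.cleanup()
def release(logger, **kwargs):
    # Runs after a stop signal or a stop-flag; not guaranteed on SIGKILL or a crash.
    logger.info("Releasing external resources before exiting.")
```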
As a side goal, this PR introduces operator activities in general, of which startup/cleanup handlers are only specific cases. The activities will be used for (re-)authentication later.
See the list of commits for detailed step-by-step changes.
Types of Changes
New feature (non-breaking change which adds functionality)
Refactor/improvements
Review
List of tasks the reviewer must do to review the PR
It may be worth mentioning in the documentation that you have only terminationGracePeriodSeconds (by default 30) to perform cleanup before you get killed. So if cleanup is expected to take longer than that, users of this feature might want to change terminationGracePeriodSeconds to something a bit more generous.
dneuhaeuser-zalando Can you please remind me where terminationGracePeriodSeconds=30 comes from? There is no graceful termination timeout by default, as far as I remember (but I may be misremembering).
If the operator is terminated normally (SIGTERM, a stop-flag set, or just cancelled via operator_task.cancel()), it has unlimited time to finish, incl. the cleanup handlers.
There are a few exceptions with timeouts:
If the operator's task is cancelled during the shutdown, then it will exit nearly immediately — disgracefully.
Once the core tasks of the operator are finished (with unlimited timeout), the remaining tasks (probably spawned by the handlers, or the slow handlers themselves) will have 5 seconds before they are force-killed. But as long as the core tasks are running (incl. the cleanup activities), there is no time limit.
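For illustration, a hedged sketch of the embedded case mentioned above; treating kopf.operator() as a cancellable coroutine is an assumption here and may differ from the actual embedding API:

```python
import asyncio
import kopf

async def main():
    # Hypothetical embedding: run the operator as a task alongside the application.
    operator_task = asyncio.create_task(kopf.operator())
    try:
        await asyncio.sleep(60)  # ... the rest of the application ...
    finally:
        # Cancelling the task triggers the normal shutdown sequence,
        # including the cleanup handlers, with no internal time limit.
        operator_task.cancel()
        try:
            await operator_task
        except asyncio.CancelledError:
            pass

asyncio.run(main())
```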
Maybe I misunderstood something, but as far as I understand, a cleanup handler would be called if the operator pod gets a TERM signal.
When you get such a signal from Kubernetes, Kubernetes will wait for the termination grace period and then terminate forcefully with a KILL signal. See https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods
dneuhaeuser-zalando Ah, I see. You mean the external timeouts. Totally makes sense. I will add a note now.
dneuhaeuser-zalando Thanks for noticing that aspect. Notes are added for both startup & cleanup timeouts.