- Release Signoff Checklist
- Summary
- Motivation
- Proposal
- Design Details
- Production Readiness Review Questionnaire
- Implementation History
- Drawbacks
- Alternatives
- Infrastructure Needed (Optional)
Items marked with (R) are required prior to targeting to a milestone / release.
- (R) Enhancement issue in release milestone, which links to KEP dir in kubernetes/enhancements (not the initial KEP PR)
- (R) KEP approvers have approved the KEP status as
implementable
- (R) Design details are appropriately documented
- (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
- e2e Tests for all Beta API Operations (endpoints)
- (R) Ensure GA e2e tests meet requirements for Conformance Tests
- (R) Minimum Two Week Window for GA e2e tests to prove flake free
- (R) Graduation criteria is in place
- (R) all GA Endpoints must be hit by Conformance Tests
- (R) Production readiness review completed
- (R) Production readiness review approved
- "Implementation History" section is up-to-date for milestone
- User-facing documentation has been created in kubernetes/website, for publication to kubernetes.io
- Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
The sleep action for the PreStop container lifecycle hook was introduced in KEP 3960. It however doesn’t accept zero as a valid value for the sleep duration seconds. This KEP aims to add support for setting a value of zero with the sleep action of the PreStop hook.
Currently, trying to create a container with a PreStop lifecycle hook with sleep of 0 seconds will throw a validation error like so:
Invalid value: 0: must be greater than 0 and less than terminationGracePeriodSeconds (30)
The Sleep action is implemented with the time package from Go’s standard library. The time.After()
which is used to implement the sleep permits a zero sleep duration. A negative or a zero sleep duration will cause the function to return immediately and function like a no-op.
The implementation in KEP 3960 supports only non-zero values for the sleep duration. It is semantically correct to support a zero value for this field since time.After() also supports zero and negative durations. Negative values as well as zero have the same effect with time.After(), they both return immediately. We don’t need to support negative values since they have the same effect as setting the duration to zero.
A potential use case for this behaviour is when you need a PreStop hook to be defined for the validation of your resource, but don't really need to sleep as part of the PreStop hook. An example of this is described by a user here in the parent KEP. They add a PreStop sleep hook in via an admission webhoook by default if the PreStop is hook is not specified by the user. In order to opt-out from this, a no-op PreStop hook with a duration of zero seconds can be used.
- Update the validation for the Sleep action to allow zero as a valid sleep duration.
- Allow users to set a zero value for the sleep action in PreStop hooks to do a no-op.
- This KEP does not support adding negative values for the sleep duration.
- This KEP does not aim to provide a way to pause or delay pod termination indefinitely.
Introduce a PodLifecycleSleepActionAllowZero
feature gate which is disabled by default. When the feature gate is enabled, the validateSleepAction
method would allow values greater than or equal to zero as a valid sleep duration.
Since this update to the validation allows previously invalid values, care must be taken to support cluster downgrades safely. To accomplish this, the validation will distinguish between new resources and updates to existing resources:
- When the feature gate is disabled:
- (a) New resources will no longer allow setting zero as the sleep duration second for the PreStop hook. (no change to current validation)
- (b) Existing resources cannot be updated to have a sleep duration of zero seconds
- (c) Existing resources with a PreStop sleep duration set to zero will continue to run and use a sleep duration of zero seconds. These can be updated and the zero sleep duration would continue to work.
- When the feature gate is enabled:
- (c) New resources allow zero as a valid sleep duration.
- (d) Updates to existing resources will allow zero as a valid sleep duration.
The proposed change adds another layer to the validateSleepAction
function to allow zero as a valid sleep duration setting like shown:
-func validateSleepAction(sleep *core.SleepAction, gracePeriod *int64, fldPath *field.Path) field.ErrorList {
+func validateSleepAction(sleep *core.SleepAction, gracePeriod *int64, fldPath *field.Path, opts PodValidationOptions) field.ErrorList {
allErrors := field.ErrorList{}
// We allow gracePeriod to be nil here because the pod in which this SleepAction
// is defined might have an invalid grace period defined, and we don't want to
// flag another error here when the real problem will already be flagged.
- if gracePeriod != nil && sleep.Seconds <= 0 || sleep.Seconds > *gracePeriod {
- invalidStr := fmt.Sprintf("must be greater than 0 and less than terminationGracePeriodSeconds (%d)", *gracePeriod)
- allErrors = append(allErrors, field.Invalid(fldPath, sleep.Seconds, invalidStr))
+ if opts.AllowPodLifecycleSleepActionZeroValue {
+ if gracePeriod != nil && sleep.Seconds < 0 || sleep.Seconds > *gracePeriod {
+ invalidStr := fmt.Sprintf("must be non-negative and less than terminationGracePeriodSeconds (%d)", *gracePeriod)
+ allErrors = append(allErrors, field.Invalid(fldPath, sleep.Seconds, invalidStr))
+ }
+ } else {
+ if gracePeriod != nil && sleep.Seconds <= 0 || sleep.Seconds > *gracePeriod {
+ invalidStr := fmt.Sprintf("must be greater than 0 and less than terminationGracePeriodSeconds (%d). Please enable PodLifecycleSleepActionAllowZero feature gate if you need a sleep of zero duration.", *gracePeriod)
+ allErrors = append(allErrors, field.Invalid(fldPath, sleep.Seconds, invalidStr))
+ }
}
return allErrors
}
Currently, the kubelet accepts 0
as a valid duration. There is no validation done at the kubelet level. All the validation for the duration itself is done at the kube-apiserver. The runSleepHandler in the kubelet uses the time.After()
function from the time package, which supports a 0
duration input. time.After
also accepts negative values which are also returned immediately similar to zero. We don't support negative values however.
See the entire code changes in the WIP PR: kubernetes/kubernetes#127094
As a Kubernetes user, I want to to be able to have a PreStop hook defined in my spec without needing to sleep during the execution of the PreStop hook. This no-op behaviour can be used for validation purposes with admission webhooks (Reference).
The change is opt-in, since it requires configuring a PreStop hook with sleep action of 0 second duration. So there is no risk beyond the upgrade/downgrade risks which are addressed in the Proposal section.
Refer to the Proposal section.
[x] I/we understand the owners of the involved components may require updates to existing tests to make this code solid enough prior to committing the changes necessary to implement this enhancement.
Alpha:
- Test that the runSleepHandler function returns immediately when given a duration of zero.
- Test that the validation succeeds when given a zero duration with the feature gate enabled.
- Test that the validation fails when given a zero duration with the feature gate disabled.
- Test that the validation returns the appropriate error messages when given an invalid duration value (e.g., a negative value) with the feature gate disabled and enabled.
- Unit tests for testing the disabling of the feature gate after it was enabled and the feature was used.
Current coverages:
k8s.io/kubernetes/pkg/apis/core/validation
: 2024-09-20 - 84.3k8s.io/kubernetes/pkg/kubelet/lifecycle/handlers
: 2024-09-20 - 86.4
N/A
Basic functionality
- Create a simple pod with a container that runs a long-running process.
- Add a preStop hook to the container configuration, using the new sleepAction with a sleep duration of
0
. - Delete the pod and observe the time it takes for the container to terminate.
- Verify that the container terminates immediately without sleeping.
- Feature implemented behind a feature flag
- Initial unit/e2e tests completed and enabled
- Gather feedback from developers and surveys
- Additional e2e tests are completed (if needed)
- No trouble reports from alpha release
- No trouble reports with the beta release, plus some anecdotal evidence of it being used successfully.
The previous PreStop Sleep Action behavior will not be broken. Users can continue to use their hooks as it is. To use this enhancement, users need to enable the feature gate, and set the sleep duration as zero in their prestop hook’s sleep action.
If the kube-apiserver is downgraded to a version where the feature gate is not supported (<v1.32), no new resources can be created with a PreStop sleep duration of zero seconds. Existing resources created with a sleep duration of zero will continue to function.
If the feature gate is turned off after being enabled, no new resources can be created with PreStop sleep duration of zero seconds. Existing resources will continue to run and use a sleep duration of zero seconds. These resources can be updated and the zero sleep duration would continue to work.
Only the kube-apiserver will need to enable the feature gate for the full featureset to be present. This is because the implementation is already handled in the parent KEP #3960. The change introduced in this KEP is only to how the validation is done. If the feature gate is disabled, the feature will not be available. The feature gate does not apply to the kubelet logic since the time.After function used by the original KEP already supports zero as a valid duration.
- Feature gate (also fill in values in
kep.yaml
)- Feature gate name: PodLifecycleSleepActionAllowZero
- Components depending on the feature gate: kube-apiserver
- Other
- Describe the mechanism:
- Will enabling / disabling the feature require downtime of the control plane?
- Will enabling / disabling the feature require downtime or reprovisioning of a node?
No
The feature can be disabled in Alpha and Beta versions by restarting kube-apiserver with the feature-gate off. In terms of Stable versions, users can choose to opt-out by not setting the sleep field.
New pods with sleep action in prestop sleep duration of zero seconds can be created.
For the parent KEP, unit tests for the switch
of the feature gate were added in pkg/registry/core/pod/strategy_test
. We can add similar tests for the new feature gate as well.
The change is opt-in, it doesn't impact already running workloads.
I believe we don't need a metric here since the parent KEP already has a metric to inform rollbacks. This KEP only updates the validation to allow zero value.
This is an opt-in feature, and it does not change any default behavior. I manually tested enabling and disabling this feature by changing the kube-api-server config and restarting them in a kind cluster. The details of the expected behavior are described in the Proposal and Upgrade/Downgrade sections.
The manual test steps are as following:
- Create a local 1.32 k8s cluster with kind, and create a test-pod in that cluster.
- Enable PodLifecycleSleepActionAllowZero feature in the kube-apiserver and restart it.
- Add a prestop hook with sleep action with duration of zero seconds to the test-pod and delete it. Observe the time cost.
- Create another pod with sleep action duration of zero seconds.
- Disable PodLifecycleSleepActionAllowZero feature in the kube-api-server and restart it.
- Delete the pod created in step 4, and observe the time cost.
Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
No
Inspect the preStop hook configuration and also the feature gates
- Events
- Event Reason:
- API .status
- Condition name:
- Other field:
- Other (treat as last resort)
- Details: Check the logs of the container during termination, check the termination duration.
N/A
What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
- Metrics
- Metric name:
- [Optional] Aggregation method:
- Components exposing the metric:
- Other (treat as last resort)
- Details: Check the logs of the container during termination, check the termination duration.
Are there any missing metrics that would be useful to have to improve observability of this feature?
N/A
No
No
No
No
No
Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?
No
Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components?
No
Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?
No
N/A. This is a change to validation within the API server.
N/A
Disable PodLifecycleSleepActionAllowZero
feature gate, and restart the kube-apiserver.
N/A
Another way to run zero duration sleep in a container is to use the exec command in preStop hook like so ["/bin/sh","-c","sleep 0"]
. This requires a sleep binary in the image. Since the sleep action already exists as a PreStop hook, it is easier to allow a duration of zero seconds for the sleep action.
N/A