Skip to content

Commit

Permalink
Update PRR requirements, respond to more feedback
Browse files Browse the repository at this point in the history
  • Loading branch information
ehashman committed Jan 27, 2021
1 parent 2944ab9 commit d8496c0
Show file tree
Hide file tree
Showing 3 changed files with 31 additions and 11 deletions.
3 changes: 3 additions & 0 deletions keps/prod-readiness/sig-node/2238.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
kep-number: 2238
alpha:
approver: "@johnbelamaric"
28 changes: 18 additions & 10 deletions keps/sig-node/2238-liveness-probe-grace-period/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -56,10 +56,11 @@ normal shutdown and when probes fail. Hence, if a long termination period is
set, and a liveness probe fails, a workload will not be promptly restarted
because it will wait for the full termination period.

I propose adding a new field to liveness probes, `gracePeriodSeconds`. When
set, it will override the `terminationGracePeriodSeconds` for liveness
termination. This maintains the current behaviour if desired while providing
configuration to address this unintended behaviour.
I propose adding a new field to probes, `probe.terminationGracePeriodSeconds`.
When set, it will override the `terminationGracePeriodSeconds` for liveness or
startup termination, and will be ignored for readiness probes. This maintains
the current behaviour if desired while providing configuration to address this
unintended behaviour.

## Motivation

Expand Down Expand Up @@ -182,8 +183,9 @@ probes should be tested for correct behaviour.

E2E integration tests will verify the correct behaviour for the current broken
use case, where a `terminationGracePeriodSeconds` is set to a
large value, a probe fails, and with `livenessProbe.gracePeriodSeconds` set,
the container terminates quickly.
large value, a probe fails, and with
`livenessProbe.terminationGracePeriodSeconds` set, the container terminates
quickly.

### Graduation Criteria

Expand Down Expand Up @@ -312,7 +314,7 @@ _This section must be completed when targeting alpha to a release._

* **How can this feature be enabled / disabled in a live cluster?**
- [X] Feature gate (also fill in values in `kep.yaml`)
- Feature gate name: ProbeTerminationGracePeriod
- Feature gate name: `ProbeTerminationGracePeriod`
- Components depending on the feature gate: Kubelet, API Server
- [ ] Other
- Describe the mechanism:
Expand Down Expand Up @@ -350,9 +352,9 @@ _This section must be completed when targeting alpha to a release._
with and without the feature, are necessary. At the very least, think about
conversion tests if API types are being modified.

Yes. We will add an e2e test to confirm the presence of the current bug.
Then, with the new API field added, we will add e2e tests to confirm that they
result in different behaviour.
We will manually test that if a resource is created with the new field, and
the feature flag is later disabled, that the field will no longer be used even
though it is already filled in.

### Rollout, Upgrade and Rollback Planning

Expand Down Expand Up @@ -560,6 +562,12 @@ immediately, or with some default grace period (@sjenning)
- CON: even if we use a feature flag for this, it will eventually become the
required behaviour

Introduce a similar field, but at the pod-level, next to the existing field.

- PRO: similar values are grouped together
- CON: logically, liveness probes operate on a per-container basis, and thus we
may need different thresholds for different containers

## Infrastructure Needed (Optional)

<!--
Expand Down
11 changes: 10 additions & 1 deletion keps/sig-node/2238-liveness-probe-grace-period/kep.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@ authors:
owning-sig: sig-node
participating-sigs:
- sig-node
status: provisional
status: implementable
creation-date: 2021-01-07
reviewers:
- "@matthyx"
Expand All @@ -30,3 +30,12 @@ milestone:
alpha: "v1.21"
beta: "v1.22"
stable: "v1.23"

# The following PRR answers are required at alpha release
# List the feature gate name and the components for which it must be enabled
feature-gates:
- name: ProbeTerminationGracePeriod
components:
- kube-apiserver
- kubelet
disable-supported: true

0 comments on commit d8496c0

Please sign in to comment.