@@ -236,7 +236,26 @@ To investigate the root cause of a `CrashLoopBackOff` issue, a user can:
236236 application code. Running this container image locally or in a development
237237 environment can help diagnose application specific issues.
238238
239- ### Container restart policy {#restart-policy}
239+ ### Container restarts {#restart-policy}
240+
241+ When a container in your Pod stops, or experiences failure, Kubernetes can restart it.
242+ A restart isn't always appropriate; for example,
243+ {{< glossary_tooltip text="init containers" term_id="init-container" >}} run only once,
244+ during Pod startup.
245+ <!-- TODO reword when ContainerRestartRules graduates -->
246+ You can configure restarts as a Pod-level policy that applies to all containers in the Pod, or with container-level configuration (for example: when you define a
247+ {{< glossary_tooltip text="sidecar container" term_id="sidecar-container" >}}).
248+
249+ #### Container restarts and resilience {#container-restart-resilience}
250+
251+ The Kubernetes project recommends following cloud-native principles, including resilient
252+ design that accounts for unannounced or arbitrary restarts. You can achieve this either
253+ by failing the Pod and relying on automatic
254+ [replacement](/docs/concepts/workloads/controllers/), or by designing for container-level resilience.
255+ Either approach helps to ensure that your overall workload remains available despite
256+ partial failure.
257+
258+ #### Pod-level container restart policy
240259
241260 The `spec` of a Pod has a `restartPolicy` field with possible values Always, OnFailure,
242261 and Never. The default value is Always.
@@ -262,6 +281,104 @@ problems, the kubelet resets the restart backoff timer for that container.
262281 [Sidecar containers and Pod lifecycle](/docs/concepts/workloads/pods/sidecar-containers/#sidecar-containers-and-pod-lifecycle)
263282 explains the behaviour of `init containers` when you specify the `restartPolicy` field on them.
264283
284+ #### Individual container restart policy and rules {#container-restart-rules}
285+
286+ {{< feature-state
287+ feature_gate_name="ContainerRestartRules" >}}
288+
289+ If your cluster has the feature gate `ContainerRestartRules` enabled, you can specify
290+ `restartPolicy` and `restartPolicyRules` on _individual containers_ to override the Pod
291+ restart policy. Container restart policy and rules apply to {{< glossary_tooltip text="app containers" term_id="app-container" >}}
292+ in the Pod and to regular [init containers](/docs/concepts/workloads/pods/init-containers/).
293+
294+ A Kubernetes-native [sidecar container](/docs/concepts/workloads/pods/sidecar-containers/)
295+ has its container-level `restartPolicy` set to `Always`, and does not support `restartPolicyRules`.
296+
297+ Container restarts follow the same exponential backoff as the Pod restart policy described above.
298+ Supported container restart policies:
299+
300+ * `Always`: Automatically restarts the container after any termination.
301+ * `OnFailure`: Only restarts the container if it exits with an error (non-zero exit status).
302+ * `Never`: Does not automatically restart the terminated container.
303+
304+ Additionally, _individual containers_ can specify `restartPolicyRules`. If the `restartPolicyRules`
305+ field is specified, then the container `restartPolicy` **must** also be specified. The `restartPolicyRules`
306+ field defines a list of rules to apply on container exit. Each rule consists of a condition
307+ and an action. The supported condition is `exitCodes`, which compares the exit code of the container
308+ with a list of given values. The supported action is `Restart`, which means the container will be
309+ restarted. The rules are evaluated in order. On the first match, the action is applied.
310+ If none of the rules’ conditions match, Kubernetes falls back to the container’s configured
311+ `restartPolicy`.
312+
313+ For example, the following Pod has an `OnFailure` restart policy and a `try-once-container` container that overrides it with `Never`. This allows
314+ the Pod to restart only certain containers:
315+
316+ ```yaml
317+ apiVersion: v1
318+ kind: Pod
319+ metadata:
320+   name: on-failure-pod
321+ spec:
322+   restartPolicy: OnFailure
323+   containers:
324+   - name: try-once-container    # This container will run only once because the restartPolicy is Never.
325+     image: docker.io/library/busybox:1.28
326+     command: ['sh', '-c', 'echo "Only running once" && sleep 10 && exit 1']
327+     restartPolicy: Never
328+   - name: on-failure-container  # This container will be restarted on failure.
329+     image: docker.io/library/busybox:1.28
330+     command: ['sh', '-c', 'echo "Keep restarting" && sleep 1800 && exit 1']
331+ ```
332+
333+ A Pod with an `Always` restart policy and an init container that executes only once. If the init
334+ container fails, the Pod fails. This allows the Pod to fail if the initialization fails,
335+ but to keep running once the initialization succeeds:
336+
337+ ```yaml
338+ apiVersion: v1
339+ kind: Pod
340+ metadata:
341+   name: fail-pod-if-init-fails
342+ spec:
343+   restartPolicy: Always
344+   initContainers:
345+   - name: init-once      # This init container will only try once. If it fails, the pod will fail.
346+     image: docker.io/library/busybox:1.28
347+     command: ['sh', '-c', 'echo "Failing initialization" && sleep 10 && exit 1']
348+     restartPolicy: Never
349+   containers:
350+   - name: main-container # This container will always be restarted once initialization succeeds.
351+     image: docker.io/library/busybox:1.28
352+     command: ['sh', '-c', 'sleep 1800 && exit 0']
353+ ```
354+
355+ A Pod with a `Never` restart policy and a container that restarts only on specific exit codes and ignores all others.
356+ This is useful to differentiate between restartable errors and non-restartable errors:
357+
358+ ```yaml
359+ apiVersion: v1
360+ kind: Pod
361+ metadata:
362+   name: restart-on-exit-codes
363+ spec:
364+   restartPolicy: Never
365+   containers:
366+   - name: restart-on-exit-codes
367+     image: docker.io/library/busybox:1.28
368+     command: ['sh', '-c', 'sleep 60 && exit 0']
369+     restartPolicy: Never     # Container restart policy must be specified if rules are specified
370+     restartPolicyRules:      # Only restart the container if it exits with code 42
371+     - action: Restart
372+       exitCodes:
373+         operator: In
374+         values: [42]
375+ ```
376+
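The evaluation order described above (rules first, in order, then the container's `restartPolicy`) can be modelled with a short sketch. This is an illustration only, not the kubelet's implementation: the names `ExitCodeRule`, `Container`, and `should_restart` are hypothetical, only the `In` operator from the example above is modelled, and the fallback to the Pod-level policy when a container sets no policy is an assumption drawn from the "override" wording in this section.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ExitCodeRule:
    """Hypothetical model of one restartPolicyRules entry (operator: In)."""
    values: List[int]
    action: str = "Restart"  # the only action described in this section


@dataclass
class Container:
    """Hypothetical model of a container's restart configuration."""
    name: str
    restart_policy: Optional[str] = None  # container-level restartPolicy, if set
    rules: List[ExitCodeRule] = field(default_factory=list)


def should_restart(container: Container, pod_restart_policy: str, exit_code: int) -> bool:
    """Decide whether a terminated container would be restarted (illustrative only)."""
    # 1. Rules are evaluated in order; the first matching rule decides.
    for rule in container.rules:
        if exit_code in rule.values:
            return rule.action == "Restart"
    # 2. No rule matched: fall back to the container's own policy,
    #    or to the Pod-level policy when the container sets none (assumption).
    policy = container.restart_policy or pod_restart_policy
    if policy == "Always":
        return True
    if policy == "OnFailure":
        return exit_code != 0
    return False  # "Never"


# Mirrors the restart-on-exit-codes example: only exit code 42 triggers a restart.
c = Container("restart-on-exit-codes", restart_policy="Never", rules=[ExitCodeRule([42])])
print(should_restart(c, "Never", 42))  # True
print(should_restart(c, "Never", 1))   # False
```

In this model, a container without rules behaves exactly as its effective restart policy dictates, which matches the first two YAML examples.
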
377+ Restart rules can be used for many more advanced lifecycle management scenarios. Note that restart rules
378+ are affected by the same inconsistencies as the regular restart policy. Kubelet restarts, container
379+ runtime garbage collection, and intermittent connectivity issues with the control plane may cause state
380+ loss, and containers may be re-run even when you expect a container not to be restarted.
381+
265382### Reduced container restart delay
266383
267384{{< feature-state