From a281c6871c63ce982f7cf46477439b27493c98f9 Mon Sep 17 00:00:00 2001
From: Michal Wozniak
Date: Fri, 20 Oct 2023 12:49:08 +0200
Subject: [PATCH 1/2] Add known failure mode for Server Side Apply

---
 .../555-server-side-apply/README.md | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/keps/sig-api-machinery/555-server-side-apply/README.md b/keps/sig-api-machinery/555-server-side-apply/README.md
index 1e5727c3c67..5a79d03c484 100644
--- a/keps/sig-api-machinery/555-server-side-apply/README.md
+++ b/keps/sig-api-machinery/555-server-side-apply/README.md
@@ -506,6 +506,7 @@ _This section must be completed when targeting beta graduation to a release._
 * **What are other known failure modes?**
   For each of them, fill in the following information by copying the below template:
+
+  - SSA updates fail for pods with duplicated env. names or container ports.
+    - Known bug at least since 1.26
+    - Bugs:
+      - [Pod container ports and env-vars listMapKeys != validation](https://github.com/kubernetes/kubernetes/issues/113482)
+      - [Pod Garbage collector fails to clean up PODs from nodes that are not running anymore](https://github.com/kubernetes/kubernetes/issues/118261) (fixed in [#121103](https://github.com/kubernetes/kubernetes/pull/121103))
+    - Detection: The SSA request fails with 500. The response message is similar to the following:
+      `'failed to create manager for existing fields: failed to convert new object (app-b/app-b-5894548cb-7tssd; /v1, Kind=Pod) to smd typed: .spec.containers[name="app-b"].ports: duplicate entries for key [containerPort=8082,protocol="TCP"]'`.
+    - Mitigations: Make sure pods with duplicated keys for env. variables or
+      container ports are not created. Also, update the existing pods to clean up
+      the problematic fields.
+    - Testing: [PodGC integration test](https://github.com/kubernetes/kubernetes/blob/7b9d244efd19f0d4cce4f46d1f34a6c7cff97b18/test/integration/podgc/podgc_test.go#L313)
+      reproduced the issue before withdrawing from SSA in PodGC in the [PR #121103](https://github.com/kubernetes/kubernetes/pull/121103).
 * **What steps should be taken if SLOs are not being met to determine the problem?**
   n/a

From 2c4e07399315dd4de96fe1b29ba109b1f085e8a8 Mon Sep 17 00:00:00 2001
From: Michal Wozniak
Date: Tue, 23 Jan 2024 12:21:57 +0100
Subject: [PATCH 2/2] add a note that it only refers to the status

---
 keps/sig-api-machinery/555-server-side-apply/README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/keps/sig-api-machinery/555-server-side-apply/README.md b/keps/sig-api-machinery/555-server-side-apply/README.md
index 5a79d03c484..571f7f7381c 100644
--- a/keps/sig-api-machinery/555-server-side-apply/README.md
+++ b/keps/sig-api-machinery/555-server-side-apply/README.md
@@ -517,8 +517,8 @@ _This section must be completed when targeting beta graduation to a release._
       Not required until feature graduated to beta.
     - Testing: Are there any tests for failure mode? Failure modes are tested exhaustively both as unit-tests and as integration tests.
   -->
-  - SSA updates fail for pods with duplicated env. names or container ports.
-    - Known bug at least since 1.26
+  - SSA status updates fail for pods with duplicated env. names or container ports.
+    - Known bug at least since 1.26, fixed in 1.29 with [Update sigs.k8s.io/structured-merge-diff to v4.4.0](https://github.com/kubernetes/kubernetes/pull/121575)
     - Bugs:
       - [Pod container ports and env-vars listMapKeys != validation](https://github.com/kubernetes/kubernetes/issues/113482)
       - [Pod Garbage collector fails to clean up PODs from nodes that are not running anymore](https://github.com/kubernetes/kubernetes/issues/118261) (fixed in [#121103](https://github.com/kubernetes/kubernetes/pull/121103))