[incubator/kafka] Fix initContainer failure which did not error #4400
Conversation
/assign @benjigoldberg
```diff
@@ -28,8 +28,10 @@ spec:
       - name: init-ext
         image: "{{ .Values.external.init.image }}:{{ .Values.external.init.imageTag }}"
         imagePullPolicy: "{{ .Values.external.init.imagePullPolicy }}"
-        args:
-        - -n ${POD_NAMESPACE} label pods ${POD_NAME} pod=${POD_NAME}
+        command:
```
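For context, the overall shape of the new init container is roughly the following. This is a hedged sketch of the intent rather than the chart's verbatim template; in particular, the downward-API `env` wiring is assumed, and the kubectl invocation is taken from the log output further down.

```yaml
initContainers:
  - name: init-ext
    image: "{{ .Values.external.init.image }}:{{ .Values.external.init.imageTag }}"
    imagePullPolicy: "{{ .Values.external.init.imagePullPolicy }}"
    # Assumed: POD_NAME and POD_NAMESPACE are injected via the downward API.
    env:
      - name: POD_NAME
        valueFrom:
          fieldRef:
            fieldPath: metadata.name
      - name: POD_NAMESPACE
        valueFrom:
          fieldRef:
            fieldPath: metadata.namespace
    # sh -euxc: exit on any failing command, treat unset variables as errors,
    # and trace each command before running it.
    command:
      - sh
      - -euxc
      - "kubectl label pods ${POD_NAME} --namespace ${POD_NAMESPACE} pod=${POD_NAME}"
```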
@josdotso can you go into more detail about why we need `sh` vs. just using the scratch image? I don't totally follow the problem.
Using the old from-scratch image did not label the pods at all, exited 0, and produced no init container log at all.

With `sh` in the mix, we can pass `-euxc`, which tells `sh` to exit on any command that returns non-zero, to error on any variables it cannot interpolate, and to print each shell command before it runs. stdout gives useful output anyway -- but only on success. Even if the scratch approach worked in most cases, it would never tell us which shell command actually invoked kubectl. So we need a shell here to properly debug the interpolation of the variables.

When the label is not present, the services created for external access end up pointing at dead endpoints, because their selector does not match unlabeled pods.
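For illustration only (not part of the chart), here is roughly how an unset variable behaves under those flags in a plain `sh`; the exact error wording and exit code vary by shell implementation:

```
# -x is what produces the "+ kubectl label ..." trace line in the log below;
# -u turns an unset variable into a hard error; -e aborts on any failing command.
$ sh -euxc 'kubectl label pods ${POD_NAME} pod=${POD_NAME}'
sh: POD_NAME: parameter not set
$ echo $?            # non-zero, so the init container now fails visibly
2
```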
Here's an example log output after this PR:
```
$ kubectl logs kafka-kafka-0 -c init-ext
+ kubectl label pods kafka-kafka-0 --namespace default pod=kafka-kafka-0
pod "kafka-kafka-0" labeled
```
Thanks for the explanation 👍 makes sense to me
/ok-to-test
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: benjigoldberg, josdotso

The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
Thanks @benjigoldberg and sorry for the mishap!
@josdotso np, it happens. Thanks for patching this!
* upstream/master: (944 commits)
  Rename service port to http (helm#4442)
  [stable/neo4j] Change the image of the initContainer examples (helm#4269)
  move burrow to stable repo (helm#3481)
  Upgrade kube-state-metrics to 1.2.0, add new collectors (helm#4146)
  Add review guidelines around pvcs (helm#4223)
  [stable/parse] Release 0.3.10 (helm#4389)
  [stable/phabricator] Release 0.5.19 (helm#4433)
  Support exposing jmx and additional ports (helm#4072)
  Add default of "" for string comparison (helm#4420)
  [incubator/kafka] Makes readiness probe configurable (helm#3948)
  Published stash chart 0.7.0-rc.1 (helm#4410)
  Enable testing charts with test values (helm#4157)
  [incubator/kafka] Fix initContainer failure which did not error (helm#4400)
  [stable/etcd-operator] deployment typos and add tolerations (helm#4139)
  Typo fix in coscale/README.md (helm#4306)
  Typo fix in concourse/README.md (helm#4303)
  Typo fix in cockroachdb/README.md (helm#4302)
  [stable/jenkins] Bump appVersion (helm#4177)
  Typo fix in cluster-autoscaler/README.md (helm#4301)
  [stable/traefik] Bump appVersion to 1.5.4 (helm#4206)
  ...
…#4400)

* [incubator/kafka] Fix initContainer failure which did not error
* [incubator/kafka] Set initContainer to fail when vars are undefined
…#4400)

* [incubator/kafka] Fix initContainer failure which did not error
* [incubator/kafka] Set initContainer to fail when vars are undefined

Signed-off-by: voron <av@arilot.com>
What this PR does / why we need it:
Over in #3754, I submitted a PR to allow optional external access to Kafka. During review, I changed from one kubectl image to another -- and apparently did not adequately vet the change.
A commenter on #3754, @piter42zx, describes a failure mode seen on minikube: #3754 (comment)
Special notes for your reviewer:
This PR fixes the issue identified above. It uses a kubectl image that is not based on Docker `scratch` and that brings `sh` with it, so we can be more paranoid about our init container's behavior. I've verified with the following override file that Kafka is exposed as expected by the commenter referenced above. To best validate that the external endpoint is available, run:
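(What follows is an illustrative sketch, not the exact command from the PR: the release name `kafka`, the namespace `default`, and the service naming are assumptions; the `pod=<pod name>` label key matches the log output earlier in the thread.)

```
# Confirm the init container applied the per-pod label that the external
# service selectors match on.
$ kubectl get pods --namespace default -l pod=kafka-kafka-0

# Confirm the external services have live endpoints; an empty ENDPOINTS column
# would mean the "dead endpoints" failure described above is still present.
$ kubectl get endpoints --namespace default | grep kafka
```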