-
Notifications
You must be signed in to change notification settings - Fork 14.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Document the workaround for kube-proxy's mishandled "hostname-override" flag #9878
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -46,24 +46,23 @@ If you notice that `kubeadm init` hangs after printing out the following line: | |
|
||
This may be caused by a number of problems. The most common are: | ||
|
||
- network connection problems. Check that your machine has full network connectivity before continuing. | ||
- the default cgroup driver configuration for the kubelet differs from that used by Docker. | ||
- Network connection problems. | ||
Check that your machine has full network connectivity before continuing. | ||
- The default cgroup driver configuration for the kubelet differs from that used by Docker. | ||
Check the system log file (e.g. `/var/log/message`) or examine the output from `journalctl -u kubelet`. If you see something like the following: | ||
|
||
```shell | ||
```sh | ||
error: failed to run Kubelet: failed to create kubelet: | ||
misconfiguration: kubelet cgroup driver: "systemd" is different from docker cgroup driver: "cgroupfs" | ||
``` | ||
|
||
There are two common ways to fix the cgroup driver problem: | ||
1. Install docker again following instructions | ||
|
||
1. Install Docker again following instructions | ||
[here](/docs/setup/independent/install-kubeadm/#installing-docker). | ||
1. Change the kubelet config to match the Docker cgroup driver manually, you can refer to | ||
[Configure cgroup driver used by kubelet on Master Node](/docs/setup/independent/install-kubeadm/#configure-cgroup-driver-used-by-kubelet-on-master-node) | ||
for detailed instructions. | ||
1. Change the kubelet config to match the Docker cgroup driver manually. | ||
You can refer to [Configure cgroup driver used by kubelet on Master Node](/docs/setup/independent/install-kubeadm/#configure-cgroup-driver-used-by-kubelet-on-master-node) for detailed instructions. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Change to:
So:
|
||
|
||
- control plane Docker containers are crashlooping or hanging. You can check this by running `docker ps` and investigating each container by running `docker logs`. | ||
- Control plane Docker containers are crashlooping or hanging. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Change to:
|
||
You can check this by running `docker ps` and investigating each container by running `docker logs`. | ||
|
||
## kubeadm blocks when removing managed containers | ||
|
||
|
@@ -201,41 +200,65 @@ Error from server: Get https://10.19.0.41:10250/containerLogs/default/mysql-ddc6 | |
|
||
## Services with externalTrafficPolicy=Local are not reachable | ||
|
||
On nodes where the hostname for the kubelet is overridden using the `--hostname-override` option, kube-proxy will default to treating 127.0.0.1 as the node IP, which results in rejecting connections for Services configured for `externalTrafficPolicy=Local`. This situation can be verified by checking the output of `kubectl -n kube-system logs <kube-proxy pod name>`: | ||
On machines where the node name for the kubelet is overridden using the `--hostname-override` option, such that the node name no longer matches the machine's hostname, kube-proxy will default to treating 127.0.0.1 as the node's IP address, which results in rejecting connections for Services configured for `externalTrafficPolicy=Local`. This situation can be verified by checking the output of `kubectl -n kube-system logs <kube-proxy pod name>`: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Replace with:
So:
|
||
|
||
```sh | ||
W0507 22:33:10.372369 1 server.go:586] Failed to retrieve node info: nodes "ip-10-0-23-78" not found | ||
W0507 22:33:10.372474 1 proxier.go:463] invalid nodeIP, initializing kube-proxy with 127.0.0.1 as nodeIP | ||
``` | ||
|
||
A workaround for this is to modify the kube-proxy DaemonSet in the following way: | ||
Just as the kubelet accepts a `--hostname-override` flag to use a different name for the node, so too does kube-proxy. Unfortunately, though, due to an unresolved defect, kube-proxy does not honor that flag's value when used in combination with the `--config` flag specifying a configuration file. Hence we can't use kube-proxy's `--hostname-override` flag to tell it the proper name of our node. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Replace with:
So:
|
||
|
||
```sh | ||
kubectl -n kube-system patch --type json daemonset kube-proxy -p "$(cat <<'EOF' | ||
[ | ||
{ | ||
"op": "add", | ||
"path": "/spec/template/spec/containers/0/env", | ||
"value": [ | ||
{ | ||
"name": "NODE_NAME", | ||
"valueFrom": { | ||
"fieldRef": { | ||
"apiVersion": "v1", | ||
"fieldPath": "spec.nodeName" | ||
} | ||
} | ||
} | ||
] | ||
}, | ||
{ | ||
"op": "add", | ||
"path": "/spec/template/spec/containers/0/command/-", | ||
"value": "--hostname-override=${NODE_NAME}" | ||
} | ||
] | ||
EOF | ||
)" | ||
To work around this problem, modify the kube-proxy DaemonSet as follows. The idea is to customize the common _config.conf_ file kube-proxy mounts from the "kube-proxy" ConfigMap, so that each machine uses its own copy of that file with the "hostnameOverride" set to the desired node name. We store our customized copy in an `emptyDir` volume; the kube-proxy container then reads from there, instead of reading the original file mounted from the ConfigMap. | ||
|
||
We use an idempotent strategic merge patch here, so that applying the same patch more than once is harmless. Still, though, we have to deal with `kubectl patch` returning a failing exit code when it determines that no change is necessary. | ||
|
||
```sh | ||
suspected='^daemonset\.(extensions|apps)/kube-proxy not patched$' | ||
patch_output=$( \ | ||
kubectl --namespace=kube-system \ | ||
patch daemonset kube-proxy \ | ||
--patch=' | ||
spec: | ||
template: | ||
spec: | ||
volumes: | ||
- name: shared-data | ||
emptyDir: {} | ||
initContainers: | ||
- name: update-config-file | ||
image: busybox | ||
imagePullPolicy: IfNotPresent | ||
command: | ||
- sh | ||
- -c | ||
- "/bin/sed \"s/hostnameOverride: \\\"\\\"/hostnameOverride: $(NODE_NAME)/\" /var/lib/kube-proxy/config.conf > /shared-data/config.conf" | ||
env: | ||
- name: NODE_NAME | ||
valueFrom: | ||
fieldRef: | ||
apiVersion: v1 | ||
fieldPath: spec.nodeName | ||
volumeMounts: | ||
- name: kube-proxy | ||
mountPath: /var/lib/kube-proxy | ||
readOnly: true | ||
- name: shared-data | ||
mountPath: /shared-data | ||
containers: | ||
- name: kube-proxy | ||
command: | ||
- /usr/local/bin/kube-proxy | ||
- --config=/shared-data/config.conf | ||
volumeMounts: | ||
- name: shared-data | ||
mountPath: /shared-data | ||
readOnly: true') | ||
if [ $? != 0 ] && [[ ! "${patch_output}" =~ ${suspected} ]]; then | ||
code=$? | ||
echo "Failed to patch kube-proxy DaemonSet: ${patch_output}" 1>&2 | ||
exit ${code} | ||
fi | ||
``` | ||
{{% /capture %}} | ||
Note the use of the _busybox_ container image in the init container. If you wish to avoid Docker needing to pull that image on each machine that runs the kube-proxy pod, consider pulling the _busybox_ image ahead of time, such as when building the machine image. That way, these machines are born with _busybox_ already available. | ||
{{% /capture %}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change this paragraph to:
So:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a fine suggestion, here and in what follows, but note that I was just trying to adjust the formatting and capitalization here. This is the unfortunate phenomenon of drawing focus only to what one touches, at the expense of all the other parts that were ignored.
Still, I appreciate that you did think through your suggestions, so I'll take them into account when I can make another editing pass here.