[Roadmap] Support node specific configuration #2367

Open
fabriziopandini opened this issue Dec 18, 2020 · 11 comments
Labels: kind/api-change, kind/feature, lifecycle/frozen

Comments

@fabriziopandini
Member

Kubeadm today does not provide first-class support for node-specific configuration.

There are ways to achieve this, like using patches, or, for the kubelet only, by providing node-specific extra args, but in the end the UX is not ideal and we are not covering all the use cases.

As a consequence, I'm proposing that the discussions for the next kubeadm roadmap also include a definition of proper support for node-specific configuration in the kubeadm API, possibly in line with the outcome of the ongoing discussion in kubernetes/enhancements#1439.

This should cover the core components managed by kubeadm (API server, controller-manager, scheduler, etcd, kubelet) but also addons where this makes sense.
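
As a concrete picture of the patches workaround referenced above (a sketch only; the directory path, file name, and field values are illustrative assumptions, not taken from this issue): kubeadm accepts a directory of patch files, named after the target component they modify, via --patches or the patches.directory field of the config API.

# /etc/kubernetes/patches/kubeletconfiguration+strategic.yaml
# File name follows kubeadm's "target[suffix][+patchtype]" convention; the target
# here is "kubeletconfiguration" and "strategic" is the default patch type.
# Node-specific kubelet overrides; values are examples only.
maxPods: 200
resolvConf: /run/systemd/resolve/resolv.conf

The directory then has to be passed on every relevant command, e.g. kubeadm join --patches /etc/kubernetes/patches and again kubeadm upgrade node --patches /etc/kubernetes/patches, which is the repetition discussed later in this thread.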

@neolit123 neolit123 added this to the v1.21 milestone Jan 8, 2021
@neolit123 neolit123 added the kind/feature Categorizes issue or PR as related to a new feature. label Jan 8, 2021
@neolit123 neolit123 modified the milestones: v1.21, Next Feb 3, 2021
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 4, 2021
@fabriziopandini
Member Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 5, 2021
@k8s-triage-robot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 3, 2021
@fabriziopandini
Member Author

/lifecycle frozen

@k8s-ci-robot k8s-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Aug 5, 2021
@neolit123 neolit123 added the kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API label Nov 8, 2023
@neolit123
Member

We are already at v1beta4, and I just don't think v1 will have instance-specific configuration that introduces new structures like a NodeConfiguration stored in a ConfigMap for a given node and managed by kubeadm.

We actually also haven't seen demand for that from users. It seems they are fine with the patches and with keeping the Init/JoinConfiguration YAML around. The only difficult aspect has been the kubelet instance configuration, and we are finally adding that with

Maybe we can consider this for a vNext (e.g. v2). /shrug

@mohag

mohag commented Nov 29, 2024

My problem with the current patch support is that the parameter needs to be passed again on upgrades... Keeping the path to the patches in the kubeadm-config ConfigMap or in an annotation on the node would help... (An annotation might make the most sense...)

We often have the problem that people forget to apply the extra parameter when upgrading...

Edit: I'm referring to the kubelet config patches above; they are the most commonly used ones.

I also see that the setting can now be given in the InitConfiguration and JoinConfiguration, which I can't remember being the case previously... (I know that there were warnings when passing --config to kubeadm upgrade last time I did that...) (I see the --patches parameter seems to be needed for upgrades instead of the config, looking at the note in the docs)

The non-kubelet patches might not have such an obvious place to save the path...

I see the note about the kubeletExtraArgs setting (we were mostly messing with /etc/default/kubelet instead, but often had to edit the kubeadm-generated file in the case of upgrades (migrating from Docker to containerd, removing deprecated/removed flags), which is also not ideal...) (Using Ansible for much of it)
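
For reference, the Init/JoinConfiguration field mentioned above looks roughly like this in the v1beta3 config API (a sketch; the node name and directory path are illustrative, and the usual discovery fields are omitted):

apiVersion: kubeadm.k8s.io/v1beta3
kind: JoinConfiguration
nodeRegistration:
  name: worker-1                         # illustrative node name
patches:
  directory: /etc/kubernetes/patches     # every patch file in this directory is applied on this node
# (the discovery/bootstrap-token fields required for a real join are omitted from this sketch)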

@neolit123
Member

neolit123 commented Nov 29, 2024

My problem with the current patch support is that the parameter needs to be passed again on upgrades... Keeping the path to the patches in the kubeadm-config ConfigMap or in an annotation on the node would help... (An annotation might make the most sense...)

One benefit of the patches approach is that it gives the user control to not apply a certain patch (node config) to components when it is not needed on upgrade to the next version. If during init/join a certain node config or the patches for it are written to the cluster, that node config becomes pinned, and to fix that the user must go and edit a ConfigMap before upgrade if it is no longer needed. To me it seems the UX is not necessarily better for this.

We often have the problem that people forget to apply the extra parameter when upgrading...

That is a user mistake, I would say.
v1beta4 includes an UpgradeConfiguration and the "patches" field, so that will make the story a bit better - e.g. all kubeadm commands should accept --config and you can have the patches field in there.
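
A rough sketch of what that could look like, assuming (as in the v1beta4 documentation) that the patches field sits under the per-command sections such as node; the path is illustrative:

# passed via e.g. "kubeadm upgrade node --config upgrade-config.yaml"
apiVersion: kubeadm.k8s.io/v1beta4
kind: UpgradeConfiguration
node:
  patches:
    directory: /etc/kubernetes/patches   # same directory that was used at init/join time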

I see the note about the kubeletExtraArgs setting (we were mostly messing with /etc/default/kubelet instead, but often had to edit the kubeadm-generated file in the case of upgrades (migrating from Docker to containerd, removing deprecated/removed flags), which is also not ideal...) (Using Ansible for much of it)

Ideally all user-provided kubelet flags should be gone (as they are deprecated in the kubelet), but we will keep kubeletExtraArgs, of course.
We added #3042 for that purpose; the first flag to migrate is the CRI socket.
But this effort will take a while.

@mohag

mohag commented Nov 29, 2024

My problem with the current patch support is that the parameter needs to be passed again on upgrades... Keeping the path to the patches in the kubeadm-config ConfigMap or in an annotation on the node would help... (An annotation might make the most sense...)

One benefit of the patches approach is that it gives the user control to not apply a certain patch (node config) to components when it is not needed on upgrade to the next version. If during init/join a certain node config or the patches for it are written to the cluster, that node config becomes pinned, and to fix that the user must go and edit a ConfigMap before upgrade if it is no longer needed. To me it seems the UX is not necessarily better for this.

The settings that we have in the patches seem unlikely to change significantly between versions: cgroupDriver (I think all have been migrated to systemd by now at least...), maxPods, and resolvConf (we have had clusters with a mix of node operating systems, mostly for testing, and then that setting needs to be set per node). (criSocket would also make sense, but there is the kubeadm.alpha.kubernetes.io/cri-socket annotation for that as well, and the interaction between the two methods is unclear.) (We currently still use kubelet parameters for the socket and then set the annotation to match.)

(The non-kubelet configs are another story (I think you replied to the pre-edit version of my comment, before I remembered that patches are supported for more than just the kubelet (we only use it for the kubelet)).) (Kubelets have a 1:1 mapping to Node objects; the control-plane components do not necessarily (I do not think kubeadm supports control-plane nodes without a kubelet, though).)

I do not think that keeping the contents of the patch on the cluster makes sense, but the path to it might (the UpgradeConfiguration would help as well though)

We often have the problem that people forget to apply the extra parameter when upgrading...

That is a user mistake, I would say. v1beta4 includes an UpgradeConfiguration and the "patches" field, so that will make the story a bit better - e.g. all kubeadm commands should accept --config and you can have the patches field in there.

I see the note about the kubeletExtraArgs setting (we were mostly messing with /etc/default/kubelet instead, but often had to edit the kubeadm-generated file in the case of upgrades (migrating from Docker to containerd, removing deprecated/removed flags), which is also not ideal...) (Using Ansible for much of it)

Ideally all user-provided kubelet flags should be gone (as they are deprecated in the kubelet), but we will keep kubeletExtraArgs, of course. We added #3042 for that purpose; the first flag to migrate is the CRI socket. But this effort will take a while.

Currently, the kubelet flag that we go to a lot of effort to add is --allowed-unsafe-sysctls on EKS nodes (not relevant for kubeadm), through a cloud-init script. (Managed providers often give very little control over the kubelet configuration (which is often generated by a script running after cloud-init), so some way to keep such an option would be useful - it might even be a flag giving a path to a patch that the kubelet applies to its own config for cases like this.)

In the kubeadm cases, the playbook mainly removes the settings added previously (some of these clusters have been upgraded from ~1.15, so the parameters changed quite a bit over time) (I would prefer it if that config were updated by kubeadm on upgrades) (The runtime settings are updated elsewhere if the runtime changes)

- name: Config cleanups for newer version
  block:
  - name: remove resolv-conf parameter (it is set from the config instead)
    ansible.builtin.replace:
      path: /var/lib/kubelet/kubeadm-flags.env
      regexp: ' ?--resolv-conf=/[^" ]*'
      replace: ''
    notify: restart kubelet
  - name: remove pod-infra-container-image parameter (If set to the default value)
    ansible.builtin.replace:
      path: /var/lib/kubelet/kubeadm-flags.env
      regexp: ' ?--pod-infra-container-image=(k8s.gcr.io|registry.k8s.io)/pause:[^ "]*'
      replace: ''
    notify: restart kubelet
  - name: remove cgroup-driver parameter (it is set from the config instead)
    ansible.builtin.replace:
      path: /var/lib/kubelet/kubeadm-flags.env
      regexp: ' ?--cgroup-driver=\w+'
      replace: ''
    notify: restart kubelet

@neolit123
Member

neolit123 commented Nov 29, 2024

(criSocket would also make sense, but there is the kubeadm.alpha.kubernetes.io/cri-socket annotation for that as well, and the interaction between the two methods is unclear.) (We currently still use kubelet parameters for the socket and then set the annotation to match.)

The link in my previous post is the feature which adds a feature gate to not use the annotation and instead use a /var/lib/kubelet/instance-config.yaml (which is technically a patch).

I do not think that keeping the contents of the patch on the cluster makes sense, but the path to it might (the UpgradeConfiguration would help as well though)

I think it does help, as it adds consistency with --config.

Currently, the kubelet flag that we go to a lot of effort to add is --allowed-unsafe-sysctls on EKS nodes (not relevant for kubeadm), through a cloud-init script. (Managed providers often give very little control over the kubelet configuration (which is often generated by a script running after cloud-init), so some way to keep such an option would be useful - it might even be a flag giving a path to a patch that the kubelet applies to its own config for cases like this.)

The kubelet exposes it as a field as well:
https://github.com/kubernetes/kubernetes/blob/810e9e212ec5372d16b655f57b9231d8654a2179/staging/src/k8s.io/kubelet/config/v1beta1/types.go#L699
so you can use a kubeadm patch with a kubeletconfiguration target instead of passing kubeletExtraArgs for the nodes that need it.
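
As a hypothetical example of that suggestion, a kubeletconfiguration patch file could set the field directly instead of relying on the flag (the sysctl names below are placeholders, not taken from this issue):

# /etc/kubernetes/patches/kubeletconfiguration+strategic.yaml
# Sets the KubeletConfiguration field backing --allowed-unsafe-sysctls on this node only.
allowedUnsafeSysctls:
  - "net.core.somaxconn"             # placeholder sysctls; list whatever the workload actually needs
  - "net.ipv4.tcp_keepalive_time"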

In the kubeadm cases, the playbook mainly removes the settings added previously (some of these clusters have been upgraded from ~1.15, so the parameters changed quite a bit over time) (I would prefer it if that config were updated by kubeadm on upgrades) (The runtime settings are updated elsewhere if the runtime changes)

We do perform cleanup, but only for the flags kubeadm maintains (i.e. not for --allowed-unsafe-sysctls). If we missed doing cleanup on a managed flag that is no longer needed, that's a kubeadm bug.

@mohag

mohag commented Dec 3, 2024

The link in my previous post is the feature which adds a feature gate to not use the annotation and instead use a /var/lib/kubelet/instance-config.yaml (which is technically a patch).

That config, if enabled by default or by a command-line parameter, would work for our EKS issue as well... (where kubeadm is not involved and they generate the config from their node bootstrap script (called from the end of a cloud-init script), so changing settings in the config is hard there)

I think it does help, as it adds consistency with --config.

Correct 🎉

Currently, the kubelet flag that we go to a lot of effort to add is --allowed-unsafe-sysctls on EKS nodes (not relevant for kubeadm), through a cloud-init script. (Managed providers often give very little control over the kubelet configuration (which is often generated by a script running after cloud-init), so some way to keep such an option would be useful - it might even be a flag giving a path to a patch that the kubelet applies to its own config for cases like this.)

The kubelet exposes it as a field as well: https://github.com/kubernetes/kubernetes/blob/810e9e212ec5372d16b655f57b9231d8654a2179/staging/src/k8s.io/kubelet/config/v1beta1/types.go#L699 so you can use a kubeadm patch with a kubeletconfiguration target instead of passing kubeletExtraArgs for the nodes that need it.

I do know it is in the config, but we can't control the config that AWS generates with EKS... (for the kubeadm cluster that would be useful) (we currently only set that for a specific customer's application where we need to tweak network buffers and max connections) (whether the items are actually safe and missing from the default allowlist might be the real question)

It might be possible to add an instance-config, but AWS might use that for their own purposes... (an instance-config.d directory might be another way...)

We do perform cleanup, but only for the flags kubeadm maintains (i.e. not for --allowed-unsafe-sysctls). If we missed doing cleanup on a managed flag that is no longer needed, that's a kubeadm bug.

With kubeadm, those tend to go into /etc/default/kubelet. (It is a lot easier if just one tool messes with a config.) (We don't currently have extra settings on any of the kubeadm-deployed clusters.)

I suspect that in our case, we mainly messed with it when switching existing servers from Docker to containerd. (I'm not sure if setting the annotation / patches and running kubeadm upgrade node might have done that as well (or at least moved the setting to the config))

@neolit123
Member

neolit123 commented Dec 3, 2024

The link in my previous post is the feature which adds a feature gate to not use the annotation and instead use a /var/lib/kubelet/instance-config.yaml (which is technically a patch).

That config, if enabled by default or by a command-line parameter, would work for our EKS issue as well... (where kubeadm is not involved and they generate the config from their node bootstrap script (called from the end of a cloud-init script), so changing settings in the config is hard there)

Note: the kubelet has another way of doing instance config without kubeadm:
https://github.com/kubernetes/enhancements/blob/3911f017579157f35a297c58b66063e1216ba487/keps/sig-node/3983-drop-in-configuration/README.md?plain=1
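
That KEP gives the kubelet a --config-dir flag for drop-in configuration fragments, so an override can live next to the main config without kubeadm (or a managed provider's bootstrap script) having to regenerate it. A minimal sketch, assuming a drop-in directory of /etc/kubernetes/kubelet.conf.d and illustrative values:

# /etc/kubernetes/kubelet.conf.d/90-node-overrides.conf
# Drop-in files are merged over the main kubelet config in lexical order;
# each file is a (partial) KubeletConfiguration document.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
maxPods: 200
allowedUnsafeSysctls:
  - "net.core.somaxconn"   # placeholder value

The kubelet is then started with --config-dir /etc/kubernetes/kubelet.conf.d in addition to --config.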

We do perform cleanup, but only for the flags kubeadm maintains (i.e. not for --allowed-unsafe-sysctls). If we missed doing cleanup on a managed flag that is no longer needed, that's a kubeadm bug.

With kubeadm, those tend to go into /etc/default/kubelet. (It is a lot easier if just one tool messes with a config.) (We don't currently have extra settings on any of the kubeadm-deployed clusters.)

kubeadm is only going to clean up kubeadm-flags.env, as /etc/default/kubelet is user-owned.

I suspect that in our case, we mainly messed with it when switching existing servers from Docker to containerd. (I'm not sure if setting the annotation / patches and running kubeadm upgrade node might have done that as well (or at least moved the setting to the config))

The migration guide had instructions to modify the annotation and kubeadm-flags.env, FWIW:
https://kubernetes.io/docs/tasks/administer-cluster/migrating-from-dockershim/migrate-dockershim-dockerd/
