Kubeadm: Remove ClusterStatus from kubeadm-config #1380

fabriziopandini · 2019-11-24T14:42:23Z

This KEP proposes a new way of tracking the list of API endpoints in a cluster. This allows removing the ClusterStatus entry from the kubeadm-config ConfigMap and solves the problems that arise when, for any reason, that entry no longer reflects the real status of the cluster.

/sig cluster-lifecycle
/assign @neolit123 @rosti @ereslibre @ncdc @timothysc

keps/sig-cluster-lifecycle/kubeadm/20191125-remove-clusterstatus-from-kubeadm-config.md

ereslibre

I like removing shared state.

I guess one of the motivations for this KEP is to improve the case where multiple control plane nodes join at once. Is this correct? Removing shared state is nice, but I think it makes it harder to block on a known shared resource (as we could with a shared ConfigMap and leader election).

Here, when a control plane node joins, it will have to:

1. List nodes, retrieve annotations
2. Perform the etcd join with the list of current endpoints
3. Let the kubelet register with TLS bootstrap
4. Annotate node

I see potential race conditions if several control planes are joining at the same time, and I believe it's harder to sync on several annotations on different resources (nodes), than to sync on a shared config map. What do you think about this? Should it be mentioned under risks?

ereslibre · 2019-11-25T09:59:56Z

keps/sig-cluster-lifecycle/kubeadm/20191125-remove-clusterstatus-from-kubeadm-config.md

+
+The `LocalAPIEndpoint` is also used in the stacked `etcd` pod manifest for composing the `peer-urls` and the `client-urls`; the latter is used by kubeadm when accessing etcd in an existing cluster, e.g. when doing `join --control-plane`.
+
+We are going to echo the `client-urls` value into a new annotation named `kubeadm.kubernetes.io/etcd.advertise-client-urls`. Once the annotation will be in place, it will be possible to easily retrieve the etcd client urls by querying the `etcd` pods.


Currently, we are composing this information with:

func GetClientURL(localEndpoint *kubeadmapi.APIEndpoint) string {
	return "https://" + net.JoinHostPort(localEndpoint.AdvertiseAddress, strconv.Itoa(constants.EtcdListenClientPort))
}

So today we set etcd's advertise-client-urls by reusing the current local endpoint's advertise address and forcing the port to the etcd client listen port.

What's the goal of this new annotation as opposed to kubeadm.kubernetes.io/kube-apiserver.advertise-address? Couldn't we theoretically compose the advertise-client-urls based on kubeadm.kubernetes.io/kube-apiserver.advertise-address?

The goal is to build the list of etcd peers by looking at the current etcd pods (instead of looking at the kube-apiserver pods, which can be somewhat confusing and potentially error-prone).

This is especially helpful in the current join --control-plane sequence, because the kube-apiserver and the etcd static pods are created at two separate moments.
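
To make the two options discussed in this thread concrete, here is a minimal Go sketch, not kubeadm's actual code. It shows (a) composing a client URL from an advertise address, as GetClientURL does today, and (b) collecting endpoints from the proposed annotation on the etcd pods. The `pod` struct, the function names, and the handling of a comma-separated annotation value are assumptions made for illustration; the port 2379 is hard-coded in place of constants.EtcdListenClientPort to keep the example self-contained.

```go
package main

import (
	"fmt"
	"net"
	"sort"
	"strconv"
	"strings"
)

const (
	// Annotation name proposed in the KEP.
	etcdAnnotation = "kubeadm.kubernetes.io/etcd.advertise-client-urls"
	// Stand-in for constants.EtcdListenClientPort (assumed to be 2379).
	etcdListenClientPort = 2379
)

// pod is a hypothetical, stripped-down stand-in for the Kubernetes Pod type.
type pod struct {
	Name        string
	Annotations map[string]string
}

// clientURLFromAddress mirrors the current approach: reuse the advertise
// address and force the etcd client port.
func clientURLFromAddress(advertiseAddress string) string {
	return "https://" + net.JoinHostPort(advertiseAddress, strconv.Itoa(etcdListenClientPort))
}

// etcdEndpointsFromPods gathers the client URLs published through the
// annotation, skipping pods without it and removing duplicates.
// Treating the value as a comma-separated list is an assumption.
func etcdEndpointsFromPods(pods []pod) []string {
	seen := map[string]bool{}
	var endpoints []string
	for _, p := range pods {
		raw, ok := p.Annotations[etcdAnnotation]
		if !ok {
			continue
		}
		for _, u := range strings.Split(raw, ",") {
			u = strings.TrimSpace(u)
			if u == "" || seen[u] {
				continue
			}
			seen[u] = true
			endpoints = append(endpoints, u)
		}
	}
	sort.Strings(endpoints) // deterministic order for callers
	return endpoints
}

func main() {
	pods := []pod{
		{Name: "etcd-cp1", Annotations: map[string]string{etcdAnnotation: clientURLFromAddress("10.0.0.1")}},
		{Name: "etcd-cp2", Annotations: map[string]string{etcdAnnotation: clientURLFromAddress("fd00::2")}},
		{Name: "unrelated"},
	}
	fmt.Println(etcdEndpointsFromPods(pods))
}
```

With the annotation in place, the caller no longer needs to know how the URL was composed; it only reads what each etcd pod advertises.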

rosti

Thanks @fabriziopandini !
I like the idea very much! However, we must also ensure that we don't break anyone who relies on ClusterStatus.

rosti · 2019-11-25T10:09:39Z

keps/sig-cluster-lifecycle/kubeadm/20191125-remove-clusterstatus-from-kubeadm-config.md

+During upgrades:
+
+- The new annotations `kubeadm.kubernetes.io/kube-apiserver.advertise-address` and `kubeadm.kubernetes.io/etcd.advertise-client-urls` will be generated during the upgrade of the static pod manifests.
+- The `ClusterStatus` entry will be cleaned up during the upgrade of the `kubeadm-config` ConfigMap.


We need to guarantee that the ClusterStatus entries are kept alongside the annotations for some time (they can be considered beta, so at least 9 months/3 release cycles) so that we don't break users.

see [1]
The whole point of this KEP is that it is not possible to keep ClusterStatus up to date with the current status.

We discussed during kubeadm office hours that we would like to keep the ClusterStatus for a while.
@fabriziopandini please confirm. Does this section need a minor update?
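
As a rough illustration of the upgrade behavior under discussion, the Go sketch below adds an annotation to a manifest's annotation map and drops the ClusterStatus key from the kubeadm-config data, with a switch to keep the legacy entry during a transition period. Everything here is hypothetical: the function names, the `keepLegacy` flag, and the use of plain maps instead of the real Kubernetes types are assumptions, and this is not how kubeadm implements the upgrade.

```go
package main

import "fmt"

const (
	// Annotation name proposed in the KEP.
	apiServerAnnotation = "kubeadm.kubernetes.io/kube-apiserver.advertise-address"
	// Key of the ClusterStatus entry in the kubeadm-config ConfigMap data.
	clusterStatusKey = "ClusterStatus"
)

// withAnnotation returns a copy of annotations with key set to value,
// leaving the input map untouched.
func withAnnotation(annotations map[string]string, key, value string) map[string]string {
	out := make(map[string]string, len(annotations)+1)
	for k, v := range annotations {
		out[k] = v
	}
	out[key] = value
	return out
}

// dropClusterStatus returns a copy of the ConfigMap data without the
// ClusterStatus entry. When keepLegacy is true (hypothetical transition
// switch), the entry is preserved for consumers that still read it.
func dropClusterStatus(data map[string]string, keepLegacy bool) map[string]string {
	out := make(map[string]string, len(data))
	for k, v := range data {
		if k == clusterStatusKey && !keepLegacy {
			continue
		}
		out[k] = v
	}
	return out
}

func main() {
	annotations := withAnnotation(nil, apiServerAnnotation, "10.0.0.1")
	data := map[string]string{"ClusterConfiguration": "...", "ClusterStatus": "..."}

	fmt.Println(annotations[apiServerAnnotation])
	fmt.Println(len(dropClusterStatus(data, true)), len(dropClusterStatus(data, false)))
}
```

Returning copies rather than mutating in place keeps the original data available if the upgrade has to be rolled back.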

fabriziopandini · 2019-11-25T12:41:16Z

@ereslibre

Here, when a control plane node joins, it will have to:

1. List nodes, retrieve annotations
2. Perform the etcd join with the list of current endpoints
3. Let the kubelet register with TLS bootstrap
4. Annotate node

I think that there is some misunderstanding here.
This KEP proposes adding annotations on the kube-apiserver static pod manifest and on the etcd static pod manifest. No annotation at node level is part of this proposal.

The join process will be:

1. create control-plane static pod manifests (as of today, only with an additional annotation on the kube-apiserver static pod manifest)
2. start kubelet (as of today)
3. get the list of current etcd pods and their kubeadm.kubernetes.io/etcd.advertise-client-urls annotations (replacement of get ClusterStatus)
4. create control-plane static pod manifests (as of today, only with an additional annotation)

Given that, I don't see potential race conditions, but I'm happy to discuss further if this is still unclear.

ereslibre

Thank you for the clarifications @fabriziopandini. This LGTM.

timothysc

I like it. Sorry for my late response, pre-holidays was nuts.
I have some minor questions but I'm unblocking approval.

/approve

timothysc · 2020-01-07T20:02:53Z

keps/sig-cluster-lifecycle/kubeadm/20191125-remove-clusterstatus-from-kubeadm-config.md

+
+We are going to echo the `client-urls` value into a new annotation named `kubeadm.kubernetes.io/etcd.advertise-client-urls`. Once the annotation will be in place, it will be possible to easily retrieve the etcd client urls by querying the `etcd` pods.
+
+### Risks and Mitigations


I have not checked, but if a static pod goes down, is it always removed from the pod list? Also, LBs would still need to update periodically.

If the static manifest is removed, the mirror Pod should be removed too.
But if the manifest is in place, the kubelet will try to create the Pod. It would then be possible to run kubectl describe po on it to get the annotation.

timothysc · 2020-01-07T20:04:09Z

keps/sig-cluster-lifecycle/kubeadm/20191125-remove-clusterstatus-from-kubeadm-config.md

+
+### Test Plan
+
+No additional test E2E test are required for this change because all the affected behaviors are already covered by existing E2E test.


Do we do a destructive LB update test?

timothysc · 2020-01-07T20:05:41Z

/hold
for other comments and lgtms

ereslibre · 2020-01-08T16:24:57Z

/lgtm

Feel free to unhold.

rosti

My comments were non-blocking too.
/lgtm

fabriziopandini · 2020-01-22T15:27:13Z

@neolit123 @timothysc all the comments are addressed, please PTAL

neolit123 · 2020-01-22T17:55:50Z

/lgtm

k8s-ci-robot · 2020-01-22T17:56:16Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ereslibre, fabriziopandini, neolit123, rosti, timothysc

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~keps/sig-cluster-lifecycle/OWNERS~~ [fabriziopandini,neolit123,timothysc]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

neolit123 · 2020-01-22T17:56:38Z

please "hold cancel" before Tuesday next week.

ereslibre · 2020-01-24T18:08:34Z

Tim's comments were non-blocking and there's consensus on this KEP. Thank you everyone!

/hold cancel

remove-clusterstatus-from-kubeadm-config

e993e32

k8s-ci-robot assigned ereslibre, ncdc and neolit123 Nov 24, 2019

k8s-ci-robot added the sig/cluster-lifecycle label Nov 24, 2019

k8s-ci-robot assigned rosti and timothysc Nov 24, 2019

k8s-ci-robot added the cncf-cla: yes, kind/kep, and size/L labels Nov 24, 2019

k8s-ci-robot requested review from justinsb and timothysc November 24, 2019 14:42

neolit123 reviewed Nov 24, 2019

neolit123 reviewed Nov 24, 2019

neolit123 reviewed Nov 24, 2019

ereslibre reviewed Nov 25, 2019

ereslibre reviewed Nov 25, 2019

rosti approved these changes Nov 25, 2019

ereslibre approved these changes Nov 25, 2019

fabriziopandini mentioned this pull request Dec 30, 2019

kubeadm join is not fault tolerant to etcd endpoint failures kubernetes/kubeadm#1432

Closed

timothysc approved these changes Jan 7, 2020

k8s-ci-robot added the approved label Jan 7, 2020

k8s-ci-robot added the do-not-merge/hold label Jan 7, 2020

k8s-ci-robot added the lgtm label Jan 8, 2020

rosti reviewed Jan 9, 2020

address comments

7c15e5b

k8s-ci-robot removed the lgtm label Jan 22, 2020

fix TOC

130f75f

k8s-ci-robot added the lgtm label Jan 22, 2020

k8s-ci-robot removed the do-not-merge/hold label Jan 24, 2020

k8s-ci-robot merged commit 2367af9 into kubernetes:master Jan 24, 2020

k8s-ci-robot added this to the v1.18 milestone Jan 24, 2020

ereslibre mentioned this pull request Jan 29, 2020

kubeadm: deprecate the ClusterStatus dependency kubernetes/kubernetes#87656

Merged
