Kubeadm: Remove ClusterStatus from kubeadm-config #1380
keps/sig-cluster-lifecycle/kubeadm/20191125-remove-clusterstatus-from-kubeadm-config.md
I like removing shared state.
I guess one of the motivations for this KEP is to improve the multiple control plane join at once case. Is this correct? Removing shared state is nice, but I think it's harder to block on a known shared resource (as we could with a shared configmap -- with leader election).
Here, when a control plane node joins, it will have to:
- List nodes, retrieve annotations
- Perform the etcd join with the list of current endpoints
- Let the kubelet register with TLS bootstrap
- Annotate node
I see potential race conditions if several control planes are joining at the same time, and I believe it's harder to sync on several annotations on different resources (nodes), than to sync on a shared config map. What do you think about this? Should it be mentioned under risks?
The `LocalAPIEndpoint` is also used in the stacked `etcd` pod manifest for composing the `peer-urls` and the `client-urls`; the latter is used by kubeadm when accessing etcd in an existing cluster, e.g. when doing `join --control-plane`.

We are going to echo the `client-urls` value into a new annotation named `kubeadm.kubernetes.io/etcd.advertise-client-urls`. Once the annotation is in place, it will be possible to easily retrieve the etcd client URLs by querying the `etcd` pods.
Currently, we are composing this information with:
    func GetClientURL(localEndpoint *kubeadmapi.APIEndpoint) string {
        return "https://" + net.JoinHostPort(localEndpoint.AdvertiseAddress, strconv.Itoa(constants.EtcdListenClientPort))
    }
So today we set etcd's `advertise-client-urls` by forcing the port to the etcd listen client port and reusing the current local endpoint advertise address.
What's the goal of this new annotation as opposed to `kubeadm.kubernetes.io/kube-apiserver.advertise-address`? Couldn't we theoretically compose the `advertise-client-urls` based on `kubeadm.kubernetes.io/kube-apiserver.advertise-address`?
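To make the question concrete, here is a minimal, self-contained sketch of composing the client URL from an advertise address alone. It is not kubeadm code: `clientURLFromAdvertiseAddress` and the local `etcdListenClientPort` constant are hypothetical names for illustration, and the sketch assumes the etcd client port is the conventional 2379.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// etcdListenClientPort is a local stand-in for constants.EtcdListenClientPort.
// Assumption: the etcd client port is the conventional 2379.
const etcdListenClientPort = 2379

// clientURLFromAdvertiseAddress is a hypothetical helper that builds an etcd
// client URL from an advertise address, e.g. a value read from the
// kubeadm.kubernetes.io/kube-apiserver.advertise-address annotation.
func clientURLFromAdvertiseAddress(advertiseAddress string) string {
	// net.JoinHostPort adds the brackets that IPv6 literals require.
	return "https://" + net.JoinHostPort(advertiseAddress, strconv.Itoa(etcdListenClientPort))
}

func main() {
	fmt.Println(clientURLFromAdvertiseAddress("10.0.0.1"))
	fmt.Println(clientURLFromAdvertiseAddress("fd00::1"))
}
```

This only works as long as etcd and the kube-apiserver share the same advertise address and the port is fixed, which is exactly the coupling the question is about.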
The goal is to build the list of etcd peers by looking at the current etcd pods (instead of looking at the kube-apiserver pods, which can be confusing and error-prone).
This is especially helpful in the current `join --control-plane` sequence, because the kube-apiserver and the etcd static pods are created at two separate moments.
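As a rough illustration of reading the endpoints from the etcd pods, the sketch below collects the annotation values from a list of pods. The `pod` struct is a simplified, hypothetical stand-in for the real Kubernetes Pod type, and `etcdClientURLs` is an illustrative name, not an existing kubeadm function; only the annotation key comes from the KEP text.

```go
package main

import (
	"fmt"
	"sort"
)

// Annotation key proposed in the KEP.
const etcdAdvertiseClientURLsAnnotation = "kubeadm.kubernetes.io/etcd.advertise-client-urls"

// pod is a simplified, hypothetical stand-in for a Kubernetes Pod object.
type pod struct {
	Name        string
	Annotations map[string]string
}

// etcdClientURLs returns the sorted client URLs found on the given etcd pods.
// Pods that do not carry the annotation (yet) are skipped.
func etcdClientURLs(pods []pod) []string {
	urls := []string{}
	for _, p := range pods {
		// Reading from a nil map is safe in Go and yields the zero value.
		if u, ok := p.Annotations[etcdAdvertiseClientURLsAnnotation]; ok && u != "" {
			urls = append(urls, u)
		}
	}
	sort.Strings(urls)
	return urls
}

func main() {
	pods := []pod{
		{Name: "etcd-cp-2", Annotations: map[string]string{etcdAdvertiseClientURLsAnnotation: "https://10.0.0.2:2379"}},
		{Name: "etcd-cp-1", Annotations: map[string]string{etcdAdvertiseClientURLsAnnotation: "https://10.0.0.1:2379"}},
		{Name: "etcd-cp-3"}, // static pod created, annotation not present
	}
	fmt.Println(etcdClientURLs(pods))
}
```

In a real cluster the pod list would come from the API server; that part is left out here on purpose.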
Thanks @fabriziopandini!
I like the idea very much! However, we must also ensure that we don't break anyone who's relying on `ClusterStatus`-es.
During upgrades:

- The new annotations `kubeadm.kubernetes.io/kube-apiserver.advertise-address` and `kubeadm.kubernetes.io/etcd.advertise-client-urls` will be generated during the upgrade of the static pod manifests.
- The `ClusterStatus` entry will be cleaned up during the upgrade of the `kubeadm-config` ConfigMap.
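The cleanup step in the second bullet could look roughly like the sketch below, which drops the `ClusterStatus` key from the ConfigMap data and keeps everything else. It operates on a plain `map[string]string` rather than a real ConfigMap object, and `dropClusterStatus` is a hypothetical name used only for illustration.

```go
package main

import "fmt"

// Key of the entry to be removed from the kubeadm-config ConfigMap data.
const clusterStatusKey = "ClusterStatus"

// dropClusterStatus returns a copy of the ConfigMap data without the
// ClusterStatus entry. The input map is left untouched.
func dropClusterStatus(data map[string]string) map[string]string {
	out := make(map[string]string, len(data))
	for k, v := range data {
		if k != clusterStatusKey {
			out[k] = v
		}
	}
	return out
}

func main() {
	data := map[string]string{
		"ClusterConfiguration": "apiVersion: ...", // placeholder content
		"ClusterStatus":        "apiEndpoints: ...",
	}
	cleaned := dropClusterStatus(data)
	_, found := cleaned[clusterStatusKey]
	fmt.Println(len(cleaned), found)
}
```

Returning a copy keeps the original data available, which matters if the entry has to be retained for a deprecation period as discussed in the comments that follow.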
We need to guarantee that `ClusterStatus`-es are kept together with the annotations for some time (they can be considered beta, so at least 9 months/3 cycles) so as not to break people.
see [1]
The whole point of this KEP is that it is not possible to keep ClusterStatus up to date with the current status.
we discussed during kubeadm office hours that we would like to keep the `ClusterStatus` for a while.
@fabriziopandini please confirm. does this section need a minor update?
I think that there is some misunderstanding here. The join process will be:
Considering that, I don't see potential race conditions, but I'm happy to discuss this if it is not clear yet.
Thank you for the clarifications @fabriziopandini. This LGTM.
I like it. Sorry for my late response, pre-holidays was nuts.
I have some minor questions but I'm unblocking approval.
/approve
We are going to echo the `client-urls` value into a new annotation named `kubeadm.kubernetes.io/etcd.advertise-client-urls`. Once the annotation is in place, it will be possible to easily retrieve the etcd client URLs by querying the `etcd` pods.

### Risks and Mitigations
I have not checked, but if a static pod goes down is that always removed from the pod list? Also LBs would still need to update periodically.
if the static manifest is removed the mirror Pod should be removed too.
but if the manifest is in place the kubelet will try to create the Pod. it would then be possible to do `kubectl describe po` on it to get the annotation.
### Test Plan

No additional E2E tests are required for this change because all the affected behaviors are already covered by existing E2E tests.
Do we do a destructive LB update test?
/hold
/lgtm Feel free to unhold.
My comments were non-blocking too.
/lgtm
@neolit123 @timothysc all the comments are addressed, please PTAL
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: ereslibre, fabriziopandini, neolit123, rosti, timothysc
The full list of commands accepted by this bot can be found here. The pull request process is described here.
please "hold cancel" before Tuesday next week.
Tim's comments were non-blocking and there's consensus on this KEP. Thank you everyone! /hold cancel
This KEP proposes a new way of tracking the list of API endpoints in a cluster, which allows removing the `ClusterStatus` entry from the `kubeadm-config` ConfigMap and solves the problems that arise when, for any reason, that entry no longer reflects the real status of the cluster.

/sig cluster-lifecycle
/assign @neolit123 @rosti @ereslibre @ncdc @timothysc