Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix race condition when joining nodes #72151

Closed
wants to merge 1 commit into from
Closed

Fix race condition when joining nodes #72151

wants to merge 1 commit into from

Conversation

ereslibre
Copy link
Contributor

What type of PR is this?
/kind bug

What this PR does / why we need it:
Despite we were checking for the kubelet kubeconfig file to be present, the
kubelet first writes this file and then the certificates the kubeconfig file
refers to. This represents a race condition in kubeadm in which when we confirm
that the kubelet's kubeconfig file is present we continue creating a clientset
out of it. However, the clientset creation will ensure that the certificates the
kubeconfig file refers to exist on the filesystem.

To fix this problem, not only wait for the kubelet's kubeconfig file to be
present, but also ensure that we can create a clientset ouf of it on our polling
process, while we wait for the kubelet to have performed the TLS bootstrap.

(cherry picked from commit a31c160463d8ed76b6a627e8ed879d72f79d9e08)

Backport of #72030

Which issue(s) this PR fixes:
Fixes kubernetes/kubeadm#1319

Does this PR introduce a user-facing change?:

NONE

Despite we were checking for the kubelet kubeconfig file to be present, the
kubelet first writes this file and then the certificates the kubeconfig file
refers to. This represents a race condition in kubeadm in which when we confirm
that the kubelet's kubeconfig file is present we continue creating a clientset
out of it. However, the clientset creation will ensure that the certificates the
kubeconfig file refers to exist on the filesystem.

To fix this problem, not only wait for the kubelet's kubeconfig file to be
present, but also ensure that we can create a clientset ouf of it on our polling
process, while we wait for the kubelet to have performed the TLS bootstrap.

(cherry picked from commit a31c160463d8ed76b6a627e8ed879d72f79d9e08)
@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Dec 18, 2018
@k8s-ci-robot
Copy link
Contributor

@ereslibre: This PR is not for the master branch but does not have the cherry-pick-approved label. Adding the do-not-merge/cherry-pick-not-approved label.

To approve the cherry-pick, please assign the patch release manager for the release branch by writing /assign @username in a comment when ready.

The list of patch release managers for each release can be found here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added do-not-merge/cherry-pick-not-approved Indicates that a PR is not yet approved to merge into a release branch. release-note-none Denotes a PR that doesn't merit a release note. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Dec 18, 2018
@k8s-ci-robot
Copy link
Contributor

Hi @ereslibre. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. area/kubeadm sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Dec 18, 2018
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: ereslibre
To fully approve this pull request, please assign additional approvers.
We suggest the following additional approver: timothysc

If they are not already assigned, you can assign the PR to them by writing /assign @timothysc in a comment when ready.

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ereslibre
Copy link
Contributor Author

Closing due to not being the standard way of performing a cherry pick; just learnt about https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md

@ereslibre ereslibre closed this Dec 18, 2018
@ereslibre ereslibre deleted the fix-race-condition-on-node-join-1.13 branch December 18, 2018 10:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/kubeadm cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/cherry-pick-not-approved Indicates that a PR is not yet approved to merge into a release branch. kind/bug Categorizes issue or PR as related to a bug. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. release-note-none Denotes a PR that doesn't merit a release note. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants