Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kubeadm: write "cgroupfs" in the kubelet config #140

Closed

Conversation

neolit123
Copy link
Member

@neolit123 neolit123 commented Jan 25, 2021

Reason for PR:

Cgroup drivers should be ignored on Windows but there is a bug
in the kubelet:
kubernetes/kubernetes#97759
the bug is fixed in master (.21) but is not part of the latest stable releases in the skew yet.

Workaround this bug by patching the kubelet configuration in
the cluster.

Issue Fixed:

fixes #138

Requirements

  • Sqaush commits
  • Documentation
  • Tests

Notes:

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jan 25, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: neolit123
To complete the pull request process, please assign jsturtevant after the PR has been reviewed.
You can assign the PR to them by writing /assign @jsturtevant in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Jan 25, 2021
@github-actions github-actions bot added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Jan 25, 2021
@neolit123
Copy link
Member Author

/assign @jsturtevant

@neolit123
Copy link
Member Author

the bug is fixed in master (.21) but is not part of the latest stable releases in the skew yet.

we should have backports for this:
kubernetes/kubernetes#97764 (comment)

@neolit123
Copy link
Member Author

/kind failing-test
/priority critical-urgent

@k8s-ci-robot k8s-ci-robot added kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. labels Jan 25, 2021
Cgroup drivers should be ignored on Windows but there is a bug
in the kubelet:
  kubernetes/kubernetes #97759

Workaround this bug by patching the kubelet configuration in
the cluster.
@neolit123 neolit123 force-pushed the 1.21-fix-cgroup-drivers branch from 385cfb9 to 49a690c Compare January 25, 2021 19:54
@neolit123 neolit123 changed the title kubeadm: make the Windows workers use "cgroupfs" driver kubeadm: write "cgroupfs" in the kubelet config Jan 25, 2021
@jsturtevant
Copy link
Contributor

I am not super familiar with this setup. It looks like the cherrypicks went in for older versions: kubernetes/kubernetes#98384

Should we prefer those over the patch?

@neolit123
Copy link
Member Author

@jsturtevant

i've sent the PR after investigating the failures and so that we don't delete the job due to not being maintained.
cc @jayunit100

these scripts were originally created by benmoss with some help from me, but ideally sig-windows should take ownership.
your observation about the cherry picks is correct - once a new k8s PATCH release is out for the latest stable (1.20) this setup should no longer fail and this patch will not be needed.

we missed v1.20.2 for the cherry pick, so it will be out with v1.20.3, maybe in ~2 weeks.
/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 28, 2021
@jsturtevant
Copy link
Contributor

Sounds good. FWIW we have additional coverage for windows Kubeadm over in capz: https://testgrid.k8s.io/sig-cluster-lifecycle-cluster-api-provider-azure#periodic-capi-e2e-full

It's not as extensive (I am working on that) but it does validate that kubeadm binary works. I'll need to make sure we get the correct fix there too.

@neolit123
Copy link
Member Author

neolit123 commented Jan 28, 2021

given capz is more actively maintained than the CI scripts in this repository, we can remove them and if that is what sig-windows would prefer.

@neolit123
Copy link
Member Author

neolit123 commented Feb 18, 2021

@neolit123
Copy link
Member Author

the job deployment is now working:
https://k8s-testgrid.appspot.com/sig-windows-releases#kubeadm-windows-gcp-k8s-stable

some conformance test appear to be red, which is not deployer related.

/close

@k8s-ci-robot
Copy link
Contributor

@neolit123: Closed this PR.

In response to this:

the job deployment is now working:
https://k8s-testgrid.appspot.com/sig-windows-releases#kubeadm-windows-gcp-k8s-stable

some conformance test appear to be red, which is not deployer related.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

kubeadm job on GCE is failing
3 participants