-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add option to force docker to use systemd as cgroup manager #7815
Conversation
Since minikube is running systemd, kubeadm expects kubeadm to be the cgroup manager. If docker is using a different cgroup manager like cgroupfs, this can cause unstable resource allocation. We were seeing this in Cloud Shell, and forcing docker to use systemd resolved the issue.
/ok-to-test |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: priyawadhwa The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Codecov Report
@@ Coverage Diff @@
## master #7815 +/- ##
==========================================
- Coverage 35.50% 35.45% -0.06%
==========================================
Files 148 148
Lines 9330 9347 +17
==========================================
+ Hits 3313 3314 +1
- Misses 5620 5635 +15
- Partials 397 398 +1
|
kvm2 Driver Times for Minikube (PR 7815): [27.389023917 26.465480006 26.028746423999998] Averages Time Per Log
|
/retest |
Docker has some weird reason for not using this by default:
But like you say, Kubernetes recommends using the same as host.
|
@afbjorklund yah there's definitely a disparity. Since the goal here is to run k8s, I feel like we should stick to what they are recommending, especially if tests pass. If we start to see issues around this, we could put it behind a flag so that only users who need it can use it. |
kvm2 Driver Times for Minikube (PR 7815): [26.787263245000002 27.170312600000003 26.924779744000002] Averages Time Per Log
|
It looks like the github actions environment doesn't like this --
I'm still not totally sure what's causing some environments to only work with systemd (Cloud Shell) and some to totally fail with systemd (Github Actions). I'm thinking of putting this behind a flag for now, WDYT @afbjorklund @medyagh ? |
We already noticed that GitHub Actions is running a totally different i.e. they have their own fork of |
Ouch, I just noticed there is a podman and crio upgrade in here as well... |
@afbjorklund yah the Dockerfile build is failing with "version cannot be found" errors for both crio and podman, so they need to be updated for the v0.0.10 image to be built :/ |
kvm2 Driver Times for Minikube (PR 7815): [67.377331953 66.864556251 66.00641521200001] Averages Time Per Log
docker Driver Times for Minikube (PR 7815): [28.445735458000005 27.601543418000002 27.508156965] Averages Time Per Log
|
kvm2 Driver Times for Minikube (PR 7815): [65.409322369 66.154082117 65.688193337] Averages Time Per Log
docker Driver Times for Minikube (PR 7815): [28.171486896 26.395919186000004 28.095956455] Averages Time Per Log
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one last nit: ForceSystemd() should be private, as it is not being called anywhere outside
kvm2 Driver Times for Minikube (PR 7815): [67.919457965 64.93236012999999 66.66688270899999] Averages Time Per Log
docker Driver Times for Minikube (PR 7815): [28.451083855000004 28.723881671 27.129281009] Averages Time Per Log
|
/retest-this-please |
KVM tests look very scary https://storage.googleapis.com/minikube-builds/logs/7815/52b5737/KVM_Linux.html |
kvm2 Driver Times for Minikube (PR 7815): [66.619740621 68.860728491 66.752424976] Averages Time Per Log
docker Driver Times for Minikube (PR 7815): [27.155278621 28.76059046 27.136706439999998] Averages Time Per Log
|
@medyagh the KVM failures are normal now |
kvm2 Driver Times for Minikube (PR 7815): [66.634958373 64.80708744 64.223911174] Averages Time Per Log
docker Driver Times for Minikube (PR 7815): [28.562015612 27.303079212999997 26.302191173] Averages Time Per Log
|
kvm2 Driver Times for Minikube (PR 7815): [64.79517546 65.32403882700001 65.487059282] Averages Time Per Log
docker Driver Times for Minikube (PR 7815): [26.967782229 26.816947358999997 28.15798285] Averages Time Per Log
|
@priyawadhwa could it be because we need to make the docker on all the nodes use systemd ? or is it a coinsidence? |
/retest-this-please |
1 similar comment
/retest-this-please |
@medyagh this seems to be a coincidence. It also doesn't seem like the tests reran, so I'll merge w/ master and we can rerun them |
kvm2 Driver Times for Minikube (PR 7815): [65.52541886099999 62.54227959400001 65.857554524] Averages Time Per Log
docker Driver Times for Minikube (PR 7815): [27.149376945999997 27.420185357 27.484454285] Averages Time Per Log
|
sounds good. please make sure there is a comment explaining every failure associated with an issue. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
approved with required comments for integration tests failure associated with an issue
TestErrorSpam - 198.53s TestPreload - 214.58s TestStartStop/group/containerd - 273.53s TestStartStop/group/crio - 1000.61s TestStartStop/group/embed-certs - 844.77s
Opened #7921 TestFunctional/parallel/StatusCmd - 5.23s
Opened #7922 TestMultiNode/serial/DeleteNode - 5.54s
Seems related to #7922, as it's also a |
Comments no longer applicable as this is now behind a flag
Since minikube is running systemd, kubeadm expects kubeadm to be the cgroup manager. If docker is using a different cgroup manager like cgroupfs, this can cause unstable resource allocation. We were seeing this in Cloud Shell, and forcing docker to use systemd resolved the issue.
This k8s doc provides good context: https://kubernetes.io/docs/setup/production-environment/container-runtimes
I modified the kic base image to include the docker daemon config which forces systemd. I upgraded kic to v0.0.10. This file didn't exist before, so it isn't being overwritten.
I elected to go with
--force-systemd
instead of--cgroup-manager[cgroupfs,systemd]
since the latter would introduce more complexity than needed. Crio defaults to systemd while docker/containerd default to cgroupfs, and we'd need to keep track of acceptable combos with the latter option.Fixes #6404