Add automaxprocs package #1443
Conversation
If applied, the automaxprocs package will respect the CPU quota on Kubernetes clusters.

Signed-off-by: Hadi Abbasi <hawwwdi@gmail.com>
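For context, the documented usage of go.uber.org/automaxprocs is a blank import in the program's entrypoint; the following is a minimal sketch of that pattern, not the actual source-controller changes in this PR:

```go
package main

import (
	"fmt"
	"runtime"

	// Blank import for its side effect: at init time automaxprocs reads the
	// container's cgroup CPU quota and sets GOMAXPROCS to match it.
	_ "go.uber.org/automaxprocs"
)

func main() {
	// In a container limited to 1 CPU on a 32-core node this prints 1
	// instead of 32.
	fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0))
}
```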
@hawwwdi source-controller is I/O bound instead of being CPU bound. Did you observe CPU throttling for source-controller in your clusters?
Yes, I have set the CPU request and limit to 12, but when the source controller uses nearly 6 cores, throttling occurs. Pod monitoring screenshot:
Are you using HTTP/S HelmRepos that contain a huge number of charts? That would explain the CPU usage. Can you please post here:
Yes, I've stored many of my Helm charts in a Git repository, and perhaps that's causing the high CPU usage. However, it shouldn't be throttled at 6 cores of CPU usage when the limit and request are set to 12 cores. The
The throttle should be only on
The
I suggest enabling Helm index caching to see if this reduces the resource usage: https://fluxcd.io/flux/installation/configuration/vertical-scaling/#enable-helm-repositories-caching

My guess is that those failing HelmCharts are one of the reasons for the high CPU usage; if those come from Git, then source-controller needs to build them, and it runs continuously as they fail.
Yes, perhaps enabling this feature or moving Helm charts from Git to a Helm repository will reduce CPU usage. However, this pull request is not intended to solve the high CPU usage problem. Instead, it aims to solve the issue of throttling before reaching the CPU limit.
I fail to understand how this PR would help you in any way. In your case, you've set 12 CPU as the limit and you've seen throttling at 6 cores. My understanding from the automaxprocs docs is that it defends against throttling when GOMAXPROCS is set above the limits, but in your case it's the opposite: it never got to use all 12 cores.
By the way, I see no issue with using automaxprocs in Flux as it should help reduce throttling on nodes with lots of CPU cores for when users don't bump the 1 core limit we set by default. Besides source-controller, we should add it to:
Could you please build source-controller from this PR and deploy it on your cluster to see if in your case the throttling stops?
According to the Go docs, the default value of GOMAXPROCS is the number of logical CPUs on the machine, not the container's CPU quota.
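As a quick illustration of that default (a hedged sketch, not part of the PR):

```go
package main

import (
	"fmt"
	"runtime"
)

func main() {
	// Without automaxprocs (or a GOMAXPROCS env var), the runtime defaults
	// GOMAXPROCS to the number of logical CPUs it can see, which inside a
	// container is the node's CPU count, not the cgroup CPU quota.
	fmt.Println("NumCPU:    ", runtime.NumCPU())
	fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0))
}
```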
Yes, I will report the result here.
I also think that this makes sense, thanks @hawwwdi. We could also use the downward API, e.g.:

```yaml
env:
  - name: GOMAXPROCS
    valueFrom:
      resourceFieldRef:
        resource: limits.cpu
```

We could save a dependency that way.
@souleb isn't GOMAXPROCS supposed to be set to an integer value?
So it looks like Go handles the env var well and converts it: https://blog.howardjohn.info/posts/gomaxprocs/

We could set both GOMAXPROCS and GOMEMLIMIT in the flux2 manifests for all controllers. I will test if it works OK.
That's a great approach. I will test this method and share the results here.
Yes, it converts it to an int.

Edit: I believe that if the conversion fails, it falls back to the number of cores.
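One way to check this locally (a sketch for verification, not part of the PR) is to run a trivial program with different GOMAXPROCS values and compare what the runtime actually picked up:

```go
package main

import (
	"fmt"
	"os"
	"runtime"
)

func main() {
	// The Go runtime reads the GOMAXPROCS environment variable at startup.
	// Try: GOMAXPROCS=2 go run main.go   -> effective value 2
	//      GOMAXPROCS=1.5 go run main.go -> per the discussion above, an
	//      unparsable value should fall back to the number of CPUs.
	fmt.Println("env GOMAXPROCS:", os.Getenv("GOMAXPROCS"))
	fmt.Println("effective:     ", runtime.GOMAXPROCS(0))
}
```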
I have added the env vars and I've run the Flux benchmarks a couple of times. My conclusion is that it is safe to add the env vars to the main distribution, as I see no regression in terms of reconciliation speed. @hawwwdi please open a PR in the flux2 repo and add this patch to
Thank you for testing. Okay, I will open a PR in the flux2 repo with the test results.
Superseded by: fluxcd/flux2#4717
Add package automaxprocs to automatically set GOMAXPROCS to match the Linux container CPU quota on Kubernetes and prevent CPU throttling. Performance results are here.