Problem with high memory consumption on kube-apiserver #149

Open
MacThrawn opened this issue May 4, 2023 · 2 comments

Comments

@MacThrawn

Hello,

We would like to install the Cert Utils Operator on OpenShift via OperatorHub. We are using OpenShift 4.11.27 (Kubernetes v1.24.6+263df15) and Cert Utils Operator 1.3.10.
After installing the Cert Utils Operator, we see steadily increasing memory usage on the kube-apiservers:

[Screenshot: kube-apiserver memory usage]

As you can see, memory fills up on the first kube-apiserver; once that instance becomes unavailable, the next one fills up, and so on.

The OpenShift cluster has 164 worker nodes and 3 "etcd" nodes. Each of the etcd nodes, where the kube-apiserver runs, has 12 vCores and 64 GB of memory.
The cluster hosts several namespaces and applications, which results in 12644 secrets and 5655 configmaps.

We aborted the installation attempt after two and a half hours because the cluster had become very unstable and did not look like it would recover soon.
After we uninstalled the Cert Utils Operator, the cluster returned to normal operation and stability.

We also tested the installation on a smaller cluster with the same versions. There we also saw increased memory consumption on the kube-apiserver for some time, but after a few minutes it returned to normal. During this time the Cert Utils Operator also showed high CPU consumption (up to 2 cores).

The described behavior is reproducible every time, but it appears to depend on the size of the cluster and the number of configmaps and secrets.
We run several other operators in the cluster, but none of them shows similar behavior.

It is not clear to me whether the Cert Utils Operator is doing something wrong, or whether this is caused by the large number of secrets and configmaps and inefficient processing in the kube-apiserver.

@raffaelespazzoli
Contributor

When cert-utils starts, it creates watches on secrets and configmaps, so when you have a ton of them it will generate significant load on the kube API servers. But that should happen only at the beginning; after that the load should subside, unless your secrets are constantly changing.
We will investigate this on our side.
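
The expectation described above can be put into a toy model: a watching client receives one event per existing object during its initial sync and afterwards only one event per change. The sketch below is plain Python, not cert-utils-operator code. The function name and the churn rate of 600 changes per minute are hypothetical; only the object counts come from this issue.

```python
def watch_events(existing_objects: int, changes_per_minute: int, minutes: int) -> list:
    """Return the number of events delivered in each minute of a watch.

    Minute 0 carries the initial sync (one event per existing object);
    every minute, including the first, also carries that minute's changes.
    """
    events = []
    for minute in range(minutes):
        initial = existing_objects if minute == 0 else 0
        events.append(initial + changes_per_minute)
    return events


# Object counts reported in this issue.
secrets, configmaps = 12644, 5655

# Quiet cluster: the load is a single burst at startup.
quiet = watch_events(secrets + configmaps, changes_per_minute=0, minutes=5)

# Hypothetical churn (assumed value): the load never drops to zero.
busy = watch_events(secrets + configmaps, changes_per_minute=600, minutes=5)

print(quiet)
print(busy)
```

In this model the quiet cluster sees events only in the first minute, while the churning cluster keeps a constant floor of events. That matches the "unless your secrets are constantly changing" caveat, though it says nothing about memory use per event.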

@ocpvkb

ocpvkb commented Jul 5, 2023

Hello everyone,
I can confirm this behavior when the cert-utils operator changes configmaps and secrets that are created or managed by other operators in the cluster.
This is exactly the same problem as #120 or #127.
The resulting loops generate massive numbers of API calls.
This is not an OpenShift-specific problem.
It is also not caused by a specific cluster size; it only becomes significantly more visible on large clusters.
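
The kind of loop described above can be illustrated with a toy simulation: two controllers reconcile the same field of one object, and each overwrites it with its own desired value, so every write by one triggers a write by the other. This is a simplified model in plain Python, not code from cert-utils-operator or any other operator. The function name, the field values, and the cap on rounds are hypothetical.

```python
def count_writes(desired_a: str, desired_b: str, max_rounds: int) -> int:
    """Count API writes when two controllers reconcile the same field.

    Each round, controller A and then controller B looks at the current
    value and writes its own desired value if the two differ.
    """
    value = ""      # current state of the shared field
    writes = 0
    for _ in range(max_rounds):
        changed = False
        for desired in (desired_a, desired_b):
            if value != desired:
                value = desired   # stands in for one update call to the API server
                writes += 1
                changed = True
        if not changed:
            break                 # both controllers are satisfied
    return writes


# Controllers agree: the system settles after the first write.
agree = count_writes("injected", "injected", max_rounds=1000)

# Controllers disagree: the writes stop only because of the round cap.
disagree = count_writes("injected", "original", max_rounds=1000)

print(agree, disagree)
```

When the two desired values differ, the write count grows linearly with the cap and never converges. In a real cluster each such write is an API call and a watch event for every client watching that object, independent of cluster size.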
