Large LIST calls being made to Kube API server #10931
Comments
Could you paste one of the complete API endpoint URLs? Do you have user agent information for those LIST calls?
RequestURI - user agent- |
Workaround is to remove your liveness probe or reduce the |
Thanks so if I understand correctly, |
Yes. It should reduce the load. |
Thanks for confirming, I am double checking with our etcd team because there has been some discussion around how |
I experienced a similar issue with workflow controller, except it was doing large LIST requests for Pods. It seems that workflow controller issues periodic LIST requests with a labelSelector but without setting resourceVersion. This requires the apiserver to fetch objects directly from etcd instead of using its in-memory cache, which puts heavy load on both the apiserver and etcd. Ideally, requests like these would use the controller list/watch pattern instead of periodic lists. Google has some documentation on this here: https://cloud.google.com/kubernetes-engine/docs/concepts/planning-scalability#use_list_and_watch_pattern_instead_of_periodic_listing Here's an example LIST request that was generating a lot of load; I redacted some fields that aren't relevant.
|
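For readers who want to see the difference described above, here is a minimal sketch (not the redacted request from this cluster) of two LIST URLs that differ only in `resourceVersion`. The namespace `argo`, the label `example.com/team=demo`, and the helper `list_url` are hypothetical, chosen for illustration. As far as I understand the Kubernetes API semantics, an unset `resourceVersion` asks for the most recent data, while `resourceVersion=0` accepts any version and can be answered from the apiserver cache. The exact behaviour depends on the Kubernetes version and apiserver configuration.

```python
from urllib.parse import urlencode


def list_url(namespace, resource, label_selector=None, resource_version=None):
    """Build a core/v1 LIST path (hypothetical helper, for illustration only)."""
    params = {}
    if label_selector is not None:
        params["labelSelector"] = label_selector
    if resource_version is not None:
        params["resourceVersion"] = resource_version
    path = f"/api/v1/namespaces/{namespace}/{resource}"
    return path + ("?" + urlencode(params) if params else "")


# No resourceVersion: the apiserver has to return the most recent state,
# which is the expensive case described in the comment above.
uncached = list_url("argo", "pods", label_selector="example.com/team=demo")

# resourceVersion=0: "any version is fine", so the apiserver may serve
# the response from its in-memory cache instead of reading from etcd.
cached = list_url("argo", "pods",
                  label_selector="example.com/team=demo",
                  resource_version="0")

print(uncached)
print(cached)
```

The trade-off is staleness: a response served with `resourceVersion=0` may lag behind etcd, which is usually acceptable for a controller that also watches for changes.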
I dug around and saw that this issue has already been fixed, see #4024. I believe the version of workflow controller used for this cluster did not include this performance improvement.
Would you like to try a potential fix in this new image tag |
@prateekgogia did you retest?
Summary
While debugging the load on the API server and etcd instances, I found that the Argo workflow controller makes LIST calls every minute, listing all the workflow objects in the cluster.
What change needs making?
Can this implementation be switched to a WATCH call instead of using a List call?
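To make the request concrete, here is a rough sketch of the list/watch idea: list once, then keep a local cache current by applying watch events, so that later reads never hit the API server. This is a plain-Python simulation under stated assumptions, not Argo's code or a real Kubernetes client. The class `WatchBackedCache`, the simplified object shape (`name`, `phase`, `resourceVersion`), and the sample events are all hypothetical.

```python
class WatchBackedCache:
    """In-memory store seeded by one LIST and updated by WATCH events."""

    def __init__(self, initial_items, resource_version):
        # Seed from a single initial LIST response.
        self.items = {item["name"]: item for item in initial_items}
        # Remember where the watch should resume from.
        self.resource_version = resource_version

    def apply(self, event):
        """Apply one watch event of type ADDED, MODIFIED, or DELETED."""
        obj = event["object"]
        if event["type"] in ("ADDED", "MODIFIED"):
            self.items[obj["name"]] = obj
        elif event["type"] == "DELETED":
            self.items.pop(obj["name"], None)
        self.resource_version = obj["resourceVersion"]

    def list(self, phase=None):
        """Answer a 'list' from local memory, optionally filtered by phase."""
        return sorted(
            name for name, obj in self.items.items()
            if phase is None or obj["phase"] == phase
        )


# One initial LIST result (simulated).
cache = WatchBackedCache(
    [{"name": "wf-a", "phase": "Running", "resourceVersion": "10"}],
    resource_version="10",
)

# Events that a WATCH stream might deliver afterwards (simulated).
events = [
    {"type": "ADDED", "object": {"name": "wf-b", "phase": "Running", "resourceVersion": "11"}},
    {"type": "MODIFIED", "object": {"name": "wf-a", "phase": "Succeeded", "resourceVersion": "12"}},
    {"type": "DELETED", "object": {"name": "wf-b", "phase": "Running", "resourceVersion": "13"}},
]
for event in events:
    cache.apply(event)

print(cache.list())
print(cache.list(phase="Running"))
```

With this shape, the per-minute work becomes a local dictionary scan, and the API server only sends the objects that changed.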
Use Cases
When would you use this?
Message from the maintainers:
Love this enhancement proposal? Give it a 👍. We prioritise the proposals with the most 👍.