Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

list pods assigned to a node by pod informer cache #673

Merged
merged 3 commits into from
Dec 14, 2021

Conversation

Garrybest
Copy link
Member

@Garrybest Garrybest commented Dec 11, 2021

What this PR does / why we need it:
Now there is a perf problem about the way we list pods that assigned to a node. When the cluster is large scale, like 3k+ nodes and 90k+ pods. The descheduler is not able to work because every interval it will request kube-apiserver for almost 30k+ times(enable 10 plugins and every plugins request for 3k+ times) with a FieldSelector. Meanwhile, it leads to a lot of pressure for kube-apiserver. So I try to make a revision by using pod informer cache.

Which issue(s) this PR fixes:
Fixes #671

Special notes for your reviewer:
I'm afraid the PR is too large to review (beacause listing pods is a really basic action and the function is called everywhere in strategy files and test files). So I separate this PR into 3 commits in order to make code review easier.😄

  1. Reform ListPodsOnANode by using pod informer and indexer
    This commit is the most important one. The code is very concise. I establish an indexer to map the pods and their assigned nodes by an informer indexer. So every time we list pods we could use informer cache instead of requesting kube-apiserver by filed selectors.

  2. Reform all strategies by using getPodsAssignedToNode
    This commit let all other files of descheduler adapt to listing pods by function getPodsAssignedToNode. Not so important, just some adaptions.

  3. Reform all test files
    This commit is not important. Just a lot of adaptions of test files and it took me a lot of time to do the adaptions (cause the test cases and strategies are so many). 🤣

Signed-off-by: Garrybest <garrybest@foxmail.com>
Signed-off-by: Garrybest <garrybest@foxmail.com>
Signed-off-by: Garrybest <garrybest@foxmail.com>
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Dec 11, 2021
@k8s-ci-robot
Copy link
Contributor

Welcome @Garrybest!

It looks like this is your first PR to kubernetes-sigs/descheduler 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/descheduler has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot
Copy link
Contributor

Hi @Garrybest. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Dec 11, 2021
@k8s-ci-robot k8s-ci-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Dec 11, 2021
Copy link
Contributor

@damemi damemi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Dec 13, 2021
@ingvagabund
Copy link
Contributor

@Garrybest you were faster sir :) #674

@ingvagabund
Copy link
Contributor

@Garrybest I need this change merged pretty soon. I have another PR on the way that relies on the pod lister and removing the field selector. What is your availability?

@Garrybest
Copy link
Member Author

What is your availability?

Hi @ingvagabund, I'm totally available recently and I'm willing to make cooperations.😄

By the way, according to what @damemi said in #671 (comment), I have some experimental results and facts to show.

  1. I have used an informer indexer, which makes the code simple and concise.

According to the first commit of my PR, I build a GetPodsAssignedToNodeFunc to provide a method to get pods that assigned to a node. The code is very simple, just establish a inform indexer between pods and assgined nodes.

  1. Using filter function to replace the field selector.

Actually, the field selector is a special kind of pod filter. So I aggregate the origin incoming filter function and our filed selector filter function as a final pod filter in ListPodsOnANode. So actually the field selector still works but as another form.

  1. Some test result in my own clusters of production environment.

Test method: Run ListPodsOnANode without any extra filter functions looped by all ready nodes and then measure the elapsed time.
Test machine: A cloud VM with 6 cpu cores and 9.4GB memory.

  • A cluster with 233 nodes and 1933 assigned pods.
    • Before: 96s
    • After: 0.2s
    • Accelerate rate: 480 times
  • A cluster with 5235 nodes and 42477 assigned pods.
    • Before: More than 30min
    • After: 3.36s
    • Accelerate rate: more than 535 times

I think the benefit of this PR is very obvious. Note that this test is only listing once. If we enable all plugins, the accelerate rate will be a very large number since the listing of all nodes will be called more than once.

@ingvagabund
Copy link
Contributor

This PR is well done!!!

@Garrybest thank you. I like the changes you made to the code.

/approve
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 14, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Garrybest, ingvagabund

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 14, 2021
@ingvagabund ingvagabund mentioned this pull request Dec 14, 2021
@k8s-ci-robot k8s-ci-robot merged commit 0b2c10d into kubernetes-sigs:master Dec 14, 2021
@Garrybest Garrybest deleted the pr_pod_cache branch December 14, 2021 09:37
briend pushed a commit to briend/descheduler that referenced this pull request Feb 11, 2022
list pods assigned to a node by pod informer cache
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Too slow to list pods when cluster is huge
4 participants