[VPA] Configurable upper and lower bounds for memory and cpu recommendations #6660

emla9 · 2024-03-26T17:01:38Z

What type of PR is this?

/kind feature

What this PR does / why we need it:

Via flags on the recommender, make memory target percentile and upper/lower bounds for memory and CPU configurable. This allows the target percentile for memory and CPU to be set higher than 0.95 in order to accommodate workloads with significant outlier pods, such as some daemonsets.

Which issue(s) this PR fixes:

Fixes #6420

Does this PR introduce a user-facing change?

Added flags to specify targetMemoryPercentile, lowerBoundMemoryPercentile, upperBoundMemoryPercentile, lowerBoundCPUPercentile, and upperBoundCPUPercentile via command line arguments.
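For illustration, here is a minimal Python sketch of assembling recommender arguments for two of these flags. The kebab-case names `--target-memory-percentile` and `--recommendation-upper-bound-memory-percentile` are the ones used later in this thread. The helper `recommender_args` and its [0, 1] range check are hypothetical and not part of VPA.

```python
def recommender_args(target_memory=None, upper_bound_memory=None):
    """Build a list of recommender command-line arguments (hypothetical helper)."""
    flags = {
        "--target-memory-percentile": target_memory,
        "--recommendation-upper-bound-memory-percentile": upper_bound_memory,
    }
    args = []
    for name, value in flags.items():
        if value is None:
            continue  # flag not set: leave the recommender default in place
        # Assumption: percentiles are expressed as fractions between 0 and 1.
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be within [0, 1], got {value}")
        args.append(f"{name}={value:g}")
    return args


args = recommender_args(target_memory=1, upper_bound_memory=1)
print(args)
```

The resulting list could be passed as container `args` for the recommender. The remaining bound flags are not spelled out in this thread, so they are left out of the sketch.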

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

- [Usage]: https://github.com/kubernetes/autoscaler/blob/f4c1ca1e38b19df784c360eea4390adbc790d2d9/vertical-pod-autoscaler/FAQ.md#what-are-the-parameters-to-vpa-recommender

linux-foundation-easycla · 2024-03-26T17:01:43Z

✅login: emla9 / (8a705f7, 3458b79, 478d3e0, 5aaeb83, cdb8269)

The committers listed above are authorized under a signed CLA.

k8s-ci-robot · 2024-03-26T17:01:47Z

Welcome @emla9!

It looks like this is your first PR to kubernetes/autoscaler 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/autoscaler has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

k8s-ci-robot · 2024-03-26T17:01:48Z

Hi @emla9. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Shubham82 · 2024-03-27T05:10:13Z

Hi @emla9,
You have to sign the CLA before the PR can be reviewed.
See the following document to sign the CLA: Signing Contributor License Agreements (CLA)

Shubham82 · 2024-03-27T05:10:42Z

To check EasyCLA

/easycla

dbenque

LGTM

voelzmo · 2024-04-30T08:54:04Z

/lgtm

Thanks!

@emla9: While I appreciate making these things configurable, I'm wondering if this change really fixed the issue you were describing in #6420? I don't think setting a targetMemoryPercentile of 1 should have fixed this, given that memory samples are stored as the maximum per day and this problem seems to arise already on the first day – meaning no samples should be discarded based on the targetPercentile.
If you were seeing this only after the workload ran for about a week, I would understand how increasing the targetMemoryPercentile fixes this.

emla9 · 2024-05-02T00:15:38Z

Thanks for the review!

@voelzmo, I retested this with vpa-stress on one of our clusters: setting --target-memory-percentile=1 and --recommendation-upper-bound-memory-percentile=1 does fix the issue.

My understanding was that each container instance's max per day gets added to the aggregate histogram and then the 90th percentile of that is used to calculate the target recommendation, which is different from adding only the max per day across all container instances in the workload.
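The difference described above can be sketched in a few lines of Python. This is a simplified illustration, not the recommender's implementation: it uses a plain nearest-rank percentile over unweighted samples, and the pod counts and memory values are made up.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least fraction p at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered)))
    return ordered[rank - 1]


# Hypothetical daemonset: 19 pods peak at 100 MiB per day, one outlier peaks at 900 MiB.
# Each container instance contributes its own daily maximum to the aggregate.
daily_peaks_mib = [100] * 19 + [900]

target_p90 = percentile(daily_peaks_mib, 0.90)   # outlier falls above the cut-off
target_p100 = percentile(daily_peaks_mib, 1.0)   # outlier is included
workload_max = max(daily_peaks_mib)              # "max across all instances" view

print(target_p90, target_p100, workload_max)
```

Under these assumptions, the 90th percentile of the pooled per-instance peaks stays at the common value, while a percentile of 1 matches the workload-wide maximum.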

voelzmo · 2024-05-06T14:01:15Z

Ah, thanks, I think I understand how setting the target-memory-percentile to 1 will help solve this issue: the high memory measurements caused by the OOMKill events will always be discarded when selecting samples according to a percentile lower than 1.

Thanks!

voelzmo · 2024-05-06T14:22:46Z

@kwiesmueller This is good to go from my side, could you please /approve?

kwiesmueller · 2024-05-08T18:09:05Z

/lgtm
/approve

k8s-ci-robot · 2024-05-08T18:09:12Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: emla9, kwiesmueller

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~vertical-pod-autoscaler/OWNERS~~ [kwiesmueller]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

prashantkalkar · 2024-07-17T15:41:10Z

When will this be available? Until this is released, overriding the target-cpu-percentile seems to be of no use (it can't be set higher than 0.95 if required).

emla9 and others added 3 commits February 29, 2024 20:39

Make upper and lower bounds configurable for memory and cpu

8a705f7

Merge branch 'kubernetes:master' into configure-memory-target

3458b79

Merge branch 'kubernetes:master' into configure-memory-target

478d3e0

k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Mar 26, 2024

k8s-ci-robot added the area/vertical-pod-autoscaler label Mar 26, 2024

k8s-ci-robot requested review from jbartosik and voelzmo March 26, 2024 17:01

k8s-ci-robot added the cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. label Mar 26, 2024

k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Mar 26, 2024

k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Mar 26, 2024

emla9 added 2 commits March 26, 2024 13:36

update FAQ with recommender target/upper/lower parameters

5aaeb83

Merge branch 'configure-memory-target' of github.com:emla9/autoscaler…

cdb8269

… into configure-memory-target

emla9 force-pushed the configure-memory-target branch from 0f71fa8 to cdb8269 Compare March 26, 2024 17:36

k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Apr 11, 2024

dbenque reviewed Apr 29, 2024

k8s-ci-robot assigned voelzmo Apr 30, 2024

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 30, 2024

voelzmo mentioned this pull request May 6, 2024

[VPA] Does not respond to OOM for workloads with non-uniform resource utilization #6420

Closed

voelzmo mentioned this pull request May 8, 2024

Make VPA recommender target memory percentile configurable #6700

Closed

k8s-ci-robot assigned kwiesmueller May 8, 2024

k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 8, 2024

k8s-ci-robot merged commit 44b128f into kubernetes:master May 8, 2024
6 checks passed

ialidzhikov mentioned this pull request Jul 29, 2024

Make VPA more configurable to improve utilisation of VPA-dependent workload gardener/gardener#10141

Closed

raywainman mentioned this pull request Aug 6, 2024

Release VPA 1.2.0 #7098

Closed
