Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VPA for validator/admission component #234

Merged
merged 2 commits into from
Feb 12, 2021

Conversation

rfranzke
Copy link
Member

@rfranzke rfranzke commented Feb 5, 2021

How to categorize this PR?

/area scalability auto-scaling
/kind enhancement
/priority normal
/platform gcp

What this PR does / why we need it:
VPA for validator/admission component

Release note:

The validator/admission component's Helm chart is now deploying a `VerticalPodAutoscaler` resource by default. If undesired or no VPA is available in the garden cluster then it can be turned of via `.Values.global.vpa.enabled=false`.

@rfranzke rfranzke requested review from a team as code owners February 5, 2021 09:34
@gardener-robot gardener-robot added area/auto-scaling Auto-scaling (CA/HPA/VPA/HVPA, predominantly control plane, but also otherwise) related area/scalability Scalability related kind/enhancement Enhancement, improvement, extension platform/gcp Google cloud platform/infrastructure priority/normal needs/review Needs review size/s Size of pull request is small (see gardener-robot robot/bots/size.py) labels Feb 5, 2021
@gardener-robot-ci-3 gardener-robot-ci-3 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Feb 5, 2021
@ialidzhikov
Copy link
Member

What changed after #141?

@gardener-robot-ci-2 gardener-robot-ci-2 added needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Feb 5, 2021
@rfranzke
Copy link
Member Author

rfranzke commented Feb 5, 2021

Well, you know best after you had to manually increase the resource limits yesterday in one of the landscapes, right 😉

From #141 (comment):

I'm mainly worried about downscaling actions performed by VPA. In operations, we observed cases for which a proper upscaling didn't happen after downscaling and in order to recover we needed to delete the VPA object.

Hm, true, OTOH, we use VPA for basically everything and run thousands of clusters, so not sure if this component now is so special. Also, this statement is somewhat older and VPA became better in the meantime. Anyways, the feature is configurable, so if some landscape operator does not like it or it causes too much trouble then it can still be disabled and the old behaviour is preserved. Generally, offering VPA (and also enabling it by default) makes sense.

The admission component directly affects the availability of the Shoot API and thus I'm a bit critical.

True, however, it's more critical if the statically configured resource requirements are not enough anymore and someone gets paged during the night for an incident that could have also auto-resolved via VPA.

/cc @timuthy

@timuthy
Copy link
Member

timuthy commented Feb 5, 2021

Can we combine this change with a secret sync on webhook start? Then we'd at least avoid a resource spike when the webhooks gets the first request after a while.

@rfranzke
Copy link
Member Author

rfranzke commented Feb 5, 2021

@timuthy what do you have in mind? Simply listing all secrets on start-up?

@timuthy
Copy link
Member

timuthy commented Feb 5, 2021

@timuthy what do you have in mind? Simply listing all secrets on start-up?

Yes, I think this will do the trick.

@rfranzke
Copy link
Member Author

rfranzke commented Feb 8, 2021

@timuthy Here you go: 2645570

/squash

@gardener-robot-ci-3 gardener-robot-ci-3 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Feb 8, 2021
@gardener-robot gardener-robot added the merge/squash Should be merged via 'Squash and merge' label Feb 8, 2021
@gardener-robot-ci-2 gardener-robot-ci-2 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Feb 8, 2021
prashanth26
prashanth26 previously approved these changes Feb 8, 2021
Copy link

@prashanth26 prashanth26 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

Thanks for this.

@gardener-robot gardener-robot added reviewed/lgtm Has approval for merging and removed needs/review Needs review labels Feb 8, 2021
@gardener-robot gardener-robot added needs/review Needs review and removed needs/review Needs review labels Feb 12, 2021
@gardener-robot-ci-3 gardener-robot-ci-3 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Feb 12, 2021
Copy link
Member

@vpnachev vpnachev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@vpnachev vpnachev merged commit 33a332b into gardener:master Feb 12, 2021
@rfranzke rfranzke deleted the enh/vpa-admission branch February 15, 2021 07:14
@gardener-robot gardener-robot added priority/3 Priority (lower number equals higher priority) and removed priority/3 Priority (lower number equals higher priority) labels Mar 8, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/auto-scaling Auto-scaling (CA/HPA/VPA/HVPA, predominantly control plane, but also otherwise) related area/scalability Scalability related kind/enhancement Enhancement, improvement, extension merge/squash Should be merged via 'Squash and merge' needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) platform/gcp Google cloud platform/infrastructure reviewed/lgtm Has approval for merging size/s Size of pull request is small (see gardener-robot robot/bots/size.py)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants