Enable fine tuning of backoff at shoot level #176

Open
himanshu-kun opened this issue Feb 24, 2023 · 4 comments
Labels
area/auto-scaling Auto-scaling (CA/HPA/VPA/HVPA, predominantly control plane, but also otherwise) related area/usability Usability related exp/beginner Issue that requires only basic skills kind/enhancement Enhancement, improvement, extension lifecycle/rotten Nobody worked on this for 12 months (final aging stage) needs/planning Needs (more) planning with other MCM maintainers priority/4 Priority (lower number equals higher priority)

Comments

himanshu-kun commented Feb 24, 2023

What would you like to be added:
Currently, the autoscaler provides these flags to fine-tune the backoff mechanism.
We need to expose them at the shoot level, keeping in mind that they are available only from k8s v1.25.0 onwards.
Currently, the following flags are exposed.

The following upstream PR added these flags: kubernetes#3853
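
A minimal sketch of the version gate described above, assuming the three flag names quoted elsewhere in this thread. The function names and the settings dictionary are hypothetical and are not part of any Gardener or autoscaler API.

```python
import re

# Flag names as quoted in this issue; everything else here is hypothetical.
BACKOFF_FLAGS = (
    "initial-node-group-backoff-duration",
    "max-node-group-backoff-duration",
    "node-group-backoff-reset-timeout",
)
MIN_VERSION = (1, 25, 0)  # per the issue: flags exist only from k8s v1.25.0


def parse_version(version):
    """Parse 'v1.25.3' or '1.25.3' into (1, 25, 3), ignoring any suffix."""
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version: {version!r}")
    return tuple(int(part) for part in match.groups())


def backoff_args(k8s_version, settings):
    """Return CLI args for the configured backoff flags.

    Clusters older than MIN_VERSION get no backoff args at all, so an
    autoscaler build that does not know the flags is never passed them.
    """
    if parse_version(k8s_version) < MIN_VERSION:
        return []
    return [f"--{flag}={settings[flag]}" for flag in BACKOFF_FLAGS if flag in settings]
```

For example, `backoff_args("1.24.9", {...})` yields an empty list, while the same settings on `"1.25.0"` yield one `--flag=value` argument per configured flag.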

Why is this needed:

For better usability of autoscaler

@himanshu-kun himanshu-kun added area/usability Usability related kind/enhancement Enhancement, improvement, extension area/auto-scaling Auto-scaling (CA/HPA/VPA/HVPA, predominantly control plane, but also otherwise) related priority/4 Priority (lower number equals higher priority) needs/planning Needs (more) planning with other MCM maintainers exp/beginner Issue that requires only basic skills labels Feb 24, 2023
@gardener-robot gardener-robot added the lifecycle/stale Nobody worked on this for 6 months (will further age) label Nov 8, 2023
@ashwani2k

Grooming Decision:

There is no pull for this feature, hence iceboxing it for now.

@gardener-robot gardener-robot added lifecycle/rotten Nobody worked on this for 12 months (final aging stage) and removed lifecycle/stale Nobody worked on this for 6 months (will further age) labels Sep 25, 2024

dongyingbo commented Oct 21, 2024

What is the relation between these flags and the fast backoff from the ResourceExhausted feature?

@elankath

@dongyingbo The InitialNodeGroupBackoffDuration option, set via initial-node-group-backoff-duration, is the duration of the first backoff after a new node fails to start. It is currently set to 5m and is not configurable. The MaxNodeGroupBackoffDuration option, set via max-node-group-backoff-duration, is the maximum backoff duration for a NodeGroup and is set to 30m. Lastly, the NodeGroupBackoffResetTimeout option, set via node-group-backoff-reset-timeout, is unfortunately set to 3h by default and is currently not configurable.
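
To illustrate how these three values could interact, here is a rough sketch using the 5m, 30m and 3h figures from the comment above. It assumes the backoff doubles after each consecutive failure and that the reset timeout is measured from the last failure. Both are assumptions made for illustration, and the function names are hypothetical.

```python
from datetime import timedelta

INITIAL = timedelta(minutes=5)   # initial-node-group-backoff-duration
MAXIMUM = timedelta(minutes=30)  # max-node-group-backoff-duration
RESET = timedelta(hours=3)       # node-group-backoff-reset-timeout


def next_backoff(previous, since_last_failure):
    """Backoff to apply to a node group after a failed scale-up.

    previous: backoff applied after the last failure, or None if none.
    since_last_failure: time elapsed since that failure, or None.
    """
    if previous is None or since_last_failure > RESET:
        return INITIAL  # first failure, or the failure history has expired
    return min(previous * 2, MAXIMUM)  # assumed doubling, capped at the maximum


def consecutive_failures(count, gap=timedelta(minutes=1)):
    """Backoff durations in minutes for `count` failures spaced `gap` apart."""
    minutes, previous = [], None
    for _ in range(count):
        elapsed = None if previous is None else gap
        previous = next_backoff(previous, elapsed)
        minutes.append(int(previous.total_seconds() // 60))
    return minutes


print(consecutive_failures(5))  # [5, 10, 20, 30, 30]
```

Under these assumptions, a node group that keeps failing reaches the 30m ceiling on its fourth failure and only drops back to 5m once more than 3h pass without a new failure.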

elankath commented Oct 23, 2024

Though this fix will likely expedite CA resolution when capacity is added, it will not ameliorate the edge case of falling back to an alternate node group when capacity is not added. When the CA manages a large number of NodeGroups, this fallback can be delayed.
