
Autoscaler interfering in meltdown scenario solution #741

Open
himanshu-kun opened this issue Aug 16, 2022 · 4 comments
Assignees
himanshu-kun
Labels
  • area/disaster-recovery (Disaster recovery related)
  • area/high-availability (High availability related)
  • area/robustness (Robustness, reliability, resilience related)
  • effort/2w (Effort for issue is around 2 weeks)
  • kind/bug
  • kind/design
  • lifecycle/rotten (Nobody worked on this for 12 months, final aging stage)
  • needs/planning (Needs (more) planning with other MCM maintainers)
  • priority/2 (Priority, lower number equals higher priority)

Comments

@himanshu-kun
Contributor

himanshu-kun commented Aug 16, 2022

How to categorize this issue?

/area performance
/kind bug
/priority 2

What happened:
Autoscaler's fixNodeGroupSize logic interferes with the meltdown-protection logic, where we replace only maxReplacement machines per machinedeployment at a time; the autoscaler ends up removing the other Unknown machines as well.

What you expected to happen:
Even when the autoscaler takes the DecreaseTargetSize decision, it should not be able to remove Unknown machines, because the node object is actually present for them.
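
A minimal, self-contained sketch of the expected guard (illustrative names only, not the real MCM cloud-provider code): DecreaseTargetSize should refuse to shrink the target below the number of machines that still have a node object, which would cover the Unknown machines here.

package main

import "fmt"

// nodeGroup is a hypothetical, minimal stand-in for an autoscaler node group;
// it is not the real MCM cloud-provider implementation.
type nodeGroup struct {
	targetSize      int
	registeredNodes int // machines that still have a Node object (Running or Unknown)
}

// DecreaseTargetSize sketches the expected behaviour: never shrink the target
// below the number of machines that are still backed by a node object.
func (ng *nodeGroup) DecreaseTargetSize(delta int) error {
	if delta >= 0 {
		return fmt.Errorf("delta must be negative, got %d", delta)
	}
	if ng.targetSize+delta < ng.registeredNodes {
		return fmt.Errorf("cannot shrink below %d registered nodes", ng.registeredNodes)
	}
	ng.targetSize += delta
	return nil
}

func main() {
	// Both machines still have node objects; one of them is Unknown due to the zone outage.
	ng := &nodeGroup{targetSize: 2, registeredNodes: 2}
	if err := ng.DecreaseTargetSize(-1); err != nil {
		fmt.Println("refused:", err)
	}
}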

How to reproduce it (as minimally and precisely as possible):

  • Create a machinedeployment with 2 replicas (it is assumed the autoscaler is enabled for the cluster)
  • Block all traffic to/from the zone the machinedeployment is for
  • With the default maxReplacement, 1 node will stay in Pending state
  • After around 20 min, the Unknown machine is deleted when the autoscaler fixes the node group size by reducing the machinedeployment replicas to 1

Anything else we need to know?:
This is happening because of the way the machineSet controller prioritizes machines for deletion based on their phase:

m := map[v1alpha1.MachinePhase]int{
	v1alpha1.MachineTerminating:      0, // lower value means the machine is deleted first
	v1alpha1.MachineFailed:           1,
	v1alpha1.MachineCrashLoopBackOff: 2,
	v1alpha1.MachineUnknown:          3,
	v1alpha1.MachinePending:          4,
	v1alpha1.MachineAvailable:        5,
	v1alpha1.MachineRunning:          6,
}

*For a solution, we need to look into any other implications of prioritizing Pending machines over Unknown machines for deletion. A sketch of how the current ordering plays out is shown below.
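
A minimal, self-contained sketch (illustrative stand-in types, not the actual machineSet controller code) of how this phase-priority map translates into the deletion order observed above:

package main

import (
	"fmt"
	"sort"
)

// MachinePhase and the priority map below are illustrative stand-ins for the
// v1alpha1 types referenced in the snippet above, not the real imports.
type MachinePhase string

const (
	MachineUnknown MachinePhase = "Unknown"
	MachinePending MachinePhase = "Pending"
	MachineRunning MachinePhase = "Running"
)

// Lower value means the machine is picked for deletion first.
var phasePriority = map[MachinePhase]int{
	MachineUnknown: 3,
	MachinePending: 4,
	MachineRunning: 6,
}

type machine struct {
	name  string
	phase MachinePhase
}

// pickForDeletion orders machines by phase priority and returns the first candidate.
func pickForDeletion(machines []machine) machine {
	sort.SliceStable(machines, func(i, j int) bool {
		return phasePriority[machines[i].phase] < phasePriority[machines[j].phase]
	})
	return machines[0]
}

func main() {
	machines := []machine{
		{name: "pending-machine", phase: MachinePending}, // the one fixNodeGrpSize meant to remove
		{name: "unknown-machine", phase: MachineUnknown}, // node object exists, zone unreachable
	}
	// With the current ordering (Unknown: 3 < Pending: 4) the Unknown machine is chosen.
	fmt.Println("deleted first:", pickForDeletion(machines).name)
}

Swapping the Unknown and Pending values in the map would make the Pending machine the first candidate instead.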

Environment:

  • Kubernetes version (use kubectl version):
  • Cloud provider or hardware configuration:
  • Others:
    CA version 1.23.1
@gardener-robot gardener-robot added area/performance Performance (across all domains, such as control plane, networking, storage, etc.) related priority/2 Priority (lower number equals higher priority) labels Aug 16, 2022
@himanshu-kun
Contributor Author

cc @unmarshall

@himanshu-kun
Contributor Author

fixNodeGrpSize only understands Registered and Non-Registered nodes; it doesn't do anything even if a node joins and stays NotReady for a long time.
Here the intention of fixNodeGrpSize was to remove the Pending machine, but because of our preference in the machineSet controller it removes the Unknown machine.
fixNodeGrpSize currently only acts when the RemoveLongUnregistered logic is not able to remove the long-unregistered nodes because the node group is already at the minimum.
Also, if we are not at the minimum node group size, the RemoveLongUnregistered logic wouldn't delete the Unknown machine, because it uses the priority annotation to pinpoint the machine it wants to delete, and as per the machineSet preference the priority annotation is preferred over the machine phase.
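
A minimal sketch of that preference order (the annotation key and all names are illustrative, not the exact MCM identifiers): a machine carrying the deletion-priority annotation is ranked ahead of any machine selected purely by phase.

package main

import (
	"fmt"
	"sort"
)

// Illustrative annotation key; the real MCM priority annotation key may differ.
const priorityAnnotation = "machinepriority.machine.sapcloud.io"

// phasePriority mirrors the map from the issue description (lower value deletes first).
var phasePriority = map[string]int{"Unknown": 3, "Pending": 4, "Running": 6}

type machine struct {
	name        string
	phase       string
	annotations map[string]string
}

// deletionRank sketches the preference described above: a machine explicitly
// annotated for prioritized deletion is ranked ahead of any phase-based candidate.
func deletionRank(m machine) int {
	if m.annotations[priorityAnnotation] == "1" {
		return 0 // annotation wins over phase
	}
	return phasePriority[m.phase]
}

func main() {
	machines := []machine{
		{name: "unknown-machine", phase: "Unknown"},
		{name: "pending-machine", phase: "Pending",
			annotations: map[string]string{priorityAnnotation: "1"}}, // pinpointed by RemoveLongUnregistered
	}
	sort.SliceStable(machines, func(i, j int) bool {
		return deletionRank(machines[i]) < deletionRank(machines[j])
	})
	// pending-machine is deleted first, even though Unknown has the lower phase priority.
	fmt.Println("deleted first:", machines[0].name)
}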

@himanshu-kun
Contributor Author

Prioritizing Pending machine removal over Unknown machines would make more sense because:

@himanshu-kun
Contributor Author

Also, if we are not at the minimum node group size, the RemoveLongUnregistered logic wouldn't delete the Unknown machine, because it uses the priority annotation to pinpoint the machine it wants to delete, and as per the machineSet preference the priority annotation is preferred over the machine phase.

But the RemoveLongUnregistered/RemoveOldUnregistered logic will remove the Pending machine once the autoscaler's maxNodeProvisionTimeout runs out, treating it as long unregistered. This again kicks off a loop where the machinedeployment size is reduced, the meltdown logic again turns maxReplacement machines into Pending, and it finally stops when the node group min size is reached, as at that point the RemoveLongUnregistered logic stops.

The ideal solution is to make the autoscaler aware that the meltdown logic is in play because of an outage in the zone, so that it doesn't interfere.
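
One possible shape of that awareness (purely hypothetical; no such field or API exists in CA or MCM today) is a per-node-group signal that meltdown handling is active, which the size-fixing path checks before shrinking the group:

package main

import "fmt"

// nodeGroupInfo is a hypothetical summary of a node group as seen by the
// autoscaler's size-fixing path; none of these fields exist in CA today.
type nodeGroupInfo struct {
	name              string
	targetSize        int
	registeredNodes   int
	meltdownProtected bool // hypothetical signal that MCM's meltdown logic is handling this group
}

// fixSize sketches the desired behaviour: skip groups where the meltdown
// logic is in play instead of shrinking them to match registered nodes.
func fixSize(ng nodeGroupInfo) (newTarget int, acted bool) {
	if ng.meltdownProtected {
		return ng.targetSize, false // leave the group to MCM's maxReplacement handling
	}
	if ng.registeredNodes < ng.targetSize {
		return ng.registeredNodes, true
	}
	return ng.targetSize, false
}

func main() {
	ng := nodeGroupInfo{name: "zone-a", targetSize: 2, registeredNodes: 1, meltdownProtected: true}
	target, acted := fixSize(ng)
	fmt.Printf("group %s: target=%d acted=%v\n", ng.name, target, acted)
}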

@himanshu-kun himanshu-kun added area/disaster-recovery Disaster recovery related area/robustness Robustness, reliability, resilience related area/high-availability High availability related kind/design needs/planning Needs (more) planning with other MCM maintainers effort/2w Effort for issue is around 2 weeks and removed area/performance Performance (across all domains, such as control plane, networking, storage, etc.) related labels Feb 17, 2023
@himanshu-kun himanshu-kun self-assigned this Mar 31, 2023
@gardener-robot gardener-robot added the lifecycle/stale Nobody worked on this for 6 months (will further age) label Dec 13, 2023
@gardener-robot gardener-robot added lifecycle/rotten Nobody worked on this for 12 months (final aging stage) and removed lifecycle/stale Nobody worked on this for 6 months (will further age) labels Aug 21, 2024