ProvisioningRequest booked capacity is scaled down (and shouldn't be) #6517

Closed
yaroslava-serdiuk opened this issue Feb 8, 2024 · 5 comments · Fixed by #6880
Labels: area/cluster-autoscaler · area/core-autoscaler (Denotes an issue that is related to the core autoscaler and is not specific to any provider.) · kind/feature (Categorizes issue or PR as related to a new feature.)

Comments

@yaroslava-serdiuk
Contributor

yaroslava-serdiuk commented Feb 8, 2024

Which component are you using?:
cluster-autoscaler

Is your feature request designed to solve a problem? If so describe the problem this feature should solve.:
When a ProvisioningRequest of the check-capacity provisioning class is marked Provisioned, Cluster Autoscaler reserves this capacity for a short period of time, hiding it from other ProvisioningRequests, in order to make accurate decisions across multiple ProvisioningRequests.
However, this capacity is currently not protected from scale down, so we might end up in a situation where CA marks a ProvisioningRequest as Provisioned and then scales down the capacity, leaving the ProvisioningRequest without capacity.

Describe the solution you'd like:
In order not to scale down the booked capacity, I suggest to:

  • add the capacity-booking mechanism to scale down as well (i.e. schedule in-memory pods for Provisioned ProvisioningRequests into the clusterSnapshot; see the sketch after this list). However, in each loop the pods could be scheduled on different nodes, preventing those nodes from accumulating the unneeded time required for scale down, so
  • ignore the in-memory ProvisioningRequest pods when calculating node utilisation, so that the scale down simulation will try to schedule ProvisioningRequest pods on a different node.
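A minimal sketch of the booking step from the first bullet, assuming a hypothetical bookCapacity helper and podScheduler interface (the real ClusterSnapshot, ProvisioningRequest wrapper and scheduling code in cluster-autoscaler differ); only the idea of materialising booked capacity as deterministic in-memory pods is taken from this issue:

import (
    "fmt"

    apiv1 "k8s.io/api/core/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/apimachinery/pkg/types"
)

// podScheduler stands in for the snapshot-backed scheduling CA already uses
// (e.g. the hinting simulator); the interface is an assumption for this sketch.
type podScheduler interface {
    TrySchedule(pod *apiv1.Pod) (nodeName string, err error)
}

// bookCapacity (hypothetical) builds in-memory pods for a Provisioned
// check-capacity ProvisioningRequest and schedules them into the cluster
// snapshot, so that the scale down simulation treats the booked capacity as
// occupied. template and count would come from the ProvisioningRequest's PodSets.
func bookCapacity(prNamespace, prName string, template apiv1.PodSpec, count int, sched podScheduler) error {
    for i := 0; i < count; i++ {
        pod := &apiv1.Pod{
            ObjectMeta: metav1.ObjectMeta{
                // Deterministic name and UID so the fake pod maps to the same
                // scheduling hint key in every autoscaler loop.
                Name:      fmt.Sprintf("%s-booking-%d", prName, i),
                Namespace: prNamespace,
                UID:       types.UID(fmt.Sprintf("%s/%s-booking-%d", prNamespace, prName, i)),
            },
            Spec: template,
        }
        if _, err := sched.TrySchedule(pod); err != nil {
            return err
        }
    }
    return nil
}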

Additional context:
Capacity booking is not limited to the check-capacity ProvisioningClass; this mechanism will be used for the atomicScaleUp ProvisioningClass as well.

@yaroslava-serdiuk added the kind/feature label on Feb 8, 2024
@yaroslava-serdiuk
Contributor Author

/cc @kisieland @mwielgus @x13n

@yaroslava-serdiuk changed the title from "ProvisioningRequest booked capacity is not scaled down" to "ProvisioningRequest booked capacity is scaled down (and shouldn't be)" on Feb 8, 2024
@x13n
Member

x13n commented Feb 9, 2024

I think hinting should be sufficient to stabilize fake pod -> node assignments as long as fake pod UIDs don't change between loops:

func (s *HintingSimulator) findNodeWithHints(clusterSnapshot clustersnapshot.ClusterSnapshot, pod *apiv1.Pod, isNodeAcceptable func(*schedulerframework.NodeInfo) bool) (string, error) {
    hk := HintKeyFromPod(pod)
    if hintedNode, hasHint := s.hints.Get(hk); hasHint {
        if err := s.predicateChecker.CheckPredicates(clusterSnapshot, pod, hintedNode); err == nil {
            s.hints.Set(hk, hintedNode)
            nodeInfo, err := clusterSnapshot.NodeInfos().Get(hintedNode)
            if err != nil {
                return "", err
            }
            if isNodeAcceptable(nodeInfo) {
                return hintedNode, nil
            }
        }
    }
    return "", nil
}
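For the stability mentioned here, the key point is that the hint key stays the same between loops. A small sketch of the assumed HintKeyFromPod behaviour (the actual code lives in cluster-autoscaler's simulator/scheduling package), keying on the pod UID and falling back to namespace/name:

// hintKey mirrors the assumed behaviour of HintKeyFromPod: key on the pod UID
// when it is set, otherwise on namespace/name (apiv1 is k8s.io/api/core/v1).
// As long as the fake ProvisioningRequest pods are recreated with the same UID
// (or name) every loop, the lookup above returns the same hinted node and the
// fake pod -> node assignment stays stable.
func hintKey(pod *apiv1.Pod) string {
    if pod.UID != "" {
        return string(pod.UID)
    }
    return pod.Namespace + "/" + pod.Name
}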

I'm not sure I understand this:

ignore the in-memory ProvisioningRequest pods when calculating node utilisation, so that the scale down simulation will try to schedule ProvisioningRequest pods on a different node.

If you schedule PR pods into the snapshot, they should already be considered in scale down simulation (and potentially moved to another node). What do you want to gain by additionally lowering utilization of such nodes?

@yaroslava-serdiuk
Contributor Author

Thanks!

If you schedule PR pods into the snapshot, they should already be considered in scale down simulation (and potentially moved to another node). What do you want to gain by additionally lowering utilisation of such nodes?

If the utilisation threshold is low, scheduling fake pods can make nodes appear not underutilised and so block them from scale down.
But you're probably right, this is not needed; the user should set an appropriate utilisation threshold.

@x13n
Member

x13n commented Feb 12, 2024

If the utilisation threshold is low, scheduling fake pods can make nodes appear not underutilised and so block them from scale down.

Isn't this fine? Scale down will make sure that after nodes are removed, there's still enough capacity for all the pods (including fake ones). Utilization threshold is really only a heuristic for scale down to check if a given node should be considered in scale down simulations at all.
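To illustrate why the threshold is only a pre-filter, here is a simplified sketch of the utilisation check, assuming utilisation is summed pod requests over node allocatable (the real calculation in cluster-autoscaler handles more cases, e.g. other resources and special pod types):

import (
    apiv1 "k8s.io/api/core/v1"
    "k8s.io/apimachinery/pkg/api/resource"
)

// cpuUtilization returns the fraction of the node's allocatable CPU requested
// by the given pods (including any fake ProvisioningRequest pods scheduled
// into the snapshot). Simplified sketch; not the actual CA implementation.
func cpuUtilization(node *apiv1.Node, pods []*apiv1.Pod) float64 {
    requested := resource.Quantity{}
    for _, pod := range pods {
        for _, c := range pod.Spec.Containers {
            if cpu, ok := c.Resources.Requests[apiv1.ResourceCPU]; ok {
                requested.Add(cpu)
            }
        }
    }
    allocatable := node.Status.Allocatable[apiv1.ResourceCPU]
    if allocatable.IsZero() {
        return 0
    }
    return float64(requested.MilliValue()) / float64(allocatable.MilliValue())
}

// A node is skipped as a scale down candidate when its utilisation is at or
// above the threshold; nodes below it still go through a full scheduling
// simulation (fake pods included) before being removed:
//
//    if cpuUtilization(node, pods) >= scaleDownUtilizationThreshold {
//        // keep the node; don't even simulate its removal
//    }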

@towca added the area/cluster-autoscaler and area/core-autoscaler labels on Mar 21, 2024
@yaroslava-serdiuk
Contributor Author

/assign
