
Provide support for explicitly pausing autoscaling of workloads. #944

Closed
tomkerkhove opened this issue Jul 22, 2020 · 119 comments · Fixed by kedacore/keda-docs#728
Labels
feature-request (All issues for new features that have not been committed to) · help wanted (Looking for support from community) · needs-discussion

Comments

@tomkerkhove
Member

tomkerkhove commented Jul 22, 2020

Provide support for explicitly forcing workloads to scale to zero, without the option of scaling up.

This can be useful for manually scaling instances to zero because:

  • You want to do maintenance
  • Your cluster is suffering from resource starvation and you want to remove non-mission-critical workloads

Why not delete the deployment? Glad you asked! Because we don't want to touch the applications themselves, but merely remove the instances they are running, from an operational perspective. Once everything is good to go, we can enable them to scale again.

Suggestion

Introduce a new CRD, for example ManualScaleToZero, which targets a given deployment/workload and provides a description of why it's scaled to 0 for now.

If scaled objects/jobs are configured, they are ignored in favor of the new CRD.
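
A minimal sketch of what that could look like. The ManualScaleToZero kind, its apiVersion, and every field below are hypothetical, drawn only from the suggestion above; nothing here shipped as-is:

apiVersion: keda.sh/v1alpha1
kind: ManualScaleToZero        # hypothetical kind from the suggestion above
metadata:
  name: myapp-maintenance
spec:
  scaleTargetRef:
    name: myapp                # the deployment/workload to force to zero
  reason: "Planned maintenance window; re-enable scaling by deleting this object"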

@tomkerkhove tomkerkhove added needs-discussion feature-request All issues for new features that have not been committed to labels Jul 22, 2020
@flecno
Contributor

flecno commented Jul 22, 2020

I like the idea of a separate CRD, especially because of the description field; leaving the ScaledObject untouched fits better with my GitOps use case with Argo CD autosync.

@galan

galan commented Jul 23, 2020

I also support this idea and what @flecno said; however, I would like to see a more generic CRD where you can not only scale to zero, but also enforce any value you like. Maybe call it ManualScale or ScaledObjectOverride.

This would also help us in certain rare situations where we have to process more, independent of the triggers.

@tomkerkhove
Member Author

The goal with this separate CRD is to override the ScaledObject.

With the manual approach, what would be the scenario where you cannot just scale the deployment itself, let's say?

@galan

galan commented Jul 23, 2020

Scaling to zero or scaling to n: in both cases you enforce a fixed number of pods, where the autoscaling is not supposed to interfere.

Using kubectl scale <type> <name> --replicas=<n> doesn't work with zero, nor with any other value submitted. Right after scaling to another value, the HPA controller kicks in and restores the old value.

Example ScaledObject:

apiVersion: keda.k8s.io/v1alpha1
kind: ScaledObject
metadata:
  labels:
    deploymentName: myapp
  name: myapp
  ...
spec:
  cooldownPeriod: 600
  maxReplicaCount: 8
  minReplicaCount: 0
  pollingInterval: 30
  scaleType: deployment
  triggers:
  ...

The current scale is 1; trying to scale to 3 with kubectl scale deployment myapp --replicas=3 starts 2 additional pods, but they are terminated immediately:

0s   Normal    ScalingReplicaSet    deployment/myapp                  Scaled up replica set myapp-5cbd86475b to 3
0s   Normal    SuccessfulCreate     replicaset/myapp-5cbd86475b       Created pod: myapp-5cbd86475b-wxjbj
0s   Normal    SuccessfulCreate     replicaset/myapp-5cbd86475b       Created pod: myapp-5cbd86475b-257k4
0s   Normal    SuccessfulRescale    horizontalpodautoscaler/myapp     New size: 1; reason: Current number of replicas above Spec.MaxReplicas
0s   Normal    ScalingReplicaSet    deployment/myapp                  Scaled down replica set myapp-5cbd86475b to 1
0s   Normal    SuccessfulDelete     replicaset/myapp-5cbd86475b       Deleted pod: myapp-5cbd86475b-257k4
0s   Normal    SuccessfulDelete     replicaset/myapp-5cbd86475b       Deleted pod: myapp-5cbd86475b-wxjbj

And the spec of the KEDA-generated HPA is not above maxReplicas:

spec:
  maxReplicas: 8
  minReplicas: 1

So all I wanted to suggest is that a more generic override with any value would be appreciated (not only zero).

@bcorijn

bcorijn commented Nov 18, 2020

Being able to suspend scaling during operations without removing or touching object state would be ideal. With our HPA setup, we could scale down the target object to 0, without touching any of the min/max parameters of the HPA, and allow the HPA to kick back in afterwards by scaling the target back up to 1.
With the current KEDA approach, we need to keep track of those min/max values while suspending scaling.
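
For reference, a sketch of that plain-HPA workflow (the deployment name is a placeholder). Scaling the target to 0 effectively suspends the HPA, since the HPA controller skips targets at zero replicas, and scaling back to 1 hands control back to it:

kubectl scale deployment myapp --replicas=0   # suspend: the HPA ignores a target at 0 replicas
kubectl scale deployment myapp --replicas=1   # resume: the HPA picks the target back up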

@tomkerkhove
Member Author

Agreed. Would you prefer a new CRD for that, or what would you expect from KEDA?

@bcorijn

bcorijn commented Nov 18, 2020

@tomkerkhove a CRD would work for sure! It feels a bit elaborate compared to just setting an annotation/label on one of the involved objects, but it also leaves room for much more functionality I guess?

@zroubalik
Member

zroubalik commented Nov 18, 2020

I am more inclined towards an annotation/label-based solution: it is cleaner and we don't introduce a new set of resources.

Out of curiosity, what benefits do you see behind the CRD-based solution?

@tomkerkhove
Member Author

Oh, I was just checking what the expectations were; either is fine for me, but it would be good if this got surfaced somehow when doing kubectl get so, so maybe we should add it as a field instead?

@bcorijn

bcorijn commented Nov 19, 2020

Exposing it in the get would be nice for sure. At first glance I thought the Active field was an indication of this feature, but it turned out it has a different meaning.
I guess the main question is: do you just want to be able to "suspend" KEDA scaling for a while and take manual control (with an HPA or just by scaling the target object), or do you want to explicitly set a replica count through this new field/CRD? I personally don't really see the need for this last option, as it's just replicating native functionality.

@tomkerkhove
Member Author

do you want to explicitly set a replica count through this new field/CRD? I personally don't really see the need for this last option, as it's just replicating native functionality.

This is already possible today by aligning the min/max replica counts.

I guess the main question is: do you just want to be able to "suspend" KEDA scaling for a while and take manual control (with an HPA or just by scaling the target object)

My current thinking is actually to have a State field with values like Autoscaling & Paused.
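
Sketched on a ScaledObject, that hypothetical field might look like the snippet below; the state field and its values are purely illustrative and were never adopted in this form:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: myapp
spec:
  state: Paused               # hypothetical field: Autoscaling (default) or Paused
  scaleTargetRef:
    name: myapp
  triggers:
    - type: cpu               # placeholder trigger
      metricType: Utilization
      metadata:
        value: "60"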

@bkruger99

Ideally, it would be something where granular permission could be given, so that someone on-call can temporarily disable scaling without touching the real object or requiring any Git re-run, etc. But yeah, big +1 to this request.

@tomkerkhove
Member Author

@zroubalik OK for you if we commit to this for our roadmap?

@zroubalik
Member

@tomkerkhove yeah :)

@tomkerkhove tomkerkhove modified the milestone: v2.0 Jan 4, 2021
@Alexander-Bartosh

Alexander-Bartosh commented Jan 12, 2021

Guys, would implementing something like this handle this case?
#1500
If you can define "maintenance" in scaler terms, things will work automatically.

@jeffhollan jeffhollan added the help wanted Looking for support from community label Jan 20, 2021
@tomkerkhove
Member Author

Shall we aim for 2.3 for this, @jeffhollan @zroubalik?

@coderanger
Contributor

My current thinking is actually to have a State field with values like Autoscaling & Paused.

As prior art, CronJob objects have a boolean suspend field for this.
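
For reference, that prior art is the spec.suspend field on CronJob; the schedule and job body below are minimal placeholders:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: example
spec:
  suspend: true               # the controller stops creating new Jobs, but the object stays in place
  schedule: "*/5 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: example
              image: busybox
              command: ["echo", "hello"]
          restartPolicy: OnFailure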

@tomkerkhove
Member Author

That's fine by me to have that or something similar.

@zroubalik
Member

Fine by me :) I am still not sure whether we should just explicitly scale to 0 (then I am up for using the same property as CronJob does) or whether we should scale to some specific number of replicas (this would require a different property).

@tomkerkhove
Member Author

Based on what I've heard and seen, I think we should make it explicitly scale to X and "disable/pause" autoscaling without removing the ScaledObject.

Then you can do maintenance jobs or so and just put it back to autoscaling when done.

@derekperkins

One feature that would be nice to have is a duration for the scaling override. It can be all too easy to scale things to 0 when something happens, and then somebody forgets to scale it back up. Being able to say "scale to 0, then return to normal in 4 hours" would solve that problem nicely.

@coderanger
Contributor

@derekperkins That really doesn't work great in a convergent system :-/ At best you can do "disregard this config after timestamp X", but it's an inherently non-convergent behavior, so the edge cases get gnarly.

@aryan9600
Contributor

aryan9600 commented Apr 20, 2022

@zroubalik how so? Is it because is-paused seems a more appropriate term for reporting rather than for specifying intent?

@zroubalik
Member

zroubalik commented Apr 20, 2022

Exactly. And a second concern (though minor and just nitpicking): I don't like the complexity of the annotation name: is it isPaused, is-paused, or is_paused?

@tomkerkhove
Member Author

Fine by me, it was just an idea

@tomkerkhove
Member Author

As we're planning KEDA v2.7 next week, I'll keep this open until we have ScaledJob support.

@tomkerkhove
Member Author

So to summarize: we will support ScaledJobs through autoscaling.keda.sh/paused: true, but did we land on an agreement for ScaledObjects on this, @JorTurFer @zroubalik?

Do we support it as well? Keeping in mind that this would be the order of application:

  1. autoscaling.keda.sh/paused-replicas - Stops autoscaling and pins the workload to the given amount
  2. autoscaling.keda.sh/is-paused - Stops autoscaling and just keeps the current count
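
For illustration, here is how the first annotation sits on a ScaledObject today; the replica value and the cpu trigger are placeholders, and is-paused is the name still under discussion in this thread:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: myapp
  annotations:
    autoscaling.keda.sh/paused-replicas: "5"   # stop autoscaling and pin the target at 5 replicas
spec:
  scaleTargetRef:
    name: myapp
  triggers:
    - type: cpu                                # placeholder trigger; ignored while paused
      metricType: Utilization
      metadata:
        value: "60"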

@JorTurFer
Member

TBH I prefer autoscaling.keda.sh/paused instead of autoscaling.keda.sh/is-paused, but I can live with it.
I agree with the rest. My doubt is whether we should append the other annotation if only one is set.
I mean:

  • if only autoscaling.keda.sh/is-paused is set, we could add autoscaling.keda.sh/paused-replicas: currentReplicas
  • if only autoscaling.keda.sh/paused-replicas: N is set, we could add autoscaling.keda.sh/is-paused

Using this approach, we'd always be consistent in using both annotations (avoiding logic problems if only one is set), and we could keep the already existing behavior without any change in how we manage the replicas, only in how we manage the annotations themselves. We would also "correct" misconfiguration from our users, giving them the "correct" solution.
WDYT?

@tomkerkhove
Member Author

The risk you have there is this:

  1. End-user adds autoscaling.keda.sh/paused-replicas: N
  2. KEDA automatically adds autoscaling.keda.sh/paused
  3. End-user wants to scale again, so removes autoscaling.keda.sh/paused-replicas: N
  4. KEDA still doesn't scale because autoscaling.keda.sh/paused is there

We could, in theory, automatically remove autoscaling.keda.sh/paused when autoscaling.keda.sh/paused-replicas is removed, but the question is whether that wouldn't be confusing/dangerous.

@JorTurFer
Member

You are right, but if we don't add the extra needed annotation, I'd require both for doing anything. I mean, even though we could work using only autoscaling.keda.sh/paused-replicas in current cases, I'd require both, to be consistent and not have more than one way to pause the autoscaling.

@tomkerkhove
Member Author

I'm not sure I understand what you mean, can you elaborate please?

@JorTurFer
Member

Sure,
Right now, we support autoscaling.keda.sh/paused-replicas in ScaledObject, and we are talking about introducing autoscaling.keda.sh/paused in ScaledJob and also in ScaledObject to be consistent between both CRDs.
With this new annotation, ScaledObject will have 2 different annotations that can be used: one for pausing the scaling at the current replicas, and another one for pausing the autoscaling at a specific amount of instances.

My proposal is to require both in the case of ScaledObjects, instead of only one of them, to be as clear as possible: rather than having 2 different annotations that can be used for different things, just require both every time, autoscaling.keda.sh/paused and autoscaling.keda.sh/paused-replicas, to pause the ScaledObject.

This could simplify the logic in the ScaledObject (only one scenario instead of 2) and doesn't require any other action from our side, unlike my previous comment.

@tomkerkhove
Member Author

My proposal is to require both in the case of ScaledObjects, instead of only one of them, to be as clear as possible: rather than having 2 different annotations that can be used for different things, just require both every time, autoscaling.keda.sh/paused and autoscaling.keda.sh/paused-replicas, to pause the ScaledObject.

I don't think that really makes sense nor adds value:

  • In the case of pausing with fixed replicas, we can add the new annotation but it doesn't provide much added value other than consistency at the cost of duplication. Also, what do you do if only 1 of them is specified?
  • In the case of pausing without wanting to specify a replica count, this would no longer be possible, given the above proposal requires both annotations, so the use case we want to fix is not fixed

The logic around this should be very straightforward and is fairly simple IMO:

  • If autoscaling.keda.sh/paused-replicas is specified, stop autoscaling and pin the workload to the given amount
  • Otherwise, if autoscaling.keda.sh/is-paused is there, just stop autoscaling and keep the current instance count
  • Otherwise, keep scaling 🚀🚀🚀

The logic is straightforward and easy to document.

@JorTurFer
Member

Okay, we can start with that and iterate if we find problems.

@tomkerkhove
Member Author

The logic around this should be very straightforward and is fairly simple IMO:

  • If autoscaling.keda.sh/paused-replicas is specified, stop autoscaling and pin the workload to the given amount
  • Otherwise, if autoscaling.keda.sh/is-paused is there, just stop autoscaling and keep the current instance count
  • Otherwise, keep scaling 🚀🚀🚀

The logic is straightforward and easy to document.

So we'll start with this approach then? If so, I'll create a new issue for support in scaled jobs and one for this new flavor in scaled object.

@DanInProgress

@tomkerkhove I have some spare cycles this week and could take on implementation for ScaledJob if you've not started already.

We currently need the ability to pause/prevent execution of cron-triggered ScaledJobs, and working on a pull request seems less kludgy than any other workaround we've discussed.

@tomkerkhove
Member Author

As far as I know, that's not been implemented yet so feel free to take a stab at it!

@arnoldyahad

@tomkerkhove @DanInProgress
Hey guys,
We are looking to integrate KEDA at our organization.
Do you know if it's possible at the moment to specify autoscaling.keda.sh/paused-replicas: currentReplicas?
(Using currentReplicas as a literal string, not specifying the actual replica count.)
If not, is it something that is going to be implemented in this issue, or should I open a new issue for that feature?

@tomkerkhove
Member Author

So you want to basically pause autoscaling, without knowing what the current replica count is then, @arnoldyahad?

@arnoldyahad

Thanks for the very quick reply @tomkerkhove
Yes, exactly. We would like to pause autoscaling as-is, without specifying the current replica count.

Our use case right now is with spot.io and EC2 instances, where we can just press "pause autoscaling" in spot.io and it pauses all autoscaling activities immediately. This is important to us because when production goes down, CPU drops to 0% and we begin a massive scale-down, so we have an API call to spot.io to suspend the autoscaling.

We would like our devs to be able to respond quickly enough to such incidents by just putting on an annotation like autoscaling.keda.sh/paused-replicas: currentReplicas or autoscaling.keda.sh/paused-replicas: pause.

@JorTurFer
Member

Can't they just get the current replicas and set them? I mean, they can just run

kubectl get deploy {DEPLOYMENT_NAME} -n {NAMESPACE} -o=jsonpath='{.status.replicas}'

and get the value from it. If they can annotate the ScaledObject, I assume they can also get the current replica count.

@tomkerkhove
Member Author

As I've mentioned before: while specifying a paused replica count is useful in some scenarios, in other scenarios you don't really care what the current instance count is and just want to pause where you are today.

Using the command above helps with that, but it is an additional step that we can/should avoid, to make life easier for @arnoldyahad and others.

Our tagline is to make application autoscaling simpler, and in this case it's friction we can remove easily.

@tomkerkhove
Member Author

The logic around this should be very straightforward and is fairly simple IMO:

  • If autoscaling.keda.sh/paused-replicas is specified, stop autoscaling and pin the workload to the given amount
  • Otherwise, if autoscaling.keda.sh/is-paused is there, just stop autoscaling and keep the current instance count
  • Otherwise, keep scaling 🚀🚀🚀

The logic is straightforward and easy to document.

So we'll start with this approach then? If so, I'll create a new issue for support in scaled jobs and one for this new flavor in scaled object.

This is what we landed on AFAIK, but it never got final "sign off", so I did not create the issues yet but would love to create them.

@arnoldyahad

@JorTurFer Thanks for the comment, meanwhile we will indeed use a workaround.

If someone is interested:

replicas=$(kubectl get deploy <deployment> -n <NS> -o=jsonpath='{.status.replicas}') && kubectl annotate scaledobject -n <NS> <scaledObject name> "autoscaling.keda.sh/paused-replicas=$replicas"

@tomkerkhove
Member Author

Created #3303 & #3304.

Repository owner moved this from Done to Ready To Ship in Roadmap - KEDA Core Jun 30, 2022
@tomkerkhove tomkerkhove moved this from Ready To Ship to Done in Roadmap - KEDA Core Aug 3, 2022