Job-based Service Bus Scaler scales to too many instances #4554
Comments
pollingInterval should have no relationship to the count of Pods that are already running. If I have 2 Pods already running and 5 Messages in the Queue, then I need the scale-out to fire up only 3 new Pods.
Could you enable the debug logs and share them? The operator logs in debug expose the queue length and the current job count.
Please instruct me on how to enable the debug logs. I'd be more than glad to do so.
Thank you, Eugen
I have the bandwidth now to address this issue. What precisely would you like me to do? Perhaps the steps below?
The behavior I'd expect is that if I already have a Job running and I send a Message into the Queue, a second Job does not start up; instead, the currently running Job handles that one Message.
I think that this shouldn't happen.
This is exactly the behavior I'd expect. Isn't this happening?
@JorTurFer No, it does not happen. If I have one Pod running - as per the minReplicaCount setting - and then I send a Message, I see a second Pod starting up. I've also tried it with minReplicaCount set to 2 and sent 5 Messages. The end result was that I got 7 Pods running, whereas only 5 would have been sufficient to process the 5 Messages.
@zroubalik, @tomkerkhove, is this behavior intended and I'm missing something, or is this a bug? I have checked the e2e tests and they cover this scenario.
I thought about this a bit more and it may be a feature rather than a bug. Let's say that I configure a ScaledJob to have a minReplicaCount of 4. By this I express my desire to always have 4 Jobs on stand-by, ready to receive Messages. 2 Messages pop up, so two of my initial 4 Jobs become busy processing them and are no longer available. In response, the ScaledJob starts up two new Jobs immediately, in order to ensure that 4 Jobs will be available again soon. Does this reasoning sound right to you guys?
I thought so; that's why I asked other teammates, because that's the behavior covered by the e2e tests. Maybe it's just a documentation gap, but I'm not sure.
Thank you. Let's see what response we'll receive. However, since there are tests that cover the behavior, it may be safe to update the documentation. And the behavior is indeed present; I've tested it several times in the past two weeks and it works very well :-))
If you set minReplicas for a ScaledJob, then it is basically a minimum number of jobs (a base); anything else should trigger more jobs. See the PR: #3426
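To illustrate this base behavior, here is a minimal sketch of the relevant ScaledJob spec field; the values and comments are only illustrative and follow the example described earlier in this thread:

```yaml
# Fragment of a ScaledJob spec (illustrative values, matching the example above)
spec:
  minReplicaCount: 4   # base: 4 Jobs are always kept ready to pick up work
  # If 2 Messages arrive, 2 of the base Jobs become busy and KEDA immediately
  # starts 2 additional Jobs, so 6 Jobs exist until the 2 busy ones finish.
```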
Thanks very much @zroubalik! Would it be possible to enhance the documentation of minReplicaCount at https://keda.sh/docs/2.9/concepts/scaling-jobs/ to explain the scale-out behavior dictated by the minReplicaCount parameter? In its current state the documentation only explains that minReplicaCount Jobs will be created by default.
It'd be amazing, because it's true that it could be a bit confusing. Would you open a PR in docs with the change?
I'll give it a try. My first open source contribution...
It's never too late to start 😄 Just fork the docs repo, create a new branch, add the information and submit the PR. You might take some info or diagrams from the PR/issue I linked, if you find that useful.
Done: kedacore/keda-docs#1144
@JorTurFer @zroubalik we were just reading the docs kindly added by @eugen-nw and this really confused me. I can understand that someone may want this behaviour, but it feels like the expected behaviour described in this issue
is going to be a more common use case, or at least desired by some users. Scaling out too much will cost us a considerable amount of money, as we're processing videos on GPU Nodes.
You can limit the maximum desired/allowed count of containers in the .yaml script; that will limit your expenses (see the sketch below). In your example you will get 5 Jobs created to handle your 5 Messages, plus 2 other Jobs on stand-by to handle whatever may come in - all of this once the 5 new Pods are up and functional. My scale-out scenario has to accommodate sudden bursts in demand, and the current operation mode enables me to have N containers (more or less) ready to immediately handle a burst.
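A minimal sketch of that cap (values are only illustrative; the exact interplay between the two fields is discussed in the PR linked above):

```yaml
# Fragment of a ScaledJob spec (illustrative values)
spec:
  minReplicaCount: 2   # 2 stand-by Jobs are always available
  maxReplicaCount: 5   # limits how many Jobs KEDA will create,
                       # capping the spend on expensive (e.g. GPU) Nodes
```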
No matter what we set the max to we're always going to be spinning up containers for no reason. If two items come into our queue we don't need to spin up two additional Jobs with their own GPU Nodes and pay the minimum charge for that when we have two Jobs ready for them. If we set the maximum to the same as the minimum this wouldn't happen but we also would not be autoscaling.
I understand that this is a desirable use case for you and some others, but I doubt it's what most people would think the behaviour is when they see this parameter (which is why this issue was created).
Hi @LewisJackson1 If waiting is not a problem and you prefer to save as much money as possible, you can set
Hello @JorTurFer, I'm not sure that I understand the question here, apologies!
Yeah, if additional jobs came in after the minimum replicas then they would have to wait for scaling and that's acceptable. I guess the simplest way that I can think of to illustrate this is to compare the behaviour to a ScaledObject. If we configure a ScaledObject to track an SQS queue with 2 minimum replicas and 2 items enter the queue, the ScaledObject does not spin up 2 more Pods - is that correct? We're looking at migrating a queue processor from ScaledObject to ScaledJob and I'm just finding this inconsistency between the two defined behaviours quite weird. I think that we could work around this with a static Deployment that would always be warm, then set the ScaledJob to track additional queue items?
Yes, you are right, and they aren't consistent, but they aren't comparable either IMHO. I mean, with a ScaledObject the workload can process multiple items, so just after finishing with a message the workload starts on the next message without any cooldown. With a ScaledJob, your job usually takes 1 single message and ends, so after finishing the current message the pod finishes and KEDA spins up another job, which isn't instant. That's why the minimum replicas for ScaledJob is the minimum replicas ready to work (idle). This is an interesting discussion, and maybe the best place is a GH discussion, where other maintainers and any other community folks can give their 2 cents. Would you open a discussion about this? In any case, to solve your use case you could create your own REST API (or gRPC server) with the business logic that you want, and use the Metrics API Scaler (or External Scaler) to connect KEDA to it. With this approach, you could set
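A rough sketch of that suggested workaround, assuming a hypothetical in-cluster service ("queue-arbiter") that implements your own business logic and reports how many extra Jobs are actually needed; the metrics-api trigger type is a real KEDA scaler, but the URL and JSON field shown here are made up:

```yaml
triggers:
  - type: metrics-api
    metadata:
      targetValue: "1"
      # Hypothetical service that subtracts the idle stand-by Jobs from the queue length
      url: "http://queue-arbiter.default.svc.cluster.local/api/pending"
      # Hypothetical JSON field in that service's response holding the resulting number
      valueLocation: "jobsNeeded"
```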
You may want to give the Job scale-out method some time to settle, and spend some time experimenting with both scale-out alternatives. Use Linux containers (vs. Windows) for faster Pod start-up times. Jobs will always handle even very long-running processing, should that be a concern. With ScaledObject scale-out you'll pay for unused capacity. The best scenario is to have no Pods running 24 x 7 and use ScaledJob to fire up Pods whenever necessary, should that setup accommodate your use cases. I operate in the Azure cloud. Taking scale-out to the next level, I run no Pods in the Azure Kubernetes cluster but delegate them to run in the Azure Container Instances service by using a Virtual Kubelet. That way we pay only for each second a Pod runs, and we can scale out indefinitely.
I've opened a discussion here: #4885
I feel like it is quite an opinionated stance for the scaler to take to assume that the user would want to have a buffer here because their Jobs are slow to start up/terminate. I don't think there's that much difference between a Job and a persistent Pod; they both have a start-up latency, so the over-provisioning behaviour could also be useful there. I can understand that this might be desirable for some people, and it'd be great to have this behaviour available for both ScaledJob and ScaledObject as an opt-in/out.
Report
Say that I configure KEDA with minReplicaCount > 0. If I send Messages to the Queue, KEDA creates as many new Pods as there are Messages in the Queue, with no regard to the count of Jobs that are already running, i.e. those created because minReplicaCount > 0.
Expected Behavior
Let's say that I configure KEDA to have 2 Jobs running permanently. If I send 5 Messages to the Queue, I'd expect KEDA to create only 3 new Pods. Instead it creates 5 new Pods, so they match the count of Messages in the Queue. Below is the scaling behavior that the documentation at https://keda.sh/docs/2.9/concepts/scaling-jobs/ states.
Actual Behavior
Please see above.
Steps to Reproduce the Problem
Deploy the script (see the example manifest after these steps) and check the count of Pods created. It should be 2.
Send N Messages into the Queue.
Check the count of Pods created. It will be N + 2.
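The .yaml script from step 1 is not attached to the issue; a minimal sketch of what such a ScaledJob might look like for this setup (the name, image, Secret, and queue are placeholders) follows:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: servicebus-worker                  # placeholder name
spec:
  minReplicaCount: 2                       # the 2 permanently running Jobs from step 1
  pollingInterval: 30                      # seconds between queue-length checks
  jobTargetRef:
    template:
      spec:
        restartPolicy: Never
        containers:
          - name: worker
            image: myregistry.azurecr.io/worker:latest   # placeholder image
            env:
              - name: SERVICEBUS_CONNECTION
                valueFrom:
                  secretKeyRef:
                    name: servicebus-secret              # placeholder Secret
                    key: connection-string
  triggers:
    - type: azure-servicebus
      metadata:
        queueName: my-queue                              # placeholder queue
        messageCount: "1"                                # one Job per pending Message
        connectionFromEnv: SERVICEBUS_CONNECTION
```

With a manifest like this, sending N Messages yields N new Pods on top of the 2 base Pods, i.e. N + 2 in total, which is the behavior reported above.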
Logs from KEDA operator
Please email edaroczy@boldiq.com for the .ZIP file.
KEDA Version
2.10.1
Kubernetes Version
1.25
Platform
Microsoft Azure
Scaler Details
Azure Service Bus
Anything else?
AKS 1.25.6
KEDA 2.10.2
The Containers run on the virtual-node-aci-linux virtual node.