-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ADR: Changing semantics of min runners to be min idle runners #3040
Conversation
|
||
With the "lazy" strategy, the current behavior of `minRunners` will be preserved. The `minRunners` field will specify the minimum number of runners running in a cluster regardless of their state ("running" or "idle"). | ||
|
||
With the "eager" strategy, `minRunners` will be treated as `minIdleRunners`. This strategy will calculate the number of runners based on the number of workflows acquired plus the number of minRunners. If the `maxRunners` field is specified, it will be respected, so the number of idle runners can be less than the number of idle `minRunners` when the number of acquired jobs plus the number of min runners is greater than `maxRunners`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since we don't stop acquire jobs after we reaching maxRunners, would this cause we creating too many runners?
also since 2 runner scale sets can acquire the same job, but only one of them will get run, will we cause too much wasting resource?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- It should not because the patch from the listener should only be up to
maxRunners
regardless of how many are acquired - Here we can over-provision them if other scale set "steals" the job, but eventually we will scale down on the next message. The unlucky scenario here can be that no events are created after the job is stolen, so we may not scale down to the desired number of min runners. Good point...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might also want to fix the scale down process if we start doing this since it might make the extra scale up/down even worse.
|
||
## Context | ||
|
||
Current implementation treats the `minRunners` field as the number of runners that should be running on your cluster. They can be busy running the job, or they can be idle. This ensures faster cold startup time when workflows are acquired as well as trying to use the minimum amount of runners needed to fulfill the scaling requirement. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think runner pod kind of having 3 stage:
- Running jobs
- Idle, runner online to the service
- Runner pod starting, runner is trying to be online to the service.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point! I'll add it (I merged this case with the idle one as it is not busy, but I should have been more precise)
Hey everyone, The ADR has been changed not to introduce multiple scaling strategies, but rather to change the semantics of the field. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's ship it
Proposing design for having two ways of handling
minRunners
fieldAddresses #2707