From 7bc08fbe7ce9c4d7d819824ea19c29372a98dcad Mon Sep 17 00:00:00 2001
From: toast-gear
Date: Thu, 28 Apr 2022 10:36:12 +0100
Subject: [PATCH 1/9] docs: remove TotalNumberOfQueuedAndInProgressWorkflowRuns limitation

---
 README.md | 1 -
 1 file changed, 1 deletion(-)

diff --git a/README.md b/README.md
index 4d725aa890..bcb8fa87c6 100644
--- a/README.md
+++ b/README.md
@@ -567,7 +567,6 @@ The `TotalNumberOfQueuedAndInProgressWorkflowRuns` metric polls GitHub for all p
 1. A list of repositories must be included within the scaling metric. Maintaining a list of repositories may not be viable in larger environments or self-serve environments.
 2. May not scale quickly enough for some users' needs. This metric is pull based and so the queue depth is polled as configured by the sync period, as a result scaling performance is bound by this sync period meaning there is a lag to scaling activity.
 3. Relatively large amounts of API requests are required to maintain this metric, you may run into API rate limit issues depending on the size of your environment and how aggressive your sync period configuration is.
-4. The GitHub API doesn't provide a way to filter workflow jobs to just those targeting self-hosted runners. If your environment's workflows target both self-hosted and GitHub-hosted runners then the queue depth this metric scales against isn't a true 1:1 mapping of queue depth to the required runner count. As a result of this, this metric may scale too aggressively for your actual self-hosted runner count needs.
 
 Example `RunnerDeployment` backed by a `HorizontalRunnerAutoscaler`:
 

From 61c5a112db52c5d834f362d55a26bd420ad8bfa7 Mon Sep 17 00:00:00 2001
From: toast-gear
Date: Thu, 28 Apr 2022 10:39:11 +0100
Subject: [PATCH 2/9] docs: remove reference to cleared limitation

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index bcb8fa87c6..9bcaf81f24 100644
--- a/README.md
+++ b/README.md
@@ -560,7 +560,7 @@ The `TotalNumberOfQueuedAndInProgressWorkflowRuns` metric polls GitHub for all p
 
 **Benefits of this metric**
 1. Supports named repositories allowing you to restrict the runner to a specified set of repositories server-side.
-2. Scales the runner count based on the depth of the job queue meaning a more 1:1 scaling of runners to queued jobs (caveat, see drawback #4)
+2. Scales the runner count based on the depth of the job queue meaning a 1:1 scaling of runners to queued jobs.
 3. Like all scaling metrics, you can manage workflow allocation to the RunnerDeployment through the use of [GitHub labels](#runner-labels).
 
 **Drawbacks of this metric**

From 6d10dd8e1dd12fd3e54c447d0a9790d5811e2e0e Mon Sep 17 00:00:00 2001
From: toast-gear
Date: Thu, 28 Apr 2022 10:51:19 +0100
Subject: [PATCH 3/9] docs: breaking changes in v0.23.0

---
 docs/releasenotes/0.23.md | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)
 create mode 100644 docs/releasenotes/0.23.md

diff --git a/docs/releasenotes/0.23.md b/docs/releasenotes/0.23.md
new file mode 100644
index 0000000000..a46b0e2515
--- /dev/null
+++ b/docs/releasenotes/0.23.md
@@ -0,0 +1,34 @@
+# actions-runner-controller v0.23.0
+
+https://github.com/actions-runner-controller/actions-runner-controller/milestone/3
+
+# BREAKING CHANGE : Workflow job webhooks require an explicit field set
+
+Previously the webhook workflow job was set as the default if no `githubEvent` was set:
+
+```yaml
+  scaleUpTriggers:
+  - githubEvent: {}
+    duration: "30m"
+```
+
+You now need to set `workflowJob: {}` explicitly
+
+```yaml
+  scaleUpTriggers:
+  - githubEvent:
+      workflowJob: {}
+    duration: "30m"
+```
+
+# BREAKING CHANGE : topologySpreadConstraint renamed to topologySpreadConstraints
+
+Previously to use the pod `topologySpreadConstraints:` attribute in your runners you had to set `topologySpreadConstraint:` instead, this was a typo and has been corrected.
+
+# BREAKING CHANGE : Default sync period is now 1 minute instead of 10 minutes
+
+Since caching has been implemented the default sync period of 10 minutes is unnecessarily conservative and gives a poor out of the box user experience. If you need a 10 minute sync period ensure you explicitly set this value.
+
+# BREAKING CHANGE : A metric is set by default
+
+Previously is no metric was provided and you were using pull based scaling, `TotalNumberOfQueuedAndInProgressWorkflowRuns` was applied. No default is set now and the end user must always set this metric explicitly if they want to use it.
From 70ae5aef1f9b383e0b99fc8aabcb6a19dd729783 Mon Sep 17 00:00:00 2001
From: toast-gear
Date: Thu, 28 Apr 2022 15:57:03 +0100
Subject: [PATCH 4/9] docs: add migration steps

---
 docs/releasenotes/0.23.md | 51 ++++++++++++++++++++++++++++++++++++---
 1 file changed, 48 insertions(+), 3 deletions(-)

diff --git a/docs/releasenotes/0.23.md b/docs/releasenotes/0.23.md
index a46b0e2515..a6d8cf869e 100644
--- a/docs/releasenotes/0.23.md
+++ b/docs/releasenotes/0.23.md
@@ -4,7 +4,11 @@ https://github.com/actions-runner-controller/actions-runner-controller/milestone
 
 # BREAKING CHANGE : Workflow job webhooks require an explicit field set
 
-Previously the webhook workflow job was set as the default if no `githubEvent` was set:
+Previously the webhook event workflow job was set as the default if no `githubEvent` was set.
+
+**Migration Steps**
+
+Change this:
 
 ```yaml
   scaleUpTriggers:
@@ -12,7 +16,7 @@ Previously the webhook workflow job was set as the default if no `githubEvent` w
     duration: "30m"
 ```
 
-You now need to set `workflowJob: {}` explicitly
+To this:
 
 ```yaml
   scaleUpTriggers:
@@ -25,10 +29,51 @@ You now need to set `workflowJob: {}` explicitly
 
 Previously to use the pod `topologySpreadConstraints:` attribute in your runners you had to set `topologySpreadConstraint:` instead, this was a typo and has been corrected.
 
+**Migration Steps**
+
+Update your runners to use `topologySpreadConstraints:` instead
+
 # BREAKING CHANGE : Default sync period is now 1 minute instead of 10 minutes
 
 Since caching has been implemented the default sync period of 10 minutes is unnecessarily conservative and gives a poor out of the box user experience. If you need a 10 minute sync period ensure you explicitly set this value.
 
+**Migration Steps**
+
+Update your sync period, how this is done will depend on how you've deployed ARC.
 # BREAKING CHANGE : A metric is set by default
 
-Previously is no metric was provided and you were using pull based scaling, `TotalNumberOfQueuedAndInProgressWorkflowRuns` was applied. No default is set now and the end user must always set this metric explicitly if they want to use it.
+Previously if no metric was provided and you were using pull based scaling the `TotalNumberOfQueuedAndInProgressWorkflowRuns` metric was applied. No default is set now.
+
+**Migration Steps**
+
+Add in the `TotalNumberOfQueuedAndInProgressWorkflowRuns` metric where you are currently relying on it
+
+```yaml
+
+apiVersion: actions.summerwind.dev/v1alpha1
+kind: RunnerDeployment
+metadata:
+  name: example-runner-deployment
+spec:
+  template:
+    spec:
+      organization: my-awesome-organisation
+      labels:
+      - my-awesome-runner
+---
+apiVersion: actions.summerwind.dev/v1alpha1
+kind: HorizontalRunnerAutoscaler
+metadata:
+  name: example-runner-deployment-autoscaler
+spec:
+  scaleTargetRef:
+    name: example-runner-deployment
+  minReplicas: 1
+  maxReplicas: 5
+  metrics:
+  - type: TotalNumberOfQueuedAndInProgressWorkflowRuns
+    repositoryNames:
+    - owner/my-awesome-repo-1
+    - owner/my-awesome-repo-2
+    - owner/my-awesome-repo-3
+```
\ No newline at end of file

From 832e59338eed3690d2ffa06ceab38938a655c713 Mon Sep 17 00:00:00 2001
From: toast-gear
Date: Thu, 28 Apr 2022 16:00:03 +0100
Subject: [PATCH 5/9] docs: clarification of the release log

---
 docs/releasenotes/0.23.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/docs/releasenotes/0.23.md b/docs/releasenotes/0.23.md
index a6d8cf869e..b4d66b2e70 100644
--- a/docs/releasenotes/0.23.md
+++ b/docs/releasenotes/0.23.md
@@ -1,7 +1,8 @@
 # actions-runner-controller v0.23.0
 
-https://github.com/actions-runner-controller/actions-runner-controller/milestone/3
+All changes in this release can be found in the milestone https://github.com/actions-runner-controller/actions-runner-controller/milestone/3
+This log documents breaking and major enhancements
 
 # BREAKING CHANGE : Workflow job webhooks require an explicit field set
 
 Previously the webhook event workflow job was set as the default if no `githubEvent` was set.

From 46291c18235d2eaac5e0003377a2b7870c75cdc4 Mon Sep 17 00:00:00 2001
From: toast-gear
Date: Thu, 28 Apr 2022 16:04:16 +0100
Subject: [PATCH 6/9] docs: highlight the new scale down delay flag

---
 README.md | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 9bcaf81f24..4af3f47ab5 100644
--- a/README.md
+++ b/README.md
@@ -489,9 +489,14 @@ A `RunnerDeployment` or `RunnerSet` can scale the number of runners between `min
 
 #### Anti-Flapping Configuration
 
-For both pull driven or webhook driven scaling an anti-flapping implementation is included, by default a runner won't be scaled down within 10 minutes of it having been scaled up. This delay is configurable by including the attribute `scaleDownDelaySecondsAfterScaleOut:` in a `HorizontalRunnerAutoscaler` kind's `spec:`.
+For both pull driven or webhook driven scaling an anti-flapping implementation is included, by default a runner won't be scaled down within 10 minutes of it having been scaled up.
 
-This configuration has the final say on if a runner can be scaled down or not regardless of the chosen scaling method. Depending on your requirements, you may want to consider adjusting this by setting the `scaleDownDelaySecondsAfterScaleOut:` attribute.
+This anti-flap configuration also has the final say on if a runner can be scaled down or not regardless of the chosen scaling method.
+
+This delay is configurable via 2 methods:
+
+1. By setting a new default via the controller's `--default-scale-down-delay` flag
+2. By setting the attribute `scaleDownDelaySecondsAfterScaleOut:` in a `HorizontalRunnerAutoscaler` kind's `spec:`.
 
 Below is a complete basic example of one of the pull driven scaling metrics.
 
From 9ed429513daaaf18934f9ff4573e02ad2c56efd0 Mon Sep 17 00:00:00 2001
From: toast-gear
Date: Thu, 28 Apr 2022 16:04:58 +0100
Subject: [PATCH 7/9] docs: bump the helm upgrade chart docs version

---
 charts/actions-runner-controller/docs/UPGRADING.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/charts/actions-runner-controller/docs/UPGRADING.md b/charts/actions-runner-controller/docs/UPGRADING.md
index 9cd2220723..1b80b04669 100644
--- a/charts/actions-runner-controller/docs/UPGRADING.md
+++ b/charts/actions-runner-controller/docs/UPGRADING.md
@@ -22,7 +22,7 @@ Due to the above you can't just do a `helm upgrade` to release the latest versio
 
 ```shell
 # REMEMBER TO UPDATE THE CHART_VERSION TO RELEVANT CHART VERISON!!!!
-CHART_VERSION=0.17.0
+CHART_VERSION=0.18.0
 
 curl -L https://github.com/actions-runner-controller/actions-runner-controller/releases/download/actions-runner-controller-${CHART_VERSION}/actions-runner-controller-${CHART_VERSION}.tgz | tar zxv --strip 1 actions-runner-controller/crds
 

From 78a0817c2c21cdec2f5dba741510929861444703 Mon Sep 17 00:00:00 2001
From: toast-gear
Date: Thu, 28 Apr 2022 16:06:59 +0100
Subject: [PATCH 8/9] docs: align release doc format

---
 docs/releasenotes/0.23.md | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/docs/releasenotes/0.23.md b/docs/releasenotes/0.23.md
index b4d66b2e70..55b69978da 100644
--- a/docs/releasenotes/0.23.md
+++ b/docs/releasenotes/0.23.md
@@ -3,7 +3,7 @@
 All changes in this release can be found in the milestone https://github.com/actions-runner-controller/actions-runner-controller/milestone/3
 This log documents breaking and major enhancements
 
-# BREAKING CHANGE : Workflow job webhooks require an explicit field set
+## BREAKING CHANGE : Workflow job webhooks require an explicit field set
 
 Previously the webhook event workflow job was set as the default if no `githubEvent` was set.
 
@@ -26,7 +26,7 @@ To this:
     duration: "30m"
 ```
 
-# BREAKING CHANGE : topologySpreadConstraint renamed to topologySpreadConstraints
+## BREAKING CHANGE : topologySpreadConstraint renamed to topologySpreadConstraints
 
 Previously to use the pod `topologySpreadConstraints:` attribute in your runners you had to set `topologySpreadConstraint:` instead, this was a typo and has been corrected.
 
@@ -34,14 +34,15 @@ Previously to use the pod `topologySpreadConstraints:` attribute in your runners
 
 Update your runners to use `topologySpreadConstraints:` instead
 
-# BREAKING CHANGE : Default sync period is now 1 minute instead of 10 minutes
+## BREAKING CHANGE : Default sync period is now 1 minute instead of 10 minutes
 
 Since caching has been implemented the default sync period of 10 minutes is unnecessarily conservative and gives a poor out of the box user experience. If you need a 10 minute sync period ensure you explicitly set this value.
 
 **Migration Steps**
 
 Update your sync period, how this is done will depend on how you've deployed ARC.
-# BREAKING CHANGE : A metric is set by default
+
+## BREAKING CHANGE : A metric is set by default
 
 Previously if no metric was provided and you were using pull based scaling the `TotalNumberOfQueuedAndInProgressWorkflowRuns` metric was applied. No default is set now.
 
From 58416db8c83aff2fd220c34987b65431a31dfa8b Mon Sep 17 00:00:00 2001
From: toast-gear
Date: Thu, 28 Apr 2022 16:17:53 +0100
Subject: [PATCH 9/9] docs: add new runner group API enhancemnet

---
 docs/releasenotes/0.23.md | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/docs/releasenotes/0.23.md b/docs/releasenotes/0.23.md
index 55b69978da..2e974199e5 100644
--- a/docs/releasenotes/0.23.md
+++ b/docs/releasenotes/0.23.md
@@ -78,4 +78,12 @@ spec:
     - owner/my-awesome-repo-1
     - owner/my-awesome-repo-2
     - owner/my-awesome-repo-3
-```
\ No newline at end of file
+```
+
+## ENHANCEMENT : Find runner groups that are visible to a repository using a single API call
+
+GitHub has contributed code to utilise a new API to enable us to get a repository's runner groups with a single API call. This enables us to scale runners based on the requesting repository's runner group membership without a series of expensive API queries.
+
+This is currently an opt-in feature as it's a significant change in behaviour if enabled. Additionally, whilst scaling based on the repository's runner group membership is supported in both GHES and github.com, only github.com currently has access to the new rate-limit budget friendly API.
+
+To enable this, deploy via Helm and set `githubWebhookServer.useRunnerGroupsVisibility` to `true`.