docs: final doc changes + v0.23.0 release notes #1396

Merged: 9 commits, Apr 29, 2022
12 changes: 8 additions & 4 deletions README.md
@@ -489,9 +489,14 @@ A `RunnerDeployment` or `RunnerSet` can scale the number of runners between `min

#### Anti-Flapping Configuration

For both pull driven or webhook driven scaling an anti-flapping implementation is included, by default a runner won't be scaled down within 10 minutes of it having been scaled up. This delay is configurable by including the attribute `scaleDownDelaySecondsAfterScaleOut:` in a `HorizontalRunnerAutoscaler` kind's `spec:`.
For both pull driven and webhook driven scaling, an anti-flapping implementation is included: by default, a runner won't be scaled down within 10 minutes of having been scaled up.

This configuration has the final say on if a runner can be scaled down or not regardless of the chosen scaling method. Depending on your requirements, you may want to consider adjusting this by setting the `scaleDownDelaySecondsAfterScaleOut:` attribute.
This anti-flap configuration also has the final say on whether a runner can be scaled down, regardless of the chosen scaling method.

This delay is configurable via 2 methods:

1. By setting a new default via the controller's `--default-scale-down-delay` flag
2. By setting the attribute `scaleDownDelaySecondsAfterScaleOut:` in a `HorizontalRunnerAutoscaler` kind's `spec:`.
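
A minimal sketch of the second method, assuming an autoscaler that targets a `RunnerDeployment` named `example-runner-deployment`. The resource names and the 300 second value are hypothetical, and the scaling metrics or triggers are omitted for brevity:

```yaml
apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
  name: example-runner-deployment-autoscaler  # hypothetical name
spec:
  scaleTargetRef:
    name: example-runner-deployment           # hypothetical target
  minReplicas: 1
  maxReplicas: 5
  # Illustrative value: block scale-down for 5 minutes after a scale-out
  # instead of the 10 minute default described above.
  scaleDownDelaySecondsAfterScaleOut: 300
```

A per-autoscaler value like this is useful when only some runner pools need a different delay. The controller flag changes the default for every autoscaler that does not set the attribute.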

Below is a complete basic example of one of the pull driven scaling metrics.

@@ -560,14 +565,13 @@ The `TotalNumberOfQueuedAndInProgressWorkflowRuns` metric polls GitHub for all p

**Benefits of this metric**
1. Supports named repositories allowing you to restrict the runner to a specified set of repositories server-side.
2. Scales the runner count based on the depth of the job queue meaning a more 1:1 scaling of runners to queued jobs (caveat, see drawback #4)
2. Scales the runner count based on the depth of the job queue meaning a 1:1 scaling of runners to queued jobs.
3. Like all scaling metrics, you can manage workflow allocation to the RunnerDeployment through the use of [GitHub labels](#runner-labels).

**Drawbacks of this metric**
1. A list of repositories must be included within the scaling metric. Maintaining a list of repositories may not be viable in larger environments or self-serve environments.
2. May not scale quickly enough for some users' needs. This metric is pull based, so the queue depth is polled at the interval set by the sync period. Scaling performance is therefore bound by the sync period, meaning scaling activity lags behind the queue.
3. A relatively large number of API requests is required to maintain this metric, so you may run into API rate limit issues depending on the size of your environment and how aggressive your sync period configuration is.
4. The GitHub API doesn't provide a way to filter workflow jobs to just those targeting self-hosted runners. If your environment's workflows target both self-hosted and GitHub-hosted runners then the queue depth this metric scales against isn't a true 1:1 mapping of queue depth to the required runner count. As a result of this, this metric may scale too aggressively for your actual self-hosted runner count needs.

Example `RunnerDeployment` backed by a `HorizontalRunnerAutoscaler`:

2 changes: 1 addition & 1 deletion charts/actions-runner-controller/docs/UPGRADING.md
@@ -22,7 +22,7 @@ Due to the above you can't just do a `helm upgrade` to release the latest versio

```shell
# REMEMBER TO UPDATE THE CHART_VERSION TO THE RELEVANT CHART VERSION!
CHART_VERSION=0.17.0
CHART_VERSION=0.18.0

curl -L https://github.com/actions-runner-controller/actions-runner-controller/releases/download/actions-runner-controller-${CHART_VERSION}/actions-runner-controller-${CHART_VERSION}.tgz | tar zxv --strip 1 actions-runner-controller/crds

```
89 changes: 89 additions & 0 deletions docs/releasenotes/0.23.md
@@ -0,0 +1,89 @@
# actions-runner-controller v0.23.0

All changes in this release can be found in the milestone https://github.com/actions-runner-controller/actions-runner-controller/milestone/3

This log documents breaking changes and major enhancements.

## BREAKING CHANGE : Workflow job webhooks require an explicit field set

Previously, the workflow job webhook event was used as the default when no `githubEvent` was set. It must now be set explicitly.

**Migration Steps**

Change this:

```yaml
scaleUpTriggers:
- githubEvent: {}
  duration: "30m"
```

To this:

```yaml
scaleUpTriggers:
- githubEvent:
    workflowJob: {}
  duration: "30m"
```

## BREAKING CHANGE : topologySpreadConstraint renamed to topologySpreadConstraints

Previously, to use the pod `topologySpreadConstraints:` attribute in your runners you had to set `topologySpreadConstraint:` instead. This was a typo and has been corrected.

**Migration Steps**

Update your runners to use `topologySpreadConstraints:` instead

## BREAKING CHANGE : Default sync period is now 1 minute instead of 10 minutes

Since caching has been implemented, the default sync period of 10 minutes is unnecessarily conservative and gives a poor out of the box user experience. If you need a 10 minute sync period, ensure you explicitly set this value.

**Migration Steps**

Update your sync period. How this is done depends on how you've deployed ARC.
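
As one possible illustration, assuming ARC was deployed with the Helm chart and that your chart version exposes a top-level `syncPeriod` value (an assumption to check against that chart's `values.yaml`), keeping the old behaviour might look like this:

```yaml
# values.yaml fragment (sketch; the `syncPeriod` key is assumed, verify it
# against the values.yaml of the chart version you deploy)
syncPeriod: 10m
```

If you deploy ARC another way, the equivalent setting lives wherever you configure the controller's arguments.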

## BREAKING CHANGE : A metric is no longer set by default

Previously, if no metric was provided and you were using pull based scaling, the `TotalNumberOfQueuedAndInProgressWorkflowRuns` metric was applied. No default is set now.

**Migration Steps**

Add in the `TotalNumberOfQueuedAndInProgressWorkflowRuns` metric where you are currently relying on it:

```yaml
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: example-runner-deployment
spec:
  template:
    spec:
      # The runner spec field is spelled `organization`
      organization: my-awesome-organisation
      labels:
      - my-awesome-runner
---
apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
  name: example-runner-deployment-autoscaler
spec:
  scaleTargetRef:
    name: example-runner-deployment
  minReplicas: 1
  maxReplicas: 5
  metrics:
  # Must now be set explicitly; it is no longer applied by default
  - type: TotalNumberOfQueuedAndInProgressWorkflowRuns
    repositoryNames:
    - owner/my-awesome-repo-1
    - owner/my-awesome-repo-2
    - owner/my-awesome-repo-3
```

## ENHANCEMENT : Find runner groups that are visible to a repository using a single API call

GitHub has contributed code to utilise a new API that lets us get a repository's runner groups with a single API call. This enables us to scale runners based on the requesting repository's runner group membership without a series of expensive API queries.

This is currently an opt-in feature, as it's a significant change in behaviour if enabled. Additionally, whilst scaling based on the repository's runner group membership is supported in both GHES and github.com, only github.com currently has access to the new rate-limit-budget-friendly API.

To enable this, deploy via Helm and set `githubWebhookServer.useRunnerGroupsVisibility` to `true`.
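
A minimal sketch of the corresponding Helm values, assuming the webhook server is deployed through the same chart. The `enabled` key is an assumption about your setup; only `useRunnerGroupsVisibility` comes from the note above:

```yaml
# values.yaml fragment (sketch)
githubWebhookServer:
  # Assumption: the webhook server is enabled in this deployment
  enabled: true
  # Opt in to the single-call runner group visibility lookup
  useRunnerGroupsVisibility: true
```

The same setting can be passed on the command line with `--set githubWebhookServer.useRunnerGroupsVisibility=true` during `helm upgrade --install`.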