feat: Add multi-runner capability (#2472)
* feat: Remove support check_run (#2521)

* chore: Remove support check_run

* format, lint

* feat: Remove old scale down mechanism (< 0.19.0) (#2519)

fix: Remove old cleanup mechanism (< 0.19.0)

* feat: added changes for multi runner.

* fix: region.

* fix: more fixes.

* tuple to list.

* fixes.

* fixes.

* fixes.

* fixes.

* fixes.

* fixes.

* fix: formatting.

* fix: formatting.

* fix: formatting.

* fix: moved some blocks outside runner config.

* fix: few more updates

* fix: liniting.

* fix: updated example output

* changed runner group name.

* fix: updated the tests.

* fix: addressed review comments.

* fix: linting issues.

* fix: formatting.

* fix: updated tf version.

* fix: Remove removed prerelease option

* Add ubuntu runner to example

* refactor: use each instead of count

* fix: few small issues.

* refactor: syncer to count for multi runner

* fix: comments.

* fix: added Readme.

* fix: errors.

* move variable to runner config

* fix: updated the readme.

* Add todos

* feat: added windows runner configuration, completed todos and added the weight for runner config matchers.

* chore: Update docs

* fix: reverted tf versions.

* fix: addressed comments.

* fix: missed.

* fix: formatting.

* Update terraform versions in CI

* Update terraform versions in CI

* Update docs

* fix: coverage.

* Update docs

* improve test coverage webhook

* Apply suggestions from code review

* fix: formatting.

* fix: fixed merge issues.

* fix: syntax.

Co-authored-by: Niek Palm <npalm@users.noreply.github.com>
Co-authored-by: Niek Palm <niek.palm@philips.com>
Co-authored-by: navdeepg2021 <navdeepg2021@gmail.com>
4 people authored Oct 19, 2022
1 parent a001003 commit c08b335
Showing 42 changed files with 1,827 additions and 283 deletions.
35 changes: 31 additions & 4 deletions .github/workflows/terraform.yml
@@ -15,7 +15,7 @@ jobs:
name: Verify module
strategy:
matrix:
terraform: [1.1.3, "latest"]
terraform: [1.3.2, "latest"]
runs-on: ubuntu-latest
container:
image: hashicorp/terraform:${{ matrix.terraform }}
@@ -29,7 +29,7 @@ jobs:
touch modules/runner-binaries-syncer/lambdas/runner-binaries-syncer/runner-binaries-syncer.zip
- name: terraform init
run: terraform init -get -backend=false -input=false
- if: contains(matrix.terraform, '1.1.')
- if: contains(matrix.terraform, '1.3.')
name: check terraform formatting
run: terraform fmt -recursive -check=true -write=false
- if: contains(matrix.terraform, 'latest') # check formatting for the latest release but avoid failing the build
@@ -44,7 +44,7 @@ jobs:
strategy:
fail-fast: false
matrix:
terraform: [1.0.11, 1.1.3, "latest"]
terraform: [1.0.11, 1.1.9, 1.2.9, "latest"]
example:
["default", "ubuntu", "prebuilt", "arm64", "ephemeral", "windows"]
defaults:
@@ -57,7 +57,7 @@ jobs:
- uses: actions/checkout@v3
- name: terraform init
run: terraform init -get -backend=false -input=false
- if: contains(matrix.terraform, '1.1.')
- if: contains(matrix.terraform, '1.3.')
name: check terraform formatting
run: terraform fmt -recursive -check=true -write=false
- if: contains(matrix.terraform, 'latest') # check formatting for the latest release but avoid failing the build
@@ -66,3 +66,30 @@ jobs:
continue-on-error: true
- name: validate terraform011
run: terraform validate


verify_multi_runner_example:
name: Verify Multi-Runner examples
strategy:
fail-fast: false
matrix:
terraform: [1.3.2, "latest"]
defaults:
run:
working-directory: examples/multi-runner
runs-on: ubuntu-latest
container:
image: hashicorp/terraform:${{ matrix.terraform }}
steps:
- uses: actions/checkout@v3
- name: terraform init
run: terraform init -get -backend=false -input=false
- if: contains(matrix.terraform, '1.3.')
name: check terraform formatting
run: terraform fmt -recursive -check=true -write=false
- if: contains(matrix.terraform, 'latest') # check formatting for the latest release but avoid failing the build
name: check terraform formatting
run: terraform fmt -recursive -check=true -write=false
continue-on-error: true
- name: validate terraform
run: terraform validate
27 changes: 11 additions & 16 deletions README.md
@@ -9,7 +9,6 @@ This [Terraform](https://www.terraform.io/) module creates the required infrastr
- [Motivation](#motivation)
- [Overview](#overview)
- [Major configuration options.](#major-configuration-options)
- [ARM64 support via Graviton/Graviton2 instance-types](#arm64-support-via-gravitongraviton2-instance-types)
- [Usages](#usages)
- [Setup GitHub App (part 1)](#setup-github-app-part-1)
- [Setup terraform module](#setup-terraform-module)
@@ -25,7 +24,6 @@ This [Terraform](https://www.terraform.io/) module creates the required infrastr
- [Experimental - Optional queue to publish GitHub workflow job events](#experimental---optional-queue-to-publish-github-workflow-job-events)
- [Examples](#examples)
- [Sub modules](#sub-modules)
- [ARM64 configuration for submodules](#arm64-configuration-for-submodules)
- [Debugging](#debugging)
- [Security Consideration](#security-consideration)
- [Requirements](#requirements)
@@ -81,16 +79,13 @@ Besides these permissions, the lambdas also need permission to CloudWatch (for l
To be able to support a number of use-cases the module has quite a lot of configuration options. We try to choose reasonable defaults. The several examples also show for the main cases how to configure the runners.

- Org vs Repo level. You can configure the module to connect the runners in GitHub on an org level and share the runners in your org. Or set the runners on repo level and the module will install the runner to the repo. There can be multiple repos but runners are not shared between repos.
- Checkrun vs Workflow job event. You can configure the webhook in GitHub to send checkrun or workflow job events to the webhook. Workflow job events are introduced by GitHub in September 2021 and are designed to support scalable runners. We advise when possible using the workflow job event, you can set `runner_enable_workflow_job_labels_check = true` to let the webhook only accept jobs based on the labels configured. The webhook will check the custom labels provided via the variable `runner_extra_labels` and the GitHub managed labels, "self-hosted", OS and architecture. The OS and architecture are derived from the settings. By default the check is disabled.
- Multi-Runner module. This module allows you to create multiple runner configurations with a single webhook and a single GitHub App, which simplifies the deployment of different types of runners. Refer to the [README](.modules/../modules/multi-runner/README.md) for more information on how it works.
- Workflow job event. You can configure the webhook in GitHub to send workflow job events to the webhook. Workflow job events were introduced by GitHub in September 2021 and are designed to support scalable runners. We advise using the workflow job event whenever possible.
- Linux vs Windows. You can configure the OS types linux and win. Linux will be used by default.
- Re-use vs Ephemeral. By default runners are re-used until detected idle. Once idle they will be removed from the pool. To improve security we are introducing ephemeral runners. Those runners are only used for one job. Ephemeral runners only work in combination with the workflow job event. We also suggest using a pre-built AMI to improve the start time of jobs.
- GitHub Cloud vs GitHub Enterprise Server (GHES). The runners support GitHub Cloud as well as GitHub Enterprise Server. For GHES we rely on our community to test and support. We have no possibility to test ourselves on GHES.
- Spot vs on-demand. The runners use either the EC2 spot or on-demand life cycle. Runners will be created via the AWS [CreateFleet API](https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_CreateFleet.html). The module (scale up lambda) will request via the CreateFleet API to create instances in one of the subnets and of the specified instance types.


#### ARM64 support via Graviton/Graviton2 instance-types

When using the default example or top-level module, specifying `instance_types` that match a Graviton/Graviton 2 (ARM64) architecture (e.g. a1, t4g or any 6th-gen `g` or `gd` type), you must also specify `runner_architecture = "arm64"` and the sub-modules will be automatically configured to provision with ARM64 AMIs and leverage GitHub's ARM64 action runner. See below for more details.
- ARM64 support via Graviton/Graviton2 instance-types. When using the default example or top-level module, specifying `instance_types` that match a Graviton/Graviton 2 (ARM64) architecture (e.g. a1, t4g or any 6th-gen `g` or `gd` type), you must also specify `runner_architecture = "arm64"` and the sub-modules will be automatically configured to provision with ARM64 AMIs and leverage GitHub's ARM64 action runner. See below for more details.

## Usages

@@ -334,11 +329,13 @@ Examples are located in the [examples](./examples) directory. The following exam

- _[Default](examples/default/README.md)_: The default example of the module
- _[ARM64](examples/arm64/README.md)_: Example usage with ARM64 architecture
- _[Ubuntu](examples/ubuntu/README.md)_: Example usage of creating a runner using Ubuntu AMIs.
- _[Windows](examples/windows/README.md)_: Example usage of creating a runner using Windows as the OS.
- _[Ephemeral](examples/ephemeral/README.md)_: Example usages of ephemeral runners based on the default example.
- _[Prebuilt Images](examples/prebuilt/README.md)_: Example usages of deploying runners with a custom prebuilt image.
- _[Multi Runner](examples/multi-runner/README.md)_: Example usage of the multi-runner module, which creates multiple runner configurations with a single deployment.
- _[Permissions boundary](examples/permissions-boundary/README.md)_: Example usages of permissions boundaries.
- _[Prebuilt Images](examples/prebuilt/README.md)_: Example usages of deploying runners with a custom prebuilt image.
- _[Ubuntu](examples/ubuntu/README.md)_: Example usage of creating a runner using Ubuntu AMIs.
- _[Windows](examples/windows/README.md)_: Example usage of creating a runner using Windows as the OS.


## Sub modules

@@ -349,15 +346,14 @@ The following submodules are the core of the module and are mandatory:
- _[runner-binaries-syncer](./modules/runner-binaries-syncer/README.md)_ - Syncs the action runner distribution.
- _[runners](./modules/runners/README.md)_ - Scales the action runners up and down
- _[webhook](./modules/webhook/README.md)_ - Handles GitHub webhooks
- _[multi-runner](./modules/multi-runner/README.md)_ - Creates multiple runner configurations in a single deployment

The following sub modules are optional and are provided as example or utility:

- _[download-lambda](./modules/download-lambda/README.md)_ - Utility module to download lambda artifacts from GitHub Release
- _[setup-iam-permissions](./modules/setup-iam-permissions/README.md)_ - Example module to setup permission boundaries

### ARM64 configuration for submodules

When using the top level module configure `runner_architecture = "arm64"` and ensure the list of `instance_types` matches. When not using the top-level, ensure these properties are set on the submodules.
ARM64 configuration for submodules. When using the top level module configure `runner_architecture = "arm64"` and ensure the list of `instance_types` matches. When not using the top-level, ensure these properties are set on the submodules.

## Debugging

@@ -484,8 +480,7 @@ We welcome any improvement to the standard module to make the default as secure
| <a name="input_runner_boot_time_in_minutes"></a> [runner\_boot\_time\_in\_minutes](#input\_runner\_boot\_time\_in\_minutes) | The minimum time for an EC2 runner to boot and register as a runner. | `number` | `5` | no |
| <a name="input_runner_ec2_tags"></a> [runner\_ec2\_tags](#input\_runner\_ec2\_tags) | Map of tags that will be added to the launch template instance tag specifications. | `map(string)` | `{}` | no |
| <a name="input_runner_egress_rules"></a> [runner\_egress\_rules](#input\_runner\_egress\_rules) | List of egress rules for the GitHub runner instances. | <pre>list(object({<br> cidr_blocks = list(string)<br> ipv6_cidr_blocks = list(string)<br> prefix_list_ids = list(string)<br> from_port = number<br> protocol = string<br> security_groups = list(string)<br> self = bool<br> to_port = number<br> description = string<br> }))</pre> | <pre>[<br> {<br> "cidr_blocks": [<br> "0.0.0.0/0"<br> ],<br> "description": null,<br> "from_port": 0,<br> "ipv6_cidr_blocks": [<br> "::/0"<br> ],<br> "prefix_list_ids": null,<br> "protocol": "-1",<br> "security_groups": null,<br> "self": null,<br> "to_port": 0<br> }<br>]</pre> | no |
| <a name="input_runner_enable_workflow_job_labels_check"></a> [runner\_enable\_workflow\_job\_labels\_check](#input\_runner\_enable\_workflow\_job\_labels\_check) | If set to true all labels in the workflow job even are matched against the custom labels and GitHub labels (os, architecture and `self-hosted`). When the labels are not matching the event is dropped at the webhook. | `bool` | `false` | no |
| <a name="input_runner_enable_workflow_job_labels_check_all"></a> [runner\_enable\_workflow\_job\_labels\_check\_all](#input\_runner\_enable\_workflow\_job\_labels\_check\_all) | If set to true all labels in the workflow job must match the GitHub labels (os, architecture and `self-hosted`). When false if __any__ label matches it will trigger the webhook. `runner_enable_workflow_job_labels_check` must be true for this to take effect. | `bool` | `true` | no |
| <a name="input_runner_enable_workflow_job_labels_check_all"></a> [runner\_enable\_workflow\_job\_labels\_check\_all](#input\_runner\_enable\_workflow\_job\_labels\_check\_all) | If set to true all labels in the workflow job must match the GitHub labels (os, architecture and `self-hosted`). When false if __any__ label matches it will trigger the webhook. | `bool` | `true` | no |
| <a name="input_runner_extra_labels"></a> [runner\_extra\_labels](#input\_runner\_extra\_labels) | Extra (custom) labels for the runners (GitHub). Separate each label by a comma. Labels checks on the webhook can be enforced by setting `enable_workflow_job_labels_check`. GitHub read-only labels should not be provided. | `string` | `""` | no |
| <a name="input_runner_group_name"></a> [runner\_group\_name](#input\_runner\_group\_name) | Name of the runner group. | `string` | `"Default"` | no |
| <a name="input_runner_iam_role_managed_policy_arns"></a> [runner\_iam\_role\_managed\_policy\_arns](#input\_runner\_iam\_role\_managed\_policy\_arns) | Attach AWS or customer-managed IAM policies (by ARN) to the runner IAM role | `list(string)` | `[]` | no |
3 changes: 0 additions & 3 deletions examples/ephemeral/main.tf
@@ -43,9 +43,6 @@ module "runners" {
enable_organization_runners = true
runner_extra_labels = "default,example"

# enable workflow labels check
# runner_enable_workflow_job_labels_check = true

# enable access to the runners via SSM
enable_ssm_on_runners = true

60 changes: 60 additions & 0 deletions examples/multi-runner/.terraform.lock.hcl

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

48 changes: 48 additions & 0 deletions examples/multi-runner/README.md
@@ -0,0 +1,48 @@
# Action runners deployment of Multiple-Runner-Configurations-Together example

This module shows how to create GitHub action runners with multiple runner configurations together in one deployment. This example has configurations for the following runner types, each with the labels it supports as matchers:

- Linux ARM64 `["self-hosted", "linux", "arm64", "amazon"]`
- Linux Ubuntu `["self-hosted", "linux", "x64", "ubuntu"]`
- Linux X64 `["self-hosted", "linux", "x64", "amazon"]`
- Windows X64 `["self-hosted", "windows", "x64", "servercore-2022"]`

The module decides which runner handles a workflow job by matching the labels defined in the workflow job against those in each runner configuration. A runner configuration can require either an exact or a non-exact match. We recommend using only exact matches.

For an exact match, all the labels defined in the workflow job must be present in the runner configuration matchers. For a non-exact match, it is enough that some of the labels in the workflow job are present in the runner configuration. The exact matchers are applied first, then the non-exact ones.
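
The matching rules described above can be sketched roughly as follows. This is a minimal illustration in TypeScript, not the module's actual webhook code: the `RunnerMatcher` type, its field names, and the `selectRunner` helper are hypothetical and only mirror the behaviour stated in this README.

```typescript
// Hypothetical shape of one runner configuration's matcher (names are illustrative).
interface RunnerMatcher {
  name: string;
  labels: string[];
  exactMatch: boolean;
}

// Exact: every job label must be supported. Non-exact: one shared label is enough.
function matches(jobLabels: string[], matcher: RunnerMatcher): boolean {
  const supported = (label: string) => matcher.labels.indexOf(label) !== -1;
  return matcher.exactMatch ? jobLabels.every(supported) : jobLabels.some(supported);
}

// Try the exact matchers first, then the non-exact ones,
// keeping the configured order within each pass.
function selectRunner(jobLabels: string[], matchers: RunnerMatcher[]): RunnerMatcher | undefined {
  for (const wantExact of [true, false]) {
    for (const matcher of matchers) {
      if (matcher.exactMatch === wantExact && matches(jobLabels, matcher)) {
        return matcher;
      }
    }
  }
  return undefined;
}

// Two of the example configurations; the non-exact one is listed first on purpose.
const matchers: RunnerMatcher[] = [
  { name: "linux-ubuntu", labels: ["self-hosted", "linux", "x64", "ubuntu"], exactMatch: false },
  { name: "linux-arm64", labels: ["self-hosted", "linux", "arm64", "amazon"], exactMatch: true },
];

const selected = selectRunner(["self-hosted", "linux", "arm64"], matchers);
console.log(selected ? selected.name : "no runner configuration matched");
```

In this sketch a job labelled `self-hosted, linux, arm64` would also satisfy the non-exact Ubuntu matcher through the shared `self-hosted` label, which illustrates why applying exact matchers first, and preferring exact matches overall, avoids surprising assignments.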

## Webhook

For the list of provided runner configurations, there will be a single webhook and only a single GitHub App to receive the notifications for all types of workflow triggers.

## Lambda distribution

Per combination of OS and architecture a lambda distribution syncer will be created. For this example there will be three instances (windows X64, linux X64, linux ARM).
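
To see why four runner configurations lead to three syncers, the deduplication can be sketched as below. This is an illustrative TypeScript snippet only; the `RunnerTarget` type and `syncerKeys` function are hypothetical and do not reflect how the Terraform module computes this internally.

```typescript
// Hypothetical, simplified view of a runner configuration: only the two
// fields that determine which runner distribution is needed.
interface RunnerTarget {
  name: string;
  os: "linux" | "windows";
  architecture: "x64" | "arm64";
}

// Return the distinct "<os>_<architecture>" keys; one syncer is needed per key.
function syncerKeys(targets: RunnerTarget[]): string[] {
  const keys: string[] = [];
  for (const target of targets) {
    const key = `${target.os}_${target.architecture}`;
    if (keys.indexOf(key) === -1) {
      keys.push(key);
    }
  }
  return keys.sort();
}

// The four runner types of this example.
const exampleTargets: RunnerTarget[] = [
  { name: "linux-arm64", os: "linux", architecture: "arm64" },
  { name: "linux-ubuntu", os: "linux", architecture: "x64" },
  { name: "linux-x64", os: "linux", architecture: "x64" },
  { name: "windows-x64", os: "windows", architecture: "x64" },
];

console.log(syncerKeys(exampleTargets)); // [ 'linux_arm64', 'linux_x64', 'windows_x64' ]
```

The two Linux X64 configurations (Amazon Linux and Ubuntu) share the same runner distribution, so they collapse into a single key.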

## Usages

Steps for the full setup, such as creating a GitHub app, can be found in the root module's [README](../../README.md). First download the Lambda releases from GitHub. Alternatively, you can build the lambdas locally with Node or Docker; there is a simple build script in `<root>/.ci/build.sh`. In that case you can simply remove the location of the lambda zip files in `main.tf`; the default location will work.

> Ensure you have set the version in `lambdas-download/main.tf` for running the example. The version needs to be set to a GitHub release version, see https://github.com/philips-labs/terraform-aws-github-runner/releases
```bash
cd lambdas-download
terraform init
terraform apply
cd ..
```

Before running Terraform, ensure the GitHub app is configured. See the [configuration details](../../README.md#usages) for more details.

```bash
terraform init
terraform apply
```

You can receive the webhook details by running:

```bash
terraform output -raw webhook_secret
```

Be aware that some shells will print an end-of-line character `%`.
25 changes: 25 additions & 0 deletions examples/multi-runner/lambdas-download/main.tf
@@ -0,0 +1,25 @@
locals {
  version = "<REPLACE_BY_GITHUB_RELEASE_VERSION>"
}

module "lambdas" {
  source = "../../../modules/download-lambda"
  lambdas = [
    {
      name = "webhook"
      tag  = local.version
    },
    {
      name = "runners"
      tag  = local.version
    },
    {
      name = "runner-binaries-syncer"
      tag  = local.version
    }
  ]
}

output "files" {
  value = module.lambdas.files
}