Skip to content

Latest commit

 

History

History
59 lines (49 loc) · 2.77 KB

2022-09-14.md

File metadata and controls

59 lines (49 loc) · 2.77 KB

14 September 2022

or, EC2 "self"-hosted Linux arm64 runners for GitHub Actions

Rather than provisioning a specific, long-lived EC2 instance for this, I imagine using GitHub Actions itself to launch and register instances on demand. GitHub Actions supports ephemeral runners which deregister themselves and exit (or maybe vice versa?) after some period of inactivity. This seems perfect for a workflow job which launches a new instance when necessary to handle a job and lets the runner exit, deregister, and the instance terminate after some time. This sort of system could use a pool where up to N new instances are launched but after that the auto-launch is a no-op, or it could always launch a new instance specifically for use by the current workflow run.

In terms of implementation, the workflow would look something like this pseudo-code:

jobs:
  launch-runner:
    id: launch-runner
    uses: nextstrain/.github/.github/workflow/launch-runner.yaml
    with:
      os: linux
      machine: arm64
    outputs:
      labels: ...

  build:
    needs: launch-runner
    runs-on: ${{ fromJSON(needs.launch-runner.outputs.labels) }}
    steps:
      - ...

The labels output would either be the bare minimum required by GitHub like ["self-hosted", "linux", "ARM64"] for a pool model or ["self-hosted", "linux", "ARM64", "28e892c3-674b-4fc9-b7cb-813467d58a5c"] for a one-instance-per-workflow model.

The launch-runner shared workflow could also be a custom action. Not sure what's easier and a better fit.

It seems like ephemeral runners are only assigned a single job, so they'd only be appropriate for the one-instance-per-workflow model.

There are, of course, other auto-scaling solutions out there, but I wonder if they're good fits or not for our needs. If they are, great!

Using the webhooks mentioned there instead of launching explicitly in the workflow could be a useful abstraction barrier, but it might also be harder to manage than the more explicit coupling. In particular, it means having another logging/monitoring system for the actions taken on receiving a webhook, when we could instead have that integrated into GitHub Actions.

As a sort of middle ground between webhooks and the approach described above, we might be able to use workflow_run events to accomplish a similar thing while staying within the GitHub Actions ecosystem.