Summary
An overly broad IAM policy statement grants all EC2 runner instances unrestricted read access to GitHub runner registration tokens and JIT config stored in the AWS SSM Parameter Store. This misconfiguration permits any runner instance to access tokens intended for other runners, potentially compromising the integrity of workflows and the confidentiality of GitHub Secrets accessible to jobs. Refinement of the IAM policy assigned to EC2 runners is necessary to ensure strict, least privilege access controls.
Details
The infrastructure designed for self-hosted runners of GitHub Actions leverages a series of AWS Lambda functions to dynamically scale runners in response to GitHub Action events. It utilizes SQS for message passing between Lambdas and an SSM Parameter Store for secure distribution of configuration values, eliminating the need to directly share secrets of varying sensitivity across instances. This setup underpins robust isolation between runners and workflows. However, an issue arises with the policy associated with the EC2 runners' role, which inadvertently grants access to GitHub runner registration tokens and JIT configurations meant for other runners.
The following statement, part of the role policy assigned to EC2 runners, allows runners assigned to this role to read parameters under the ${arn_ssm_parameters_path_tokens}* path of the parameter store.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"ssm:DeleteParameter",
"ssm:GetParameters",
"ssm:GetParameter"
],
"Resource": "${arn_ssm_parameters_path_tokens}*"
},
Unlike some role policies in this module, this one is attached unconditionally to the "${var.prefix}-runner-role" role:
resource "aws_iam_role_policy" "ssm_parameters" {
name = "runner-ssm-parameters"
role = aws_iam_role.runner.name
policy = templatefile("${path.module}/policies/instance-ssm-parameters-policy.json",
{
arn_ssm_parameters_path_tokens = "arn:${var.aws_partition}:ssm:${var.aws_region}:${data.aws_caller_identity.current.account_id}:parameter${var.ssm_paths.root}/${var.ssm_paths.tokens}"
arn_ssm_parameters_path_config = local.arn_ssm_parameters_path_config
}
)
}
The ${arn_ssm_parameters_path_tokens} Terraform variable passed to the policy template file is built from the ssm_paths variable passed to the module, together with the region and account ID. As shown in the snippet above, the parameter path is constructed from var.ssm_paths.root and var.ssm_paths.tokens:
arn_ssm_parameters_path_tokens = "arn:${var.aws_partition}:ssm:${var.aws_region}:${data.aws_caller_identity.current.account_id}:parameter${var.ssm_paths.root}/${var.ssm_paths.tokens}"
(Note: This path is slightly different for the multi-runner module. For simplicity, in this description, we will focus on the single runner type and describe the modified risk for multiple runner types later.)
The default values for these vars are passed to the runners child module by the root module, as seen in the snippets below:
//https://github.com/philips-labs/terraform-aws-github-runner/blob/main/variables.tf
variable "ssm_paths" {
description = "The root path used in SSM to store configuration and secrets."
type = object({
root = optional(string, "github-action-runners")
app = optional(string, "app")
runners = optional(string, "runners")
webhook = optional(string, "webhook")
use_prefix = optional(bool, true)
})
default = {}
}
ssm_root_path = var.ssm_paths.use_prefix ? "/${var.ssm_paths.root}/${var.prefix}" : "/${var.ssm_paths.root}"
(...)
ssm_paths = {
root = local.ssm_root_path
tokens = "${var.ssm_paths.runners}/tokens"
config = "${var.ssm_paths.runners}/config"
}
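The path composition in the snippets above can be sketched in TypeScript as follows. This is only an illustration of the string interpolation performed by the Terraform locals, not code from the module; the interface, the function names, the prefix value "gh-runners" (borrowed from the example path used later in this advisory), and the account ID are assumptions made for the example.

```typescript
// Illustrative mirror of the Terraform locals quoted above (names are hypothetical).
interface SsmPaths {
  root: string;
  runners: string;
  usePrefix: boolean;
}

// Defaults taken from the ssm_paths variable definition shown above.
const defaultSsmPaths: SsmPaths = {
  root: "github-action-runners",
  runners: "runners",
  usePrefix: true,
};

// ssm_root_path joined with the "tokens" sub-path, as in the locals block.
function tokensPath(paths: SsmPaths, prefix: string): string {
  const rootPath = paths.usePrefix ? `/${paths.root}/${prefix}` : `/${paths.root}`;
  return `${rootPath}/${paths.runners}/tokens`;
}

// Same shape as arn_ssm_parameters_path_tokens in the aws_iam_role_policy resource.
function tokensArn(
  partition: string,
  region: string,
  accountId: string,
  paths: SsmPaths,
  prefix: string,
): string {
  return `arn:${partition}:ssm:${region}:${accountId}:parameter${tokensPath(paths, prefix)}`;
}

// Example values only: the account ID is a placeholder.
const exampleArn = tokensArn("aws", "us-east-2", "123456789012", defaultSsmPaths, "gh-runners");
console.log(`${exampleArn}*`);
```

With use_prefix left at its default of true, the prefix becomes part of the path; with use_prefix set to false, the same function yields /github-action-runners/runners/tokens.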
So for the default variable values the IAM policy statement will look like:
{
"Action": [
"ssm:DeleteParameter",
"ssm:GetParameters",
"ssm:GetParameter"
],
"Effect": "Allow",
"Resource": "arn:aws:ssm:$region:parameter/github-action-runners/runners/tokens*"
},
The paths that runners are allowed to access therefore match parameter/github-action-runners/runners/tokens*; note the trailing *.
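To make the effect of the trailing * concrete, here is a minimal sketch of wildcard matching on a resource ARN. It is a simplified model of the * and ? wildcards in IAM Resource elements, not AWS's actual policy evaluation; the function names, account ID, and instance IDs are made up for illustration.

```typescript
// Simplified model of IAM resource wildcards: "*" matches any run of characters,
// "?" matches exactly one character. Everything else is matched literally.
function iamPatternToRegExp(pattern: string): RegExp {
  const source = pattern
    .split("")
    .map((ch) => {
      if (ch === "*") return ".*";
      if (ch === "?") return ".";
      // Escape characters that are special in a regular expression.
      return ch.replace(/[\\^$.*+?()[\]{}|]/g, "\\$&");
    })
    .join("");
  return new RegExp(`^${source}$`);
}

function resourceMatches(pattern: string, arn: string): boolean {
  return iamPatternToRegExp(pattern).test(arn);
}

// Placeholder account ID and instance IDs, for illustration only.
const base = "arn:aws:ssm:us-east-2:123456789012:parameter/github-action-runners/runners";
const policyResource = `${base}/tokens*`;

// Both instances' token parameters fall under the same grant.
console.log(resourceMatches(policyResource, `${base}/tokens/i-0aaaaaaaaaaaaaaaa`));
console.log(resourceMatches(policyResource, `${base}/tokens/i-0bbbbbbbbbbbbbbbb`));
// Parameters under a different sub-path are not covered by this statement.
console.log(resourceMatches(policyResource, `${base}/config/example`));
```

In this model the grant is not tied to any particular instance ID: whatever follows "tokens" in the parameter name is accepted.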
When the scale_up lambda processes the SQS messages coming from the webhook lambda, it adds the runner config or the JIT config to the parameter store under the same path that the IAM policy statement allows runners to read from.
Once an instance has been created, scale_up will try to create the runner's config. Depending on the configuration, it calls either createRegistrationTokenConfig or createJitConfig. In both cases, the lambda writes the config to the SSM parameter store path:
async function createStartRunnerConfig(
githubRunnerConfig: CreateGitHubRunnerConfig,
instances: string[],
ghClient: Octokit,
) {
if (githubRunnerConfig.enableJitConfig && githubRunnerConfig.ephemeral) {
await createJitConfig(githubRunnerConfig, instances, ghClient);
} else {
await createRegistrationTokenConfig(githubRunnerConfig, instances, ghClient);
}
}
createRegistrationTokenConfig:
async function createRegistrationTokenConfig(
(...)
await putParameter(`${githubRunnerConfig.ssmTokenPath}/${instance}`, runnerServiceConfig.join(' '), true);
createJitConfig:
async function createJitConfig(githubRunnerConfig: CreateGitHubRunnerConfig, instances: string[], ghClient: Octokit) {
(...)
await putParameter(`${githubRunnerConfig.ssmTokenPath}/${instance}`, runnerConfig.data.encoded_jit_config, true);
scale_up places each instance's token under the ${githubRunnerConfig.ssmTokenPath}/ path and appends the instance ID to the end of the path: ${githubRunnerConfig.ssmTokenPath}/${instance}.
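The naming scheme can be illustrated with a short sketch. The helper name and the instance IDs below are hypothetical; the token path is the example path that appears later in this advisory, and the only thing taken from the lambda code is the `${ssmTokenPath}/${instance}` pattern.

```typescript
// Hypothetical helper mirroring the `${ssmTokenPath}/${instance}` pattern passed to putParameter.
function tokenParameterName(ssmTokenPath: string, instanceId: string): string {
  return `${ssmTokenPath}/${instanceId}`;
}

// Example path from this advisory; the instance IDs are made up.
const ssmTokenPath = "/github-action-runners/gh-runners/runners/tokens";
const instanceIds = ["i-0aaaaaaaaaaaaaaaa", "i-0bbbbbbbbbbbbbbbb"];

const parameterNames = instanceIds.map((id) => tokenParameterName(ssmTokenPath, id));

// Every parameter name shares the same prefix and differs only in the instance ID.
const allUnderSamePrefix = parameterNames.every((n) => n.indexOf(`${ssmTokenPath}/`) === 0);
console.log(parameterNames, allUnderSamePrefix);
```

Because the instance ID is the only distinguishing component, knowing an instance ID is enough to know the name of that instance's parameter.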
As established earlier, these instances all have permission to read every parameter under ${githubRunnerConfig.ssmTokenPath}/* (remember the *), but they do not have permission to list parameters under that path. Therefore, instances cannot discover the tokens meant for other instances simply by querying the parameter store.
However, each instance does have access to its own instance ID from the AWS metadata service. As a result, an instance can query the metadata service to get its own instance ID and then use it to retrieve its token from the parameter store. This is also how the install-runner scripts work:
token=$(curl -f -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 180" || true)
(...)
instance_id=$(curl -f -H "X-aws-ec2-metadata-token: $token" -v http://169.254.169.254/latest/meta-data/instance-id)
(...)
config=$(aws ssm get-parameter --name "$token_path"/"$instance_id" --with-decryption --region "$region" | jq -r ".Parameter | .Value")
The isolation therefore rests on the assumption that an instance does not know the instance IDs of the other instances running in the same environment. That assumption does not hold.
Another IAM policy (describe_tags) assigned to the role attached to these instances allows them to read EC2 tags for all EC2 instances:
resource "aws_iam_role_policy" "describe_tags" {
name = "runner-describe-tags"
role = aws_iam_role.runner.name
policy = file("${path.module}/policies/instance-describe-tags-policy.json")
}
Policy template file instance-describe-tags-policy.json:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": "ec2:DescribeTags",
"Resource": "*"
}
]
}
The above policy allows principals to use the EC2 DescribeTags API. Part of this API's response is the tagSet that contains TagDescription objects. As seen in the linked documentation, part of this object's contents is the resourceId of the resource that the tag is applied to.
Indeed:
$ aws ec2 describe-tags --region "us-east-2" --filter "Name=resource-type,Values=instance"
{
"Tags": [
{
"Key": "Name",
"ResourceId": "i-07dXXXXXX231eab",
"ResourceType": "instance",
"Value": "gh-runners-xxxxxx"
},
(...)
Since this policy is associated with the instances, it is possible for an instance to describe tags, retrieve the ResourceId values of other instances, and then use a ResourceId in a request against the SSM Parameter Store to receive the token meant for another instance.
The path that needs to be accessed would look like /github-action-runners/gh-runners/runners/tokens/i-07dXXXXXX231eab, and we now know that instances are allowed to get all parameters under the path /github-action-runners/gh-runners/runners/tokens/*.
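The step from a DescribeTags response to a parameter name can be sketched as a pure data transformation. This is an illustrative sketch for reviewing exposure, assuming the response shape shown in the CLI output above; the function name, the sample tags, and the instance IDs are hypothetical, and no AWS API is called.

```typescript
// Shape of one entry in the "Tags" array of the CLI output shown above.
interface TagDescription {
  Key: string;
  ResourceId: string;
  ResourceType: string;
  Value: string;
}

// Hypothetical helper: lists the parameter names that become guessable to one instance
// once it can see other instances' IDs through their tags.
function guessableTokenParameters(
  tags: TagDescription[],
  tokenPath: string,
  ownInstanceId: string,
): string[] {
  const otherIds: string[] = [];
  for (const tag of tags) {
    const isOtherInstance = tag.ResourceType === "instance" && tag.ResourceId !== ownInstanceId;
    // An instance usually carries several tags, so de-duplicate by ResourceId.
    if (isOtherInstance && otherIds.indexOf(tag.ResourceId) === -1) {
      otherIds.push(tag.ResourceId);
    }
  }
  return otherIds.sort().map((id) => `${tokenPath}/${id}`);
}

// Made-up sample data in the same shape as the describe-tags output.
const sampleTags: TagDescription[] = [
  { Key: "Name", ResourceId: "i-0bbbbbbbbbbbbbbbb", ResourceType: "instance", Value: "gh-runners-b" },
  { Key: "Name", ResourceId: "i-0aaaaaaaaaaaaaaaa", ResourceType: "instance", Value: "gh-runners-a" },
  { Key: "Env", ResourceId: "i-0bbbbbbbbbbbbbbbb", ResourceType: "instance", Value: "ci" },
  { Key: "Name", ResourceId: "vol-0123456789abcdef0", ResourceType: "volume", Value: "data" },
];

const exposed = guessableTokenParameters(
  sampleTags,
  "/github-action-runners/gh-runners/runners/tokens",
  "i-0aaaaaaaaaaaaaaaa",
);
console.log(exposed);
```

In this sample, the instance i-0aaaaaaaaaaaaaaaa learns exactly one other parameter name, the one belonging to i-0bbbbbbbbbbbbbbbb.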
There are a few time constraints to exploiting this. For JIT config, the instance executing a malicious workflow will have to use the config before the legitimate instance since these are meant to be valid for one-time use only. For runner registration tokens, the instance executing a malicious workflow will have to read the token before the legitimate instance deletes it from the parameter store after it's read.
With that token exposed, arbitrary runners can register themselves to receive jobs from the relevant repo or org.
Impact
The successful exploitation of the overly permissive IAM policies undermines the segregation between runners operating in the same environment.
For exploitation to occur, the adversary must at least be able to control or otherwise successfully modify a workflow in a repo or org that employs this infrastructure for self-hosted actions. This ability could come in a multitude of ways:
- Misconfigured public repos that allow workflows from fork PRs to run without approval
- Careless maintainers approving modified workflow runs
- Adversaries initially posing as legitimate contributors in order to exploit the GitHub setting that does not require workflow approval for fork PRs from previous contributors
- Leaked PAT and fine-grained GitHub tokens
- Malicious contributors trying to move laterally in the supply chain
The outcome of successful exploitation heavily depends on how the infrastructure is utilized. The compromised segregation between runners can lead to a breach of confidentiality, particularly in cases where GitHub secrets are used, and also impact the integrity of other workflows operating within the same infrastructure.
The CVSS calculation below assumes that a maintainer will have to approve the workflow, hence User Interaction (UI) is required, and that a successful attack will impact components outside the vulnerable component, such as other repositories or runner groups, hence Scope is considered changed.