
BuildKite CI: Deployed buildkite agents can download and upload artifacts to gcloud #4804

Closed
bkase opened this issue Apr 29, 2020 · 11 comments

@bkase
Member

bkase commented Apr 29, 2020

See https://buildkite.com/docs/agent/v3/gcloud#uploading-artifacts-to-google-cloud-storage

The JSON credentials approach is nice because we would also be able to support artifact upload/download on local developers' machines trivially.

Depends on #4802

@O1ahmad
Contributor

O1ahmad commented May 7, 2020

Initial thinking around implementation:

Steps:

  1. create GCP service account for Buildkite Agent Kubernetes cluster during Terraform provisioning
  2. store service account as Kubernetes secret
  3. provision Buildkite agent pods with volume-sidecar containing environment hook which exposes:
  • $BUILDKITE_GS_APPLICATION_CREDENTIALS_JSON=<value-of-stored-secret>
  • $BUILDKITE_ARTIFACT_UPLOAD_DESTINATION="gs://<ci_cd_bucket>/$BUILDKITE_JOB_ID" (target store location)

Result: Buildkite agents should be able to run ANY job which requires access to Google Cloud Storage, as long as the job's step key is prefixed with upload or download
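As a small illustration of how the step-3 variables fit together (keeping `<ci_cd_bucket>` as a placeholder, as above):

```shell
# Sketch only: shows how the per-job upload destination is derived.
# "<ci_cd_bucket>" is a placeholder bucket name.
BUILDKITE_JOB_ID="0123-example-job"
BUILDKITE_ARTIFACT_UPLOAD_DESTINATION="gs://<ci_cd_bucket>/${BUILDKITE_JOB_ID}"
echo "$BUILDKITE_ARTIFACT_UPLOAD_DESTINATION"
```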


    1. Create Buildkite service account credentials
resource "google_service_account" "buildkite_gcs_account" {
  account_id   = "buildkite-${var.cluster_name}"
  display_name = "Buildkite agent GCS service account -- cluster: ${var.cluster_name}"
  project = "o1labs-192920"
}
    2. Store service account credentials as a Kubernetes secret
resource "google_service_account_key" "buildkite_gcs_key" {
  service_account_id = google_service_account.buildkite_gcs_account.name
}

resource "kubernetes_secret" "google-application-credentials" {
  metadata {
    name = "google-application-credentials"
    namespace = var.cluster_namespace
  }
  data = {
    "credentials.json" = base64decode(google_service_account_key.buildkite_gcs_key.private_key)
  }
}
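One note on the `base64decode` above: the `private_key` attribute of `google_service_account_key` comes back base64-encoded, so decoding it means the Kubernetes secret stores the key file as plain JSON. An illustrative shell equivalent of that decode step:

```shell
# private_key comes back base64-encoded; decoding yields the JSON key file.
encoded=$(printf '{"type":"service_account"}' | base64)
printf '%s' "$encoded" | base64 -d
```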
    3. Store Buildkite service account credentials as a Pod secret volume
apiVersion: v1
kind: Pod
metadata:
  name: buildkite-agent
spec:
  containers:
  - name: buildkite-agent
    image: buildkite/agent
    env: 
    - name: BUILDKITE_HOOKS_PATH
      value: {{ $.Values.buildkite_hooks_path }} 
    volumeMounts:
    - name: credentials-json
      mountPath: "/etc/credentials.json"
      # mount the single secret key as a file at this exact path; without
      # subPath the mountPath would be a directory containing credentials.json
      subPath: "credentials.json"
      readOnly: true
  volumes:
  - name: credentials-json
    secret:
      secretName: google-application-credentials

Note: we may also be able to use the official Buildkite Helm chart's extraEnv value for specifying the $BUILDKITE_HOOKS_PATH and potentially $BUILDKITE_GS_APPLICATION_CREDENTIALS_JSON environment variables.

  • 3-b. Either provide Buildkite agent pod with /hooks/environment file mount or build script into Buildkite agent Docker image:
$ cat /hooks/environment

#!/bin/bash
set -euo pipefail

if [[ "$BUILDKITE_STEP_KEY" =~ ^(upload|download) ]]; then
  export BUILDKITE_GS_APPLICATION_CREDENTIALS_JSON="$(cat /etc/credentials.json)"
  export BUILDKITE_ARTIFACT_UPLOAD_DESTINATION="gs://<ci_cd_bucket>/${BUILDKITE_JOB_ID}"
fi
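For what it's worth, the same prefix check can also be written with a POSIX `case` pattern, which sidesteps bash regex quoting pitfalls (the function name here is just illustrative):

```shell
# Succeeds when a step key starts with "upload" or "download",
# matching the intent of the hook's regex above.
is_artifact_step() {
  case "$1" in
    upload*|download*) return 0 ;;
    *) return 1 ;;
  esac
}

is_artifact_step "upload-test-artifacts" && echo "artifact step"   # prints "artifact step"
is_artifact_step "build-linux" || echo "regular step"              # prints "regular step"
```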

@O1ahmad
Contributor

O1ahmad commented May 8, 2020

[reference] buildkite agent hooks: https://buildkite.com/docs/agent/v3/hooks#available-hooks

Hook | Type | Description
-- | -- | --
environment | Agent, Plugin | Runs before all other hooks. Useful for exposing secret keys and adding strict checks.

@O1ahmad
Contributor

O1ahmad commented May 8, 2020

(step 3 - from above) iii. Terraform -- Buildkite Helm chart config and provisioning of Kubernetes Pods:

# configure the Buildkite provider - note: operator should have API token env var exposed
provider "buildkite" {
    #  api_token = "token" -- SHOULD be set from env: **BUILDKITE_API_TOKEN**
    organization = "O(1) Labs"
}

# create an agent token with an optional description
resource "buildkite_agent_token" "agent_token" {
    description = "default Buildkite agent token"
}

locals {
  buildkite_agent_vars = {
    numAgents      = var.num_agents
    labelOffset    = var.agent_offset
    image = {
      repository = var.image_repository
      tag        = var.image_tag
      pullPolicy = var.image_pullPolicy
    }
    agentToken     = buildkite_agent_token.agent_token.token
    agentMeta      = var.agent_meta
    agentHooksPath = var.agent_hooks_path
    volumeMounts = [
      {
        name  = "credentials-json"
        value = "/etc/credentials.json"
      }
    ]
    volumes = [
      {
        name = "credentials-json"
        secret = {
          secretName = "google-application-credentials"
        }
      }
    ]
  }
}

resource "kubernetes_namespace" "buildkite_agent_namespace" {
  metadata {
    name = var.cluster_name
  }
}

resource "helm_release" "buildkite_agent" {
  name       = "buildkite"
  repository = "https://buildkite.github.io/charts/"
  chart      = "bk-agent"
  namespace  = kubernetes_namespace.buildkite_agent_namespace.metadata[0].name
  values = [
    yamlencode(local.buildkite_agent_vars)
  ]
  wait = false
}

@O1ahmad
Contributor

O1ahmad commented May 8, 2020

Note:

Based on the official Buildkite Docker image, we can either mount hooks into the container at runtime (which raises the question of where to source the Buildkite hooks file/directory from) or copy them into the /buildkite/hooks directory (and ensure they are executable) during image build.

I generally favor maintaining a custom O(1) image to:

  1. avoid the clutter of adding more (and more) mounts to our Helm chart Kubernetes config
  2. allow managed and more testable customization/tweaks we'd like in the future.
# Re: Mounting - pretty simple for running and testing locally though more unwieldy and complicated via Kubernetes
docker run -it \
  -v "$HOME/buildkite-hooks:/buildkite/hooks:ro" \
  buildkite/agent:3

# Alternatively, if we create our own image based off `buildkite/agent`,
# we can copy our CI/CD hooks into the correct location by default and also enable
# developers/operators to mount additional hooks for developmental purposes:

$ cat Dockerfile-buildkite-agent
# Example O(1) Labs custom Buildkite agent Docker image
FROM buildkite/agent:3

# copy CI/CD hooks into the agent's default hooks directory and
# make sure they are executable
COPY hooks /buildkite/hooks/
RUN chmod +x /buildkite/hooks/*

@bkase
Member Author

bkase commented May 8, 2020

@0x0i what is the benefit of using the official buildkite agent docker image vs. just including the buildkite-agent package in whatever environment we have?

I foresee this being somewhat difficult to compose with our existing toolchain image -- it would be much easier to apt install buildkite-agent in this image in the short term, unless that trick you mentioned about merging from multiple dependent docker images could work here too.

@bkase
Member Author

bkase commented May 8, 2020

@0x0i also this approach (terraform -> helm -> etc.) seems good to me (may be worth quickly chatting with @yourbuddyconner and @nholland94 to see how it could or should fit in with the other dhall stuff) -- in the short term (today), do you know of a quick way I can get a local agent to talk to gcloud so it can unblock some pipelines I want to write?

@O1ahmad
Contributor

O1ahmad commented May 8, 2020

@0x0i what is the benefit of using the official buildkite agent docker image vs. just including the buildkite-agent package in whatever environment we have?

@bkase : so we'll likely use the official Buildkite Helm chart for deploying the Buildkite agents across the Kubernetes cluster though that's still in discussion (#4803) due to how it fits in with our current infrastructure stack and since the Helm chart seems solidly built.

If we do, it seems to make sense to leverage an image with proper packaging, etc. (either our own, the official image, or a hybrid of both) rather than creating a base image and adding/maintaining the installation of the necessary packages ourselves.

in the short term (today), do you know of a quick way I can get a local agent to talk to gcloud so it can unblock some pipelines I want to write?

You should be able to set the environment variables mentioned earlier (e.g. $BUILDKITE_GS_APPLICATION_CREDENTIALS_JSON) to what you use locally and make use of the gcloud CLI. As far as I can tell, though, I don't think we currently have a GCS account... @yourbuddyconner ¯\_(ツ)_/¯
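A minimal sketch of that local setup (the key file path and token are placeholders; assumes `buildkite-agent` is installed locally):

```shell
# Placeholder path and token -- adjust to your local machine.
export BUILDKITE_AGENT_TOKEN="<your-agent-token>"
export BUILDKITE_GS_APPLICATION_CREDENTIALS_JSON="$(cat "$HOME/gcs-service-account.json")"
# The per-job upload destination is normally set in an environment hook
# (see earlier comment), where $BUILDKITE_JOB_ID is defined.
```

Running `buildkite-agent start` afterwards should let the agent's Google Cloud Storage artifact uploader pick up the credentials from that variable.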

@O1ahmad
Contributor

O1ahmad commented May 11, 2020

@bkase, @yourbuddyconner:

  • all of the existing Buildkite agents should be able to handle jobs which attempt to upload/download from our GCS bucket.
  • I'm working on creating the Terraform bits in coda-automation but you can unblock/experiment yourself using my minikube prototype cluster (existing k8s cluster ^^^) or one of your own:
# MacOS installation
brew install minikube helm
minikube start   # should automatically choose the *docker* driver but if not, run: minikube start --driver=docker
helm repo add buildkite https://buildkite.github.io/charts/
helm install <cluster/release-name> buildkite/agent --set privateSshKey="$(cat </path/to/your/github/private-key>)" -f <helm-buildkite-values.yaml>

Note: most of the configuration defaults should be fine except:

  • agent.token (or set $BUILDKITE_AGENT_TOKEN) to your token
  • for upload/download capabilities, you'll want to add the following to your <helm-buildkite-values.yaml>:
extraEnv:
  - name: BUILDKITE_GS_APPLICATION_CREDENTIALS_JSON
    value: '<contents-of-gs-app-creds.json>'

@O1ahmad
Contributor

O1ahmad commented May 11, 2020

@bkase @yourbuddyconner Thinking we shouldn't have to worry about developing and maintaining our own Docker image for the buildkite-agents, at least for the purpose of setting buildkite-agent hooks since:

  • sensitive environment configuration can be injected at provisioning time (via Terraform) and maintained and secured within individual pod runtimes without environment hooks
  • sensitive environment configs can be modified infra-wide or for specific pods post-provisioning, without hooks (via kubectl)
  • individual cluster builders/operators can choose to provision new clusters with hooks mounted or included within pre-initialized images if desired

@yourbuddyconner
Contributor

@0x0i good call-out, I concur that we probably won't need to extend beyond the built-in functionality of the buildkite agent in the initial implementation.

@O1ahmad O1ahmad closed this as completed May 13, 2020