
Add support for default volume mounts #157

Merged (2 commits) on Jun 28, 2018

Conversation

@asford (Contributor) commented Jun 17, 2018

Add support for the generic environment variable BUILDKITE_DOCKER_DEFAULT_VOLUMES, specifying a set of volume mounts that are always passed into docker-compose run invocations.

This feature is intended to support scenarios where the buildkite-agent is run via docker-compose with the BUILDKITE_BUILD_PATH located on an external volume mount. BUILDKITE_DOCKER_DEFAULT_VOLUMES can then be used to ensure that this mount is available in docker-compose based pipeline steps, giving simplified access to the build directory.
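
As a concrete illustration, this could be set from an agent environment hook, as in this minimal sketch (the volume name and mount path here are just examples, not part of the plugin):

# agent environment hook (illustrative values)
export BUILDKITE_DOCKER_DEFAULT_VOLUMES="buildkite:/buildkite"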

For a working example see uw-ipd/buildkite-agent.

My hope is that a feature of this type becomes a rough standard for buildkite-agent-within-docker configurations, and that BUILDKITE_DOCKER_DEFAULT_VOLUMES can be adopted by docker-based pipeline plugins as a mechanism to pass state between the build command and pipeline-based hook implementations in cases where build artifacts are not a suitable choice. For example, our github-checks integration plugin will adopt this feature to ensure that build outputs are available as check output.

@toolmantim (Contributor) left a comment


Thanks so much for the PR!

I think this is a really useful option to add, for the sorts of use cases you mentioned.

One major change though, if you don't mind making it: can we change it to a standard option, without having to introduce a new env var prefix?

In the option description perhaps we can document that it's most commonly used as an environment variable on the agent machine?

@lox (Contributor) commented Jun 23, 2018

Apologies if I've missed something, but how does this differ from https://github.com/buildkite-plugins/docker-compose-buildkite-plugin#volumes-optional-run-only in the use-cases it addresses?

Is it just that it's designed to be agent level vs pipeline level?

@asford (Contributor, Author) commented Jun 23, 2018

@toolmantim I'm happy to change the variable name and/or configuration flow. I'm not sure I fully understand what you mean by "change it to a standard option". Would that mean naming it BUILDKITE_PLUGIN_DOCKER_COMPOSE_DEFAULT_VOLUMES, and then specifying default_volumes in the plugin.yml as an optional parameter? In that case, I assume the value would be passed through from the agent environment if it is not specified in the step's plugin configuration?

My, potentially minor, concern with that idea is that this configuration value isn't tightly coupled to docker-compose. We'll have several pipeline plugins that use this configuration value in their own docker invocations, and they may not be using compose. This issue is really more aesthetic than technical, and I entirely appreciate the consideration of not introducing an additional environment variable prefix. If you're +1 on the BUILDKITE_PLUGIN_DOCKER_COMPOSE_DEFAULT_VOLUMES interpretation above then I'm happy to make the change.

@asford (Contributor, Author) commented Jun 23, 2018

@lox Yes, this is intended to provide an additional layer of agent level volume configuration, rather than the pipeline level configuration in the plugin volumes option.

@toolmantim (Contributor) commented:

Thanks for the full explanation!

Yep, I was suggesting to rename it to the standard variable names, just because then its behaviour matches all the config options (which can also be set via env vars), if we add env/yaml validation it'll work with it, and if we auto-gen docs based on plugin.yml it won't be left out. And it's just the thing people are familiar with.

It'd mean that in your agent hook, you'd need to do something like this:

DEFAULT_VOLUMES="/buildkite:/buildkite;./app/app"

export BUILDKITE_PLUGIN_DOCKER_COMPOSE_DEFAULT_VOLUMES="${DEFAULT_VOLUMES}"
export BUILDKITE_PLUGIN_DOCKER_DEFAULT_VOLUMES="${DEFAULT_VOLUMES}"
# any other plugins

Is that okay?

If there was some way for plugins to access hook-exported env vars, separately from pipeline.yml settings, then maybe we wouldn't need this additional DEFAULT_{option} property. But I don't think that's currently possible; they're just overridden, with no way to access the old value. Adding a DEFAULT_{option} for picking up values set from hooks is probably a legit pattern in general for this.

We should probably add the (semi-colon delimited) string support to the existing volumes option too, whilst we're here?

My vote as a way forward would be for each of the Docker plugins to:

  • Add a new default-volumes option. In the doc for it, we should make sure to mention the env var name (or an example).
  • Support a semi-colon delimited string value for both the volumes and default-volumes options (a parsing sketch follows below).
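
For the second point, here is a hypothetical sketch of how a plugin hook could split a semi-colon delimited default-volumes value into docker-compose run arguments; the variable and array names are illustrative, not the plugin's actual implementation:

# Hypothetical hook snippet: turn a semi-colon delimited default-volumes
# value into repeated --volume arguments for docker-compose run.
default_volumes="${BUILDKITE_PLUGIN_DOCKER_COMPOSE_DEFAULT_VOLUMES:-}"
run_args=()
if [[ -n "$default_volumes" ]]; then
  IFS=';' read -ra volumes <<< "$default_volumes"
  for vol in "${volumes[@]}"; do
    # trim surrounding whitespace and skip empty entries
    vol="$(echo "$vol" | sed -e 's/^ *//' -e 's/ *$//')"
    [[ -n "$vol" ]] && run_args+=(--volume "$vol")
  done
fi
# run_args can then be appended to the docker-compose run invocation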

@lox (Contributor) commented Jun 24, 2018

After some thought, I wonder whether this change is at the right layer of abstraction. I suspect there should be a distinction between agent-wide docker customizations and docker settings for a particular step. I'd always imagined that we'd apply them via a custom docker bootstrap and a docker socket proxy like https://github.com/buildkite/sockguard.

What if we added a way to ensure certain volumes are mounted in sockguard, which you would set up in a custom bootstrap like https://github.com/buildkite/docker-bootstrap-example, or a pre-command hook in a plugin. The invocation might look like:

sockguard \
  --upstream-socket /var/run/docker.sock \
  --volume /my/llamas/dir:/llamas &

export DOCKER_HOST="unix://$PWD/sockguard.sock"

Then subsequent docker invocations in the job would always include the /llamas volume.

I think using sockguard for agent-wide customizations of the docker environment will keep configuration manageable, whereas adding things like agent-wide settings or defaults for lots of different plugin config will get complicated quickly. Beyond that, it gives us lots of room for good security improvements.

@asford (Contributor, Author) commented Jun 24, 2018

That sounds like a reasonable solution to me; I wasn't aware of sockguard.

My experience with docker & docker-compose isn't exhaustive, particularly as we're getting into territory like proxying the docker socket. Just to restate/clarify, to make sure I'm not missing something:

At a high level, my goal is a solution that enables build commands and build plugins to run within docker containers with shared build-directory state. This is roughly equivalent to the container builder execution model, in which every build step runs in a separate container with a shared /workspace directory. In our environment the buildkite-agent may also be running within a container; however, we don't want to couple the solution to that assumption.

We've solved this at uw-ipd by creating an external buildkite volume on the host machine, and then mounting this external volume at /buildkite in our agent container as well as our build command containers:

     +-----------------+
     | Agent Container |
     +-----------------+
          |        |
          |        | Invokes with docker-compose
          |        |
          |        |     +-------------------+
          |        +-----> Command Container +--+
  Mounts  |        |     +-------------------+  |
          |        |                            |
          |        |     +-------------------+  |
          |        +-----> Plugin Container  +-->
          |              +-------------------+  |
          |                                     |
          |                                     |
    +-----v--------------+     Mounts           |
    | Build State Volume <----------------------+
    +--------------------+

We need this mount process to be "transparent" at the level of pipeline.yml. The pipeline author should only need to state that the build step runs within the build working directory, and shouldn't be aware of whether this directory is a volume mount or a host directory bind mount. We're using this pull request, as currently implemented at 2766c17, to pass this volume mount configuration from the agent environment into the command invocations.

We're currently developing several "reporting" style pipeline plugins, which will execute the plugin steps within a defined container. Several buildkite plugins adopt this strategy; however, the majority assume that the buildkite-agent is running directly on the host and attempt to bind-mount the build directory. This assumption is often incompatible with agent-within-docker configurations.

Current Solution at uw-ipd/buildkite-agent

  1. An external buildkite volume is created to contain build state.
  2. buildkite-agent runs within a container, mounting buildkite:/buildkite, defining BUILDKITE_DOCKER_DEFAULT_VOLUMES=buildkite:/buildkite, and creating build working directories under /buildkite/builds (a host-side sketch of this setup follows the list).
  3. The docker-compose command execution plugin reads BUILDKITE_DOCKER_DEFAULT_VOLUMES, making the build working directory available to command containers via the buildkite:/buildkite volume mount.
  4. Custom plugins also read BUILDKITE_DOCKER_DEFAULT_VOLUMES and execute plugin logic via docker-compose, also mounting the buildkite:/buildkite volume.
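
For reference, a host-side sketch of steps 1 and 2 (the image name, token handling, and flags are illustrative placeholders, not the exact uw-ipd setup):

# 1. Create the external volume that holds all build state.
docker volume create buildkite

# 2. Run the agent in a container, mounting the volume and advertising it to
#    docker-compose based steps (image and token are placeholders).
docker run -d \
  --name buildkite-agent \
  -v buildkite:/buildkite \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -e BUILDKITE_BUILD_PATH=/buildkite/builds \
  -e BUILDKITE_DOCKER_DEFAULT_VOLUMES=buildkite:/buildkite \
  -e BUILDKITE_AGENT_TOKEN=... \
  buildkite/agent:3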

Renamed environment variables (@toolmantim's)

  1. An external buildkite volume is created to contain build state.
  2. buildkite-agent runs within a container, mounting buildkite:/buildkite, defining BUILDKITE_PLUGIN_[DOCKER_COMPOSE|DOCKER|...]_DEFAULT_VOLUMES=buildkite:/buildkite, and creating build working directories under /buildkite/builds.
  3. The docker-compose command execution plugin reads BUILDKITE_PLUGIN_DOCKER_COMPOSE_DEFAULT_VOLUMES, making the build working directory available to command containers via the buildkite:/buildkite volume mount.
  4. Custom plugins read BUILDKITE_PLUGIN_[...]_DEFAULT_VOLUMES (or just read BUILDKITE_PLUGIN_DOCKER_COMPOSE_DEFAULT_VOLUMES) and execute plugin logic via docker-compose, also mounting the buildkite:/buildkite volume.

sockguard-based volume mounts (@lox's)

  1. An external buildkite volume is created to contain build state.
  2. buildkite-agent runs within a container, mounting buildkite:/buildkite, and creates build working directories under /buildkite/builds. The agent sets DOCKER_HOST to a sockguard socket, configured to add the buildkite:/buildkite volume mount to all invocations.
  3. The docker-compose command execution plugin executes normally; sockguard adds the buildkite:/buildkite volume mount.
  4. Custom plugins execute plugin logic via docker-compose; sockguard adds the buildkite:/buildkite volume mount.

@asford (Contributor, Author) commented Jun 24, 2018

@lox

I see the advantage of making the additional volume mount transparent from the perspective of command/plugin docker invocations; however, I'm not a huge fan of the level of indirection this adds to the system. It means that proper execution of the build command will be tightly, and implicitly, coupled to sockguard adding a volume/host mount. In general, I think the advantage of the docker-compose paradigm is that state dependencies are explicit, even if more verbose.

I'd also be hesitant to overload sockguard's semantics, moving it from a permissions guard to a tool which modifies the container's runtime environment. There are already many ways of configuring a container runtime (compose files, compose file overrides, command line flags, ...) and I'd prefer to use one of those solutions if possible.

@toolmantim (Contributor) commented:

Interesting @lox. I always thought sockguard was more for doing the opposite of what we're doing here: hiding a sensitive volume mount from a Docker command in a build job when running untrusted builds, rather than adding a host volume for trusted builds to help with shared state. 🤔

@toolmantim (Contributor) commented:

I’m not really sure what’s more correct. Maybe both are?

Perhaps it's worth just moving ahead with this small change in the meantime, and we can always relook at it in a world where peeps are using sockguard?

@lox (Contributor) commented Jun 24, 2018

Yup, after some more thinking, I think I agree that sockguard isn't quite the right spot for it. It seems like the root problem here is that we are seeing lots of plugins use a volume mount of $PWD to mount the working directory into a container, but that requires whatever $PWD is to be accessible from the docker host, which it frequently isn't, e.g. when the agent is running in a container.

I think I agree that there should be some sort of convention for overriding that in plugins that use docker, @asford. What if, rather than the generic BUILDKITE_DOCKER_DEFAULT_VOLUMES, we had something more specific like BUILDKITE_BUILD_CHECKOUT_DOCKER_MOUNT, which would default (when not set) to BUILDKITE_BUILD_CHECKOUT_PATH, but could be overridden for when you have your checkouts in a docker volume (that will apply to kubernetes too).

For instance https://github.com/buildkite-plugins/golang-buildkite-plugin/blob/e5624da1a545afd740adfa8e9cae745a85c3fd1c/hooks/command#L14 would become:

args=(
  "--rm"
  "--volume" "${BUILDKITE_BUILD_CHECKOUT_DOCKER_MOUNT:-$PWD}:/go/src/${import}"
  "--workdir" "/go/src/${import}"
)

Then we'd have to add support to every plugin for that specific env var, and probably a linter check too. Almost all of the plugins in existence now basically don't work unless $PWD in the agent container can be volume-mounted into the running container.

Not sure how we'd apply that same fix for other things that plugins try to mount in, for instance /usr/local/bin/buildkite-agent and /usr/local/bin/docker. That's kind of why my thinking was that the problem needed to be solved at the docker socket level, for instance with path rewriting or similar.

It's gross either way 🤷🏼‍♂️

@lox (Contributor) commented Jun 24, 2018

Worth noting: I reckon for the model you are talking about @asford, you absolutely want a container per job run, using a bootstrap like https://github.com/buildkite/docker-bootstrap-example.

@toolmantim (Contributor) commented:

I don’t want to block simply adding another option to the plugin which lets you set up some default values for the volumes option at a hook level, in a generic way.

The /buildkite/builds mount and $PWD sound like bigger problems?

It’d be adding this to both Docker plugins:

  • Add a new default-volumes option
  • Support a boolean false value for default-volumes so it can be disabled per-job
  • Support semi-colon delimited string versions of both volumes and default-volumes options

I like your idea @lox of a default bind mount environment variable convention that we can use everywhere. I’m happy to ship that too, across the board. But probably just in addition to the above changes?

@lox (Contributor) commented Jun 27, 2018

Yup yup. I'm interested to hear what you think, @asford, about whether BUILDKITE_BUILD_CHECKOUT_DOCKER_MOUNT as a general thing across docker related plugins would solve your problem vs default volumes. It's possible it's just a slightly different name for the thing you initially suggested?

@asford (Contributor, Author) commented Jun 27, 2018

The BUILDKITE_BUILD_CHECKOUT_DOCKER_MOUNT approach solves the immediate problem I've outlined above, but I don't think it's a general solution to the "build-agent-within-container spawning build-[step|plugin]-within-container" problems I can envision in the near future.

There are a number of different bits of state that may need to be available within the step container including:

  1. The build directory.
  2. The buildkite-agent binary (as you identified above), configuration (if not fully specified by env vars) and hooks.
  3. Build-related secrets (external service keys, access tokens, ...).
  4. ¯\_(ツ)_/¯ who knows what else? (In our app, for example, we may need access to 60+ GB sequence or biomolecular structure databases on a remote volume.)

My goal with the /buildkite mount solution above is to store everything buildkite-related (checkout, agent binary, hooks, ...) under a directory prefix that is shared across all containers, so that paths to build components (including `pwd`, `which buildkite-agent` and $BUILDKITE_BUILD_CHECKOUT_PATH) are valid in all containers. By storing /buildkite/builds, /buildkite/bin, and /buildkite/hooks in the buildkite volume I'm able to resolve (1) & (2) with a single mount.

My intention with the default volumes solution is to solve (3) and (4) above by adding the appropriate secrets and database mounts to the default mounts, and configure environment variables pointing to the corresponding paths.

To be specific, our end value of BUILDKITE_PLUGIN_[DOCKER_COMPOSE|DOCKER|...]_DEFAULT_VOLUMES will likely be something akin to buildkite:/buildkite; buildkite-secrets:/buildkite-secrets; refseq:/database/refseq; pdb:/database/pdb, with refseq and pdb provided by a network volume driver.
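
To sketch what that could look like in an agent environment hook (the database path env var names below are invented purely for illustration):

# Hypothetical agent environment hook for the configuration described above.
export BUILDKITE_PLUGIN_DOCKER_COMPOSE_DEFAULT_VOLUMES="buildkite:/buildkite;buildkite-secrets:/buildkite-secrets;refseq:/database/refseq;pdb:/database/pdb"
# Point build steps at the mounted paths.
export BUILDKITE_SECRETS_PATH=/buildkite-secrets
export REFSEQ_DATABASE_PATH=/database/refseq
export PDB_DATABASE_PATH=/database/pdb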

Admittedly, this is getting into a relatively specialized configuration but that's where Buildkite really shines.

@lox I believe that this approach will work over kubernetes, but as I'm not running on kube I'm not certain. Would you mind commenting on that?

@lox (Contributor) commented Jun 28, 2018

The BUILDKITE_BUILD_CHECKOUT_DOCKER_MOUNT approach solves the immediate problem I've outlined above, but I don't think it's a general solution to the "build-agent-within-container spawning build-[step|plugin]-within-container" problems I can envision in the near future.

Yup, I hear you. I'm keen to see how your vision of this plays out, so let's 🚢 it.

@toolmantim my take would be to roll with BUILDKITE_DOCKER_DEFAULT_VOLUMES vs a plugin property. I think this constitutes behaviour that is configured externally to a plugin, as it will be supported by many plugins.

@lox (Contributor) commented Jun 28, 2018

One random thing @asford: we discovered the other day that docker-compose doesn't support relative paths in volume mounts you provide over the CLI.

@toolmantim (Contributor) commented:

Yep! In that case, let’s :shipit:

@toolmantim (Contributor) commented:

Thank you for helping us talk it through @asford! 🙏🏼

@toolmantim changed the title from "Add support for environment-specific default volume mounts." to "Add support for default volume mounts" on Jun 28, 2018
@toolmantim merged commit cdbd208 into buildkite-plugins:master on Jun 28, 2018
@lox (Contributor) commented Jun 29, 2018

Failed in CI:

[screenshot of the CI failure]

@toolmantim (Contributor) commented Jun 29, 2018

I wondered what that line was up to 🤔

@asford (Contributor, Author) commented Jul 2, 2018

Yikes, sorry about that. I was just running the bats tests locally.

Fixed the underlying issue in #159.

Shameless plug, but y'all should consider using this github checks buildkite plugin so external contributors can see CI output from pulls.

@toolmantim (Contributor) commented:

Thanks for the fix @asford! We actually don't run builds for third-party pull requests at all, so there'd be no plugins to run. But we're close to getting there soon.

Another shameless plug though: have you checked out bksr? You can run bksr --all locally to run all the pipeline steps.

@toolmantim (Contributor) commented:

This was released in v2.5.0 (✈️ Your Flight Has Been Cancelled)
