dockerfile: implement hooks for RUN instructions #4669

Closed
AkihiroSuda wants to merge 4 commits


AkihiroSuda
Member

@AkihiroSuda AkihiroSuda commented Feb 19, 2024

Close #4576


e.g.,

buildctl build \
  --frontend dockerfile.v0 \
  --opt hook="$(cat hook.json)"

with hook.json as follows:

{
  "RUN": {
    "entrypoint": ["/dev/.dfhook/entrypoint"],
    "mounts": [
       {"from": "example.com/hook", "target": "/dev/.dfhook"},
       {"type": "secret", "source": "something", "target": "/etc/something"}
    ]
  }
}

This will let the frontend treat RUN foo as:

RUN \
  --mount=from=example.com/hook,target=/dev/.dfhook \
  --mount=type=secret,source=something,target=/etc/something \
  /dev/.dfhook/entrypoint foo

docker history will still show this as RUN foo.

Buildx integration

To specify --opt via buildx, see:

Eventually buildx should have a proper --hook=<FILE> option.
Probably, it should also read ~/.docker/buildx/hooks/*.json by default.

Use cases

Reproducible builds

A hook can wrap the apt-get command so that it uses snapshot.debian.org, reproducing package versions without modifying the Dockerfile.

The /dev/.dfhook/entrypoint script could look like this:

#!/bin/bash
set -eu -o pipefail

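# Default SOURCE_DATE_EPOCH to the mtime of the stock APT sources file
: "${SOURCE_DATE_EPOCH:=$(stat --format=%Y /etc/apt/sources.list.d/debian.sources)}"
# Format the epoch as a snapshot timestamp (bash printf uses the local time zone, assumed to be UTC here)
snapshot="$(printf "%(%Y%m%dT%H%M%SZ)T\n" "${SOURCE_DATE_EPOCH}")"
# Load VERSION_CODENAME
. /etc/os-release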

# Rewrite /etc/apt to use snapshot.debian.org
cp -a /etc/apt /etc/apt.bak
rm -f /etc/apt/sources.list.d/debian.sources
cat <<EOF >>/etc/apt/sources.list
deb [check-valid-until=no] http://snapshot.debian.org/archive/debian/${snapshot} ${VERSION_CODENAME} main
deb [check-valid-until=no] http://snapshot.debian.org/archive/debian-security/${snapshot} ${VERSION_CODENAME}-security main
deb [check-valid-until=no] http://snapshot.debian.org/archive/debian/${snapshot} ${VERSION_CODENAME}-updates main
EOF

# Run the command
set +e
"$@"
status=$?
set -e

# Restore /etc/apt
rm -rf /etc/apt
mv /etc/apt.bak /etc/apt

exit $status

A hook may also push/pull dpkg blobs to an OCI registry (or whatever) for efficient caching.
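
The backup, run, and restore steps of the entrypoint script above could also be factored into a small helper. The following is only a sketch: `with_restored_dir` is a hypothetical name that does not appear in this PR, and it captures the exit status with `|| status=$?` instead of toggling `set +e` and `set -e`.

```bash
#!/bin/bash
# Sketch only: with_restored_dir is a hypothetical helper, not part of the PR.
set -eu

# Back up DIR, run the given command, restore DIR, and return the command's status.
with_restored_dir() {
  local dir="$1"
  shift
  local backup="${dir}.bak"
  local status=0

  cp -a "${dir}" "${backup}"
  # Record a failure instead of aborting, so the restore below always runs.
  "$@" || status=$?
  rm -rf "${dir}"
  mv "${backup}" "${dir}"
  return "${status}"
}
```

With this helper, the rewrite of /etc/apt would have to happen inside the wrapped command, for example in a function that first writes the snapshot sources and then runs `"$@"`.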

Cross-compilation

xx-apt, etc. (https://github.com/tonistiigi/xx) can be reimplemented as a hook.

Malware detection

A hook may use seccomp or a similar mechanism to intercept syscalls and detect malicious actions.

Enterprise networking

Enterprise networks often require installing a MITM proxy cert.
This can be easily automated with a hook.
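
As a rough illustration of what such a hook entrypoint might do on a Debian-based image, the sketch below copies a CA certificate into the directory read by `update-ca-certificates` and then refreshes the trust store. The function name `install_proxy_ca`, the file name `proxy-ca.crt`, and the idea of delivering the certificate through a secret mount are assumptions made for this example; none of them come from the PR.

```bash
#!/bin/bash
# Sketch only: names and paths here are assumptions, not part of the PR.
set -eu

# Copy the CA certificate SRC into DEST_DIR, then run the refresh command
# (update-ca-certificates by default, as shipped in Debian's ca-certificates package).
install_proxy_ca() {
  local src="$1"
  local dest_dir="$2"
  local refresh="${3:-update-ca-certificates}"

  # Nothing to do when no certificate was mounted.
  if [ ! -f "${src}" ]; then
    return 0
  fi
  mkdir -p "${dest_dir}"
  cp "${src}" "${dest_dir}/proxy-ca.crt"
  "${refresh}"
}
```

An entrypoint could call `install_proxy_ca /run/secrets/proxy-ca.crt /usr/local/share/ca-certificates` (the first path being hypothetical) and then `exec "$@"`. Unlike the snapshot example, this sketch does not undo its changes, so the certificate would remain in the resulting layer.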

FAQs

  • Q. Why not just modify the Dockerfile?
    • A. Because it affects the history object in the OCI Image Config and decreases reproducibility.

@AkihiroSuda
Member Author

@tonistiigi @thaJeztah PTAL 🙏

@AkihiroSuda
Member Author

@tonistiigi Can we merge this so that we can run more experiments on reproducible builds, etc.?

@AkihiroSuda
Member Author

Rebased

@janjongboom

@AkihiroSuda One thing I don't like about this approach is that it separates the hook logic from the Dockerfile, so the logic for building the container is split. Most likely the container won't build correctly without the hooks (e.g. because it will use the non-snapshot package repository), but nothing in the Dockerfile itself indicates that. At least this would be clearer, and the build could throw a clean error if the hook is missing:

RUN --hook=dfhook foo

I'd prefer it even more if it's all in the same Dockerfile:

HOOK dfhook=--mount=from=example.com/hook,target=/dev/.dfhook \
  --mount=type=secret,source=something,target=/etc/something \
  /dev/.dfhook/entrypoint $1

RUN --hook=dfhook foo

Clear and self-contained, and easy to pass to other builders such as Kaniko. That would expand the Dockerfile vocabulary, though, so it sounds like a much bigger PR.

@rittneje
Contributor

@janjongboom The fact that it isn't in the Dockerfile is exactly what makes this feature so useful. Now we can for example mount the credentials for our mirror transparently, which has two benefits:

  1. Every Dockerfile author doesn't have to copy-paste how to do that.
  2. You can write a Dockerfile that "just works" locally (where it won't use the mirror), and also "just works" on the build server.

@gfyrag

gfyrag commented May 24, 2024

I'm in favor of it.
Not specifically the design, but having a way to handle cross-cutting concerns outside the Dockerfile.
This is exactly my use case: my company has a big monorepo with a lot of Dockerfiles (Earthfiles, in reality), and I can't really justify appending env vars everywhere just to plug in my local setup.
Something like the hooks that OCI containers offer would be very useful.

@thompson-shaun thompson-shaun modified the milestones: v0.14.0, v0.future May 30, 2024
@AkihiroSuda
Member Author

@thompson-shaun @tonistiigi Which release will contain this PR? 🙏

@AkihiroSuda
Member Author

Rebased.

@AkihiroSuda
Member Author

Rerebased

@AkihiroSuda
Member Author

Rererebased

@AkihiroSuda
Member Author

Rerererebased

@tonistiigi @thaJeztah PTAL 🙏

@AkihiroSuda
Member Author

@thompson-shaun @tonistiigi I moved this from the v0.future milestone to v0.16.0, as this PR has been open for a long time and discussed enough in #4576, but feel free to move this back to v0.future if there is a concern

Will be used in the follow-up commits for implementing "Dockerfile hooks"
(issue 4576)

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Close issue 4576

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
@AkihiroSuda
Member Author

I'm withdrawing this proposal and am going to implement a standalone translator that consumes a Dockerfile and generates a new one:

program-name-to-be-decided translate --hook=hook.json < Dockerfile > Dockerfile.new

The drawback of the new approach is that it can't reproduce the OCI Image Config digest, because it can't retain the docker history object.
This drawback might be acceptable in practice (although it looks quite ugly), as https://github.com/reproducible-containers/diffoci has an --ignore-history flag that allows comparing OCI Image Configs while excluding the docker history object.
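
To make the translation idea concrete, here is a deliberately naive sketch of such a rewrite in bash. It is not the planned translator: `translate_dockerfile` is a hypothetical name, the entrypoint and mounts are passed as arguments instead of being parsed from hook.json, and only upper-case, shell-form `RUN` lines without existing flags are handled (exec form, heredocs, and `RUN --mount=...` lines would need a real Dockerfile parser).

```bash
#!/bin/bash
# Sketch only: a naive line-based rewrite, not the planned translator.
set -eu

# Usage: translate_dockerfile ENTRYPOINT [MOUNT_SPEC...] < Dockerfile > Dockerfile.new
# Prefixes every shell-form "RUN <cmd>" line with the hook mounts and entrypoint.
translate_dockerfile() {
  local entrypoint="$1"
  shift
  local flags=""
  local m
  for m in "$@"; do
    flags="${flags}--mount=${m} "
  done

  local line
  # The extra -n check keeps a final line that lacks a trailing newline.
  while IFS= read -r line || [ -n "${line}" ]; do
    case "${line}" in
      "RUN "*) printf '%s\n' "RUN ${flags}${entrypoint} ${line#RUN }" ;;
      *)       printf '%s\n' "${line}" ;;
    esac
  done
}
```

Called as `translate_dockerfile /dev/.dfhook/entrypoint from=example.com/hook,target=/dev/.dfhook < Dockerfile`, it rewrites `RUN foo` into a single-line equivalent of the expanded form shown in the PR description. Continuation lines pass through unchanged because only the first line of an instruction starts with `RUN `.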

I'm closing #4669 but I still want the SOURCE_DATE_EPOCH PRs for DOI (docker-library/official-images#16044 (comment)) to be merged.

Originally posted by @AkihiroSuda in #4576 (comment)

@AkihiroSuda AkihiroSuda removed this from the v0.16.0 milestone Aug 30, 2024
5 participants