Add "Builder: oci-import" support #61
Conversation
(force-pushed from 5d57b56 to e99238e)
A file that shows off how I discovered that symlinks weren't working:

```
Maintainers: Foo (@bar)
GitRepo: https://github.com/tianon/docker-debian-artifacts.git
GitFetch: refs/heads/oci-arm32v5
Architectures: arm32v5
GitCommit: f9093e68545e964f67bb3b516017d8ec2033ec71
Builder: oci-import
File: index.json
Tags: bullseye, bullseye-20221114, 11.5, 11, latest
Directory: bullseye/oci
```

(Everything under https://github.com/tianon/docker-debian-artifacts/tree/f9093e68545e964f67bb3b516017d8ec2033ec71/bullseye/oci/blobs/sha256 is symlinks to more usefully-named files so that browsing the directory via GitHub is more user friendly.)
cmd/bashbrew/cmd-build.go (Outdated)
```go
for _, tag := range r.Tags(namespace, uniq, entry) {
	fmt.Printf("Tagging %s\n", tag)
	if !dryRun {
		err := dockerTag(cacheTag, tag)
		if err != nil {
			return cli.NewMultiError(fmt.Errorf(`failed tagging %q as %q`, cacheTag, tag), err)
		}
	}
}
```
Making this part of `docker build` has the side effect that if we skip `docker build` (because `bashbrew/cache:xxx` exists and is up-to-date) it might not get all the tags it's supposed to have. 😩
I see three primary options here:

- revert this part of the change and go back to `docker build` + `docker tag` in a loop
- rely on `docker build` cache exclusively, noting that we added this indirection specifically because that cache was horrifyingly slow at the large scale our builders typically run
- when our bashbrew cache exists already, only then do the `docker tag` loop
  - on Windows, it would be faster and more reliable to `docker inspect` each one before we `docker tag` it (and `docker tag` often fails, so it'd be useful to have it auto-retry once or twice, but only on Windows 😭); see the sketch after this list
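For illustration, here's a rough sketch (not code from this PR) of what that Windows-only "inspect before tag, retry on failure" idea could look like, shelling out to the docker CLI directly; the function names and retry count are assumptions:

```go
package sketch

import (
	"os/exec"
	"runtime"
	"strings"
)

// imageID returns the image ID that a reference currently resolves to,
// via "docker image inspect". (Minimal error handling; sketch only.)
func imageID(ref string) (string, error) {
	out, err := exec.Command("docker", "image", "inspect", "--format", "{{.Id}}", ref).Output()
	return strings.TrimSpace(string(out)), err
}

// ensureTag skips "docker tag" entirely when the tag already points at the
// cache image, and retries the tag a couple of times on Windows, where
// "docker tag" fails intermittently.
func ensureTag(cacheTag, tag string) error {
	cacheID, err := imageID(cacheTag)
	if err != nil {
		return err
	}
	if id, err := imageID(tag); err == nil && id == cacheID {
		return nil // tag is already correct; nothing to do
	}
	retries := 1
	if runtime.GOOS == "windows" {
		retries = 3 // auto-retry, but only on Windows
	}
	for i := 0; i < retries; i++ {
		if err = exec.Command("docker", "tag", cacheTag, tag).Run(); err == nil {
			return nil
		}
	}
	return err
}
```

The inspect step makes an already-correct tag a no-op, and the retry only kicks in on Windows, matching the flakiness described above.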
I guess we also could:

- do a basic `docker build` that we pass just `FROM bashbrew/cache:xxx` to abuse `docker build` as a multi-`docker tag` 😅 (see the sketch after this list)
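For illustration, a minimal sketch of that "abuse `docker build` as a multi-`docker tag`" idea, feeding a one-line synthetic Dockerfile on stdin (shelling out to the docker CLI here is an assumption for the example, not code from this PR):

```go
package sketch

import (
	"fmt"
	"os/exec"
	"strings"
)

// tagViaBuild applies every desired tag in a single "docker build" by feeding
// a synthetic one-line Dockerfile ("FROM <cacheTag>") on stdin and passing
// each tag as its own "-t" flag.
func tagViaBuild(cacheTag string, tags []string) error {
	args := []string{"build"}
	for _, tag := range tags {
		args = append(args, "-t", tag)
	}
	args = append(args, "-") // "-" reads the Dockerfile from stdin, with no build context

	cmd := exec.Command("docker", args...)
	cmd.Stdin = strings.NewReader(fmt.Sprintf("FROM %s\n", cacheTag))
	return cmd.Run()
}
```

Since the Dockerfile comes from stdin with no context directory, this is effectively a pure retag: one `docker build` invocation applies every `-t`.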
(To expound, a legit case where this gets hit is that we push a build, it works, it finishes, and then we add a new tag or change a tag, and pushing will now fail.)
Assumption: `docker build -t '' -t '' ...` is faster than multiple `docker tag` invocations, especially for Windows.

I think using `docker build` to get multiple tags at once is nice. It could speed up builds (and maybe help Windows fail less?). I'm not sure about using a single-line `FROM` Dockerfile; it feels bad, like something that could be inadvertently broken by BuildKit in the future.

As far as number two, we know that the classic `docker build` cache is slow, so we can revisit that possibility once we use BuildKit by default and can try relying on its cache (I'll start testing it).

On Windows we should do the inspect to see if that specific tag is already correct. 🙈

So I guess my conclusion is to probably use the hacky single-line `FROM` Dockerfile to do a multi-tag. If it breaks in the future we can hopefully just rely on the BuildKit build cache (or maybe convince `docker tag` to take multiple targets).
Codecov Report

```diff
@@            Coverage Diff             @@
##           master      #61      +/-   ##
==========================================
- Coverage   80.77%   73.10%   -7.67%
==========================================
  Files           6        7       +1
  Lines         619      714      +95
==========================================
+ Hits          500      522      +22
- Misses         94      162      +68
- Partials       25       30       +5
```
(force-pushed from 035f938 to b2b8385)
(last push was just a rebase)
```go
if strings.HasPrefix(target, "../") {
	return "", fmt.Errorf("unsupported: %q is a relative symlink outside the tree (%q)", f.Name, target)
}
```
This probably should be above the `path.Join`? And maybe a check on `path.Clean`?
It needs to be below the `path.Join` because a relative symlink to a directory outside of its directory but still within the Git repository is valid and safe (it's only leaving the Git repository that isn't safe, and that would actually error later, so this is still a best-effort check). I'll add some comments to make that more clear.

That being said, `path.Clean` is a good call-out: that would make this check's implementation safer. 😅
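To illustrate the `path.Clean` point, a minimal sketch of how such a best-effort check might be structured (assuming `name` is the symlink's path within the tree and `target` is its link target; this is not the PR's actual implementation):

```go
package sketch

import (
	"fmt"
	"path"
	"strings"
)

// resolveSymlinkTarget resolves a relative symlink target against the
// directory containing the symlink. path.Join applies path.Clean, so the
// result is normalized before checking whether it escapes the tree.
// (Absolute targets would need separate handling; omitted here.)
func resolveSymlinkTarget(name, target string) (string, error) {
	resolved := path.Join(path.Dir(name), target)
	if resolved == ".." || strings.HasPrefix(resolved, "../") {
		return "", fmt.Errorf("unsupported: %q is a relative symlink outside the tree (%q)", name, target)
	}
	return resolved, nil
}
```

Because `path.Join` already cleans the joined result, checking the resolved path (rather than the raw target) is what catches targets like `foo/../../../bar` that only escape the tree after normalization.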
Closes #51
In the case of base images (`debian`, `alpine`, `ubuntu`, etc), using a `Dockerfile` as our method of ingestion doesn't really buy us very much. It made sense at the time it was implemented ("all `Dockerfile`, all the time"), but at this point they're all some variation on `FROM scratch \n ADD foo.tar.xz / \n CMD ["/bin/some-shell"]`, and cannot reasonably be "rebuilt" when their base image changes (which is one of the key functions of the official images) since they _are_ the base images in question.

Functionally, consuming a tarball in this way isn't _that_ much different from consuming a raw tarball that's part of, say, an OCI image layout (https://github.com/opencontainers/image-spec/blob/v1.0.2/image-layout.md) -- it's some tarball plus some metadata about what to do with it. For less trivial images, there's a significant difference (and I'm not proposing to use this for anything beyond simple one-layer base images), but for a single layer this would be basically identical.

As a more specific use case, the Debian `rootfs.tar.xz` files are currently [100% reproducible](https://github.com/debuerreotype/debuerreotype). Unfortunately, some of that gets lost when it gets imported into Docker, and thus it takes some additional effort to get from the Docker-generated rootfs back to the original debuerreotype-generated file.

This adds the ability to consume an OCI image directly, to go even further and have a 100% fully reproducible image digest as well, which makes it easier to trace a given published image back to the reproducible source generated by the upstream tooling (especially if a given image is also pushed by the maintainer elsewhere).

Here's an example `oci-debian` file I was using for testing this:

```
Maintainers: Foo (@bar)
GitRepo: https://github.com/tianon/docker-debian-artifacts.git
GitFetch: refs/heads/oci-arm32v5
Architectures: arm32v5
GitCommit: d6ac440e7760b6b16e3d3da6f2b56736b9c10065
Builder: oci-import
File: index.json

Tags: bullseye, bullseye-20221114, 11.5, 11, latest
Directory: bullseye/oci

Tags: bullseye-slim, bullseye-20221114-slim, 11.5-slim, 11-slim
Directory: bullseye/slim/oci
```
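As an aside, a minimal sketch of what consuming such a layout might involve: reading the layout's `index.json` (the `File:` named above) from the entry's `Directory` and picking out the manifest descriptor. The struct fields mirror the OCI image-layout spec linked above; the function and expectations here are illustrative assumptions, not bashbrew's actual implementation.

```go
package sketch

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// descriptor and index mirror just the fields of the OCI image-spec types
// that this sketch needs.
type descriptor struct {
	MediaType   string            `json:"mediaType"`
	Digest      string            `json:"digest"`
	Size        int64             `json:"size"`
	Annotations map[string]string `json:"annotations,omitempty"`
}

type index struct {
	SchemaVersion int          `json:"schemaVersion"`
	Manifests     []descriptor `json:"manifests"`
}

// readLayoutIndex parses "<dir>/index.json" from an OCI image layout and
// returns the single manifest descriptor it expects to find there.
func readLayoutIndex(dir string) (*descriptor, error) {
	raw, err := os.ReadFile(filepath.Join(dir, "index.json"))
	if err != nil {
		return nil, err
	}
	var idx index
	if err := json.Unmarshal(raw, &idx); err != nil {
		return nil, err
	}
	if len(idx.Manifests) != 1 {
		return nil, fmt.Errorf("expected exactly one manifest in %s/index.json, found %d", dir, len(idx.Manifests))
	}
	return &idx.Manifests[0], nil
}
```

The returned descriptor's digest is exactly the reproducible image digest the description talks about: it identifies the manifest blob under `blobs/sha256/`, independent of any Docker-side re-import.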