Proposal: embed build sources in image config #2269

Closed
tonistiigi opened this issue Jul 20, 2021 · 0 comments · Fixed by #2311
tonistiigi commented Jul 20, 2021

Once an image has been built, it is useful to know what the dependencies of that specific build were. This makes it possible to figure out whether any of the dependencies have been updated and the build should be run again. In the future this could also be used as a way to pin the dependencies to a specific digest for reproducibility.

LLB has a Source operation for these cases: a container image, git commit, HTTP URL, or local directory. Everything but the local directory can be tracked with an immutable digest based only on the LLB definition.

When this immutable digest is computed in CacheMap()

CacheMap(context.Context, session.Group, int) (*CacheMap, bool, error)

we can extend the returned CacheMap struct with extra information that is later added to the image config. Because the solver package is generic and doesn't know about LLB/snapshots, I think it should just be a string map. I don't think it makes sense to reuse the existing CacheOpts field for this (@sipsma).

ResolveResponse map[string]string

{ "container-image://docker.io/library/alpine:3.13": "sha256:deadbeef" }
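A minimal sketch of what the extended struct could look like, assuming a trimmed-down CacheMap that carries only a digest and the proposed field. The helper imageSourceCacheMap is hypothetical and exists only to show how a source might fill the map after resolving a tag; the real struct in the solver package has other fields that are omitted here.

```go
package main

import "fmt"

// CacheMap is a simplified stand-in for the solver's CacheMap struct.
// Only the field proposed in this issue is modelled in detail.
type CacheMap struct {
	// Digest is the cache key computed for the vertex.
	Digest string
	// ResolveResponse maps a source identifier to the immutable value
	// it resolved to. It is a plain string map so the generic solver
	// package does not need to know about LLB or snapshots.
	ResolveResponse map[string]string
}

// imageSourceCacheMap is a hypothetical helper: it shows what an image
// source could return once a tag has been resolved to a digest.
func imageSourceCacheMap(ref, resolvedDigest string) *CacheMap {
	return &CacheMap{
		Digest: resolvedDigest,
		ResolveResponse: map[string]string{
			"container-image://" + ref: resolvedDigest,
		},
	}
}

func main() {
	cm := imageSourceCacheMap("docker.io/library/alpine:3.13", "sha256:deadbeef")
	for id, pin := range cm.ResolveResponse {
		fmt.Printf("%s -> %s\n", id, pin)
	}
}
```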

When the solver runs the build, it already stores the CacheMap value for all the vertexes running as part of the build. Before returning the CachedResult

return j.list.s.build(ctx, e)

it can walk back through all the parent vertexes, gather their ResolveResponse values, and combine them into a single structure that is returned from the Build() function. The extra return value is needed because Metadata in CachedResult is not typed. Maybe it should be, but that is for a different proposal.
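The walk over parent vertexes could be sketched roughly as follows. The vertex type and gatherResolveResponses are hypothetical simplifications, not the solver's actual types: each vertex only knows its inputs and the ResolveResponse map stored with its CacheMap. A visited set keeps shared parents from being processed twice.

```go
package main

import "fmt"

// vertex is a simplified, hypothetical stand-in for a solver vertex.
type vertex struct {
	name            string
	inputs          []*vertex
	resolveResponse map[string]string
}

// gatherResolveResponses walks from the result vertex back through all
// of its parents and merges their ResolveResponse maps into one map.
func gatherResolveResponses(root *vertex) map[string]string {
	out := map[string]string{}
	visited := map[*vertex]struct{}{}

	var walk func(v *vertex)
	walk = func(v *vertex) {
		if v == nil {
			return
		}
		if _, seen := visited[v]; seen {
			return // vertex shared by several children
		}
		visited[v] = struct{}{}
		for id, pin := range v.resolveResponse {
			out[id] = pin
		}
		for _, in := range v.inputs {
			walk(in)
		}
	}
	walk(root)
	return out
}

func main() {
	base := &vertex{
		name: "alpine",
		resolveResponse: map[string]string{
			"container-image://docker.io/library/alpine:3.13": "sha256:deadbeef",
		},
	}
	src := &vertex{
		name: "buildx",
		resolveResponse: map[string]string{
			"git://github.com/docker/buildx#master": "sha1:deadbeef",
		},
	}
	// Two exec vertexes share the same base image vertex.
	build := &vertex{name: "build", inputs: []*vertex{base, src}}
	final := &vertex{name: "final", inputs: []*vertex{base, build}}

	for id, pin := range gatherResolveResponses(final) {
		fmt.Printf("%s -> %s\n", id, pin)
	}
}
```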

Now this structure can be passed to the exporter. The image exporter would add it as an extra field. As this is BuildKit specific, I think it makes sense to use an approach similar to what we do with the inline build cache: a single base64-encoded string with a BuildKit-specific name.

"moby.buildkit.buildinfo.v0": <base64>

Base64 decodes to

{
  "sources": [
    {
      "type": "image",
      "ref": "docker.io/library/alpine:3.13",
      "pin": "sha256:"
    },
    {
      "type": "git",
      "ref": "github.com/docker/buildx#master",
      "pin": "sha1:deadbeef"
    }
  ]
}
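A rough sketch of how such a payload could be encoded and decoded. The Source and BuildInfo types and the two helper functions are hypothetical names chosen for illustration; standard (padded) base64 is also an assumption, since the proposal only says "base64".

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// buildInfoKey is the image config key named in the proposal.
const buildInfoKey = "moby.buildkit.buildinfo.v0"

// Source mirrors one entry of the "sources" array shown above.
type Source struct {
	Type  string `json:"type"`
	Ref   string `json:"ref"`
	Alias string `json:"alias,omitempty"`
	Pin   string `json:"pin"`
}

// BuildInfo is the decoded form of the base64 payload.
type BuildInfo struct {
	Sources []Source `json:"sources"`
}

// encodeBuildInfo marshals the struct to JSON and base64-encodes it.
func encodeBuildInfo(bi BuildInfo) (string, error) {
	dt, err := json.Marshal(bi)
	if err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(dt), nil
}

// decodeBuildInfo reverses encodeBuildInfo.
func decodeBuildInfo(s string) (BuildInfo, error) {
	var bi BuildInfo
	dt, err := base64.StdEncoding.DecodeString(s)
	if err != nil {
		return bi, err
	}
	err = json.Unmarshal(dt, &bi)
	return bi, err
}

func main() {
	bi := BuildInfo{Sources: []Source{
		{Type: "image", Ref: "docker.io/library/alpine:3.13", Pin: "sha256:deadbeef"},
		{Type: "git", Ref: "github.com/docker/buildx#master", Pin: "sha1:deadbeef"},
	}}
	enc, err := encodeBuildInfo(bi)
	if err != nil {
		panic(err)
	}
	// The exporter would store the encoded string under buildInfoKey.
	fmt.Printf("%q: %q\n", buildInfoKey, enc)

	dec, err := decodeBuildInfo(enc)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(dec.Sources), "sources decoded")
}
```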

There is one special case to take into account. A frontend might have already transformed a string the user typed before generating LLB. E.g., in Dockerfile this happens for FROM images, because Dockerfile needs to load their image config in the frontend in order to access env/onbuild etc. While doing that, Dockerfile always adds the digest to the image ref so that the LLB solve always points to the same image. So in LLB we already have the digest ref, but in the embedded buildinfo it would be better to show the original value.

The solution for this is that the Dockerfile frontend can create its own moby.buildkit.buildinfo.v0 key in the image config for the values it sees, and the image exporter can then fix it up after the full solve. This is similar to how the history array works atm, with Dockerfile adding the command strings and the exporter filling in dates etc. later in patchImageConfig(). Dockerfile can add a record like:

{
   "type": "image",
   "ref": "docker.io/library/alpine:3.13",
   "alias": "docker.io/library/alpine:3.13@sha256:",
   "pin": "sha256:"
}

That way, when LLB adds a source for "docker.io/library/alpine:3.13@sha256:", the exporter fixes it up and alpine:3.13 is used as the original ref instead.
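The fix-up step in the exporter might look something like the sketch below. fixupSources and the aliasKey type are hypothetical; the matching rule (same type, frontend alias equal to the ref seen in LLB) is an assumption read from the description above, and the digest value is a placeholder.

```go
package main

import "fmt"

// Source mirrors one buildinfo record, including the frontend-only alias.
type Source struct {
	Type  string
	Ref   string
	Alias string
	Pin   string
}

// aliasKey identifies a frontend record by source type and alias.
type aliasKey struct{ typ, ref string }

// fixupSources takes the records written by the frontend and the sources
// gathered from the solve. When a solved source's ref matches a frontend
// alias of the same type, the ref the user originally typed is restored.
// The pin reported by the solve is kept as-is.
func fixupSources(frontend, solved []Source) []Source {
	byAlias := make(map[aliasKey]Source, len(frontend))
	for _, s := range frontend {
		if s.Alias != "" {
			byAlias[aliasKey{s.Type, s.Alias}] = s
		}
	}
	out := make([]Source, 0, len(solved))
	for _, s := range solved {
		if fe, ok := byAlias[aliasKey{s.Type, s.Ref}]; ok {
			s.Ref = fe.Ref
		}
		out = append(out, s)
	}
	return out
}

func main() {
	frontend := []Source{{
		Type:  "image",
		Ref:   "docker.io/library/alpine:3.13",
		Alias: "docker.io/library/alpine:3.13@sha256:deadbeef",
		Pin:   "sha256:deadbeef",
	}}
	solved := []Source{
		{Type: "image", Ref: "docker.io/library/alpine:3.13@sha256:deadbeef", Pin: "sha256:deadbeef"},
		{Type: "git", Ref: "github.com/docker/buildx#master", Pin: "sha1:deadbeef"},
	}
	for _, s := range fixupSources(frontend, solved) {
		fmt.Printf("%s %s %s\n", s.Type, s.Ref, s.Pin)
	}
}
```

Sources that the frontend never recorded, such as the git source here, pass through unchanged.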

We can start by adding this frontend component in Dockerfile and extend it to support full LLB.

I think this can be enabled by default. There shouldn't be any security concern in exposing the source images; mostly this information is already in the history array in textual form. But we should provide a way to opt out with a special key in -o.
