Proposal: embed build sources in image config #2269

tonistiigi · 2021-07-20T01:48:14Z

Once an image has been built, it is useful to know what the dependencies of that specific build were. This makes it possible to determine whether any of the dependencies have been updated and the build should be run again. In the future, this could also be used as a way to pin the dependencies to a specific digest for reproducibility.

LLB has a Source operation for these cases: a container image, git commit, HTTP URL, or local directory. Everything but the local directory can be tracked with an immutable digest based only on the LLB definition.

When this immutable digest is computed in CacheMap() (buildkit/solver/types.go, line 147 in 1879325):

CacheMap(context.Context, session.Group, int) (*CacheMap, bool, error)

we can extend the return structure (buildkit/solver/types.go, line 160 in 1879325):

type CacheMap struct {

with extra information that is later added to the image config. Because the solver package is generic and doesn't know about LLB/snapshots, I think it should just be a string map. I don't think it makes sense to reuse the existing CacheOpts field for this (@sipsma).

ResolveResponse map[string]string

{ "container-image://docker.io/library/alpine:3.13": "sha256:deadbeef" }
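As a rough illustration of the proposed field, here is a minimal Go sketch. The CacheMap type below is a simplified stand-in, not the real solver struct: only the ResolveResponse field comes from this proposal, while the Digest field and the imageSourceCacheMap helper are hypothetical names used for the example.

```go
package main

import "fmt"

// CacheMap is a simplified stand-in for the solver's CacheMap struct.
// Only ResolveResponse comes from the proposal; Digest is a hypothetical
// placeholder for the existing cache key data.
type CacheMap struct {
	Digest string
	// ResolveResponse maps a source identifier to the immutable digest
	// that the source resolved to.
	ResolveResponse map[string]string
}

// imageSourceCacheMap is a hypothetical helper showing how an image source
// could fill in the new field while computing its cache map.
func imageSourceCacheMap(ref, dgst string) *CacheMap {
	return &CacheMap{
		Digest: dgst,
		ResolveResponse: map[string]string{
			"container-image://" + ref: dgst,
		},
	}
}

func main() {
	// "sha256:deadbeef" is the same placeholder digest used in the example above.
	cm := imageSourceCacheMap("docker.io/library/alpine:3.13", "sha256:deadbeef")
	for id, dgst := range cm.ResolveResponse {
		fmt.Println(id, "=>", dgst)
	}
}
```

Keeping the value a plain string map means the solver never has to interpret the keys; only the code that produces and consumes them needs to agree on their format.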

When the solver runs the build, it already stores the CacheMap value for all the vertexes running as part of the build. Before returning CachedResult (buildkit/solver/jobs.go, line 508 in 1879325):

return j.list.s.build(ctx, e)

it can walk back through all the parent vertexes, gather their ResolveResponse values, and combine them into a single structure that is returned from the Build() function. The extra return value is needed because Metadata in CachedResult is not typed. Maybe it should be, but that is for a different proposal.
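The walk over parent vertexes could look roughly like the following Go sketch. The vertex type and the collectSources function are hypothetical simplifications for illustration and do not mirror the solver's actual data structures; the "git://" key format is also an assumption.

```go
package main

import "fmt"

// vertex is a hypothetical, minimal stand-in for a solved vertex that
// remembers the ResolveResponse of its CacheMap and its parent vertexes.
type vertex struct {
	name            string
	resolveResponse map[string]string
	parents         []*vertex
}

// collectSources walks from the result vertex back through all parents and
// merges their ResolveResponse maps into a single map.
func collectSources(root *vertex) map[string]string {
	out := map[string]string{}
	seen := map[*vertex]bool{} // a vertex can be reachable via several paths
	var walk func(v *vertex)
	walk = func(v *vertex) {
		if v == nil || seen[v] {
			return
		}
		seen[v] = true
		for id, dgst := range v.resolveResponse {
			out[id] = dgst
		}
		for _, p := range v.parents {
			walk(p)
		}
	}
	walk(root)
	return out
}

func main() {
	alpine := &vertex{name: "alpine", resolveResponse: map[string]string{
		"container-image://docker.io/library/alpine:3.13": "sha256:deadbeef",
	}}
	buildx := &vertex{name: "buildx", resolveResponse: map[string]string{
		"git://github.com/docker/buildx#master": "sha1:deadbeef", // assumed key format
	}}
	run := &vertex{name: "exec", parents: []*vertex{alpine, buildx}}
	result := &vertex{name: "copy", parents: []*vertex{run, alpine}}

	fmt.Println(len(collectSources(result)), "sources found")
}
```

The seen set matters because the build graph is a DAG: the same source vertex is often a parent of several later steps and should only be visited once.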

Now this structure can be passed to the exporter. The image exporter would add it as an extra field. As this is BuildKit-specific, I think it makes sense to do something similar to what we do with the inline build cache: use a single base64-encoded string with a BuildKit-specific name.

"moby.buildkit.buildinfo.v0": <base64>

Base64 decodes to

{ "sources": [
{
   "type": "image",
   "ref": "docker.io/library/alpine:3.13",
   "pin": "sha256:"
},
{
   "type": "git",
   "ref": "github.com/docker/buildx#master",
   "pin": "sha1:deadbeef"
}
] }
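To make the encoding step concrete, here is a small Go sketch of a round trip through base64. The BuildInfo and Source type names, the helper functions, and the choice of standard (padded) base64 are assumptions for illustration; only the key name and the JSON field names come from the proposal.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// Source and BuildInfo are hypothetical Go types mirroring the JSON above.
type Source struct {
	Type string `json:"type"`
	Ref  string `json:"ref"`
	Pin  string `json:"pin"`
}

type BuildInfo struct {
	Sources []Source `json:"sources"`
}

// buildInfoKey is the image config key named in the proposal.
const buildInfoKey = "moby.buildkit.buildinfo.v0"

// encodeBuildInfo marshals the build info to JSON and base64-encodes it.
// Standard base64 encoding is an assumption of this sketch.
func encodeBuildInfo(bi BuildInfo) (string, error) {
	dt, err := json.Marshal(bi)
	if err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(dt), nil
}

// decodeBuildInfo reverses encodeBuildInfo.
func decodeBuildInfo(s string) (BuildInfo, error) {
	var bi BuildInfo
	dt, err := base64.StdEncoding.DecodeString(s)
	if err != nil {
		return bi, err
	}
	err = json.Unmarshal(dt, &bi)
	return bi, err
}

func main() {
	enc, err := encodeBuildInfo(BuildInfo{Sources: []Source{
		// "sha256:deadbeef" is a placeholder digest.
		{Type: "image", Ref: "docker.io/library/alpine:3.13", Pin: "sha256:deadbeef"},
	}})
	if err != nil {
		panic(err)
	}
	// A plain map stands in for the image config here.
	config := map[string]string{buildInfoKey: enc}

	dec, err := decodeBuildInfo(config[buildInfoKey])
	if err != nil {
		panic(err)
	}
	fmt.Println(dec.Sources[0].Ref)
}
```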

There is one special case to take into account. A frontend might have already transformed a string the user typed before generating LLB. E.g., in Dockerfile this happens for FROM images, because Dockerfile needs to load their image config in the frontend in order to access env/onbuild etc. While doing that, Dockerfile always adds the digest to the image ref so that the LLB solve always points to the same image. So in LLB we already have the digest ref, but in the embedded buildinfo it would be better to show the original value.

The solution for this is that the Dockerfile frontend can create its own moby.buildkit.buildinfo.v0 key in the image config for the values it sees, and then the image exporter can fix it up after the full solve. This is similar to how the history array currently works: Dockerfile adds the command strings and the exporter fills in dates etc. later in patchImageConfig(). Dockerfile can add a record like:

{
   "type": "image",
   "ref": "docker.io/library/alpine:3.13",
   "alias": "docker.io/library/alpine:3.13@sha256:",
   "pin": "sha256:"
}

When LLB then adds a source for "docker.io/library/alpine:3.13@sha256:", it is fixed up in the exporter and alpine:3.13 is used as the original ref instead.
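One way the exporter fix-up could work is sketched below in Go. The Source type and the fixupSources function are hypothetical, and matching on the alias and type is an assumption about how the records would be reconciled; "sha256:deadbeef" is a placeholder digest.

```go
package main

import "fmt"

// Source is a hypothetical record type; Alias is only set by the frontend.
type Source struct {
	Type  string
	Ref   string
	Alias string
	Pin   string
}

// fixupSources replaces the ref of each solved source with the original ref
// recorded by the frontend when the solved ref matches a frontend alias.
func fixupSources(frontend, solved []Source) []Source {
	byAlias := map[string]Source{}
	for _, s := range frontend {
		if s.Alias != "" {
			byAlias[s.Alias] = s
		}
	}
	out := make([]Source, 0, len(solved))
	for _, s := range solved {
		if f, ok := byAlias[s.Ref]; ok && f.Type == s.Type {
			s.Ref = f.Ref // show what the user originally typed
		}
		out = append(out, s)
	}
	return out
}

func main() {
	frontend := []Source{{
		Type:  "image",
		Ref:   "docker.io/library/alpine:3.13",
		Alias: "docker.io/library/alpine:3.13@sha256:deadbeef",
		Pin:   "sha256:deadbeef",
	}}
	solved := []Source{
		{Type: "image", Ref: "docker.io/library/alpine:3.13@sha256:deadbeef", Pin: "sha256:deadbeef"},
		{Type: "git", Ref: "github.com/docker/buildx#master", Pin: "sha1:deadbeef"},
	}
	for _, s := range fixupSources(frontend, solved) {
		fmt.Println(s.Type, s.Ref, s.Pin)
	}
}
```

Sources without a matching frontend record, such as the git source here, pass through unchanged, which fits the idea of starting with the Dockerfile frontend and extending to full LLB later.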

We can start by adding this frontend component in Dockerfile and extend it to support full LLB.

I think this can be enabled by default. There shouldn't be any security concern with exposing the source images, since most of this information is already in the history array in textual form. But we should provide a way to opt out with a special key in -o.


tonistiigi added the kind/enhancement label Jul 20, 2021

tonistiigi assigned crazy-max Jul 20, 2021

crazy-max mentioned this issue Jul 20, 2021

moby/buildkit#2269 crazy-max/buildkit#3

Closed

chris13524 mentioned this issue Jul 21, 2021

My schema'd dockerfile fails parsing GoogleContainerTools/skaffold#5907

Open

crazy-max mentioned this issue Aug 15, 2021

Generate and embed build sources #2311

Merged


tonistiigi closed this as completed in #2311 Sep 20, 2021


tonistiigi commented Jul 20, 2021 • edited by crazy-max
