distribution: separate layer and image config for v1 pushes #28
Rebase of balena-io-archive/docker#12
When content addressability was introduced in #17924, a compatibility layer for registry v1 pushes was added. When the engine is asked to push an image to a v1 registry it needs to create v1 IDs for the images.
The strategy so far has been to use the full imageID for the first v1 layer and the ChainID for all other layers, effectively creating as many v1 layers as there are in the image. Only the topmost layer contained the image configuration; the other layers had a dummy JSON containing only a parent reference.
This becomes problematic when the first layer of the image is big. Consider the following two Dockerfiles:
Both of these images will have the exact same layers, with the layer created by `RUN create_very_big_file` being the topmost one, but their imageIDs will differ since they have a different CMD and therefore different image configs. When pushing to a v1 registry, the `RUN create_very_big_file` layer will be pushed twice, once with the v1 ID set to foo's imageID and once with the v1 ID set to bar's imageID. Also, any clients wanting to pull those images won't realise it's the same layer and will proceed to download it twice.
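The duplication described above can be illustrated with a minimal sketch. It is not the engine's code: `v1Layer`, `legacyV1IDs`, and the ID strings are hypothetical names chosen for illustration, and the sketch assumes the layer that receives the imageID is the topmost one.

```go
package main

import "fmt"

// v1Layer is a hypothetical stand-in for one layer as pushed to a v1 registry.
type v1Layer struct {
	ID        string // v1 ID sent to the registry
	HasConfig bool   // true: full image config; false: dummy JSON with a parent reference
}

// legacyV1IDs models the old strategy under the assumptions stated above:
// the topmost layer takes the imageID, every other layer takes its ChainID.
// chainIDs is ordered from base to top.
func legacyV1IDs(imageID string, chainIDs []string) []v1Layer {
	out := make([]v1Layer, 0, len(chainIDs))
	for i, c := range chainIDs {
		if i == len(chainIDs)-1 {
			out = append(out, v1Layer{ID: imageID, HasConfig: true})
			continue
		}
		out = append(out, v1Layer{ID: c})
	}
	return out
}

func main() {
	// Both images share the same layers; only their configs (and imageIDs) differ.
	chain := []string{"chain-base", "chain-big"}
	foo := legacyV1IDs("image-foo", chain)
	bar := legacyV1IDs("image-bar", chain)

	// The big topmost layer ends up under two different v1 IDs,
	// so a v1 registry treats it as two unrelated layers.
	fmt.Println(foo[1].ID, bar[1].ID)
}
```

The lower layers keep identical IDs across both images, which is why only the topmost layer is affected.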
This commit solves this problem by separating the layers from the image configuration information when pushing to a v1 registry. To do this, all layers of an image are pushed with their ChainIDs and a synthetic top-level layer is created with its contents set to the EmptyLayer, its config set to the image config, and its v1 ID set to the imageID. This will have the side-effect of adding one layer.
To prevent new layers being piled on top of each other forever, the code checks if the topmost layer is already an empty layer and in that case it uses that for the image configuration.
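The new assignment, including the empty-layer check, might look roughly like the sketch below. The types and the `splitV1IDs` function are hypothetical and do not come from the engine. The sketch also assumes that reusing an existing empty top layer means giving that layer the imageID and the config.

```go
package main

import "fmt"

// layer is a hypothetical description of one image layer, ordered base to top.
type layer struct {
	ChainID string
	Empty   bool // true if the layer content is the empty layer
}

// v1Layer is a hypothetical stand-in for one layer as pushed to a v1 registry.
type v1Layer struct {
	ID        string
	Empty     bool
	HasConfig bool
}

// splitV1IDs models the new strategy: every layer is pushed under its ChainID,
// and the image config lives in an empty top layer whose v1 ID is the imageID.
func splitV1IDs(imageID string, layers []layer) []v1Layer {
	out := make([]v1Layer, 0, len(layers)+1)
	for _, l := range layers {
		out = append(out, v1Layer{ID: l.ChainID, Empty: l.Empty})
	}
	// If the topmost layer is already empty, reuse it for the config
	// instead of stacking another synthetic layer on top (assumption).
	if n := len(out); n > 0 && out[n-1].Empty {
		out[n-1].ID = imageID
		out[n-1].HasConfig = true
		return out
	}
	// Otherwise add one synthetic empty layer that carries the config.
	return append(out, v1Layer{ID: imageID, Empty: true, HasConfig: true})
}

func main() {
	layers := []layer{{ChainID: "chain-base"}, {ChainID: "chain-big"}}
	foo := splitV1IDs("image-foo", layers)
	bar := splitV1IDs("image-bar", layers)

	// The big layer now has the same v1 ID in both images.
	fmt.Println(foo[1].ID == bar[1].ID, len(foo))
}
```

With this split, two images that differ only in their config share every content layer and differ only in the small empty layer on top.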