distribution: separate layer and image config for v1 pushes #28

petrosagg · 2017-10-01T16:50:50Z

Rebase of balena-io-archive/docker#12

When content addressability was introduced in #17924, a compatibility layer for registry v1 pushes was added. When the engine is asked to push an image to a v1 registry it needs to create v1 IDs for the images.

The strategy so far has been to use the full imageID for the topmost v1 layer and the ChainID for all other layers, effectively creating as many v1 layers as there are in the image. Only the topmost layer contained the image configuration; the other layers had a dummy JSON containing only a parent reference.

This becomes problematic when the topmost layer of the image is big. Consider the following two Dockerfiles:

FROM busybox
RUN create_very_big_file
CMD /foo

FROM busybox
RUN create_very_big_file
CMD /bar

Both of these images will have the exact same layers, with the layer created by RUN create_very_big_file being the topmost one, but their imageIDs will differ since they have a different CMD and therefore different image configs.

When pushing to a v1 registry, the RUN create_very_big_file layer will be pushed twice, once with the v1 ID set to foo's imageID and once with the v1 ID set to bar's imageID. Also, any clients wanting to pull those images won't realise it's the same layer and will proceed to download it twice.
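The duplication described above can be reproduced with a small model. This is only a sketch, not the engine's code: `digest`, `chainIDs`, `imageID` and `oldV1IDs` are hypothetical helpers, and the imageID here is simplified to a hash of the config string plus the layer list. The chain ID derivation follows the usual scheme, where each ID depends on the layer's DiffID and the chain ID below it.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// digest is a hypothetical stand-in for a content-addressable ID.
func digest(s string) string {
	return fmt.Sprintf("%x", sha256.Sum256([]byte(s)))
}

// chainIDs returns one ID per layer, derived from the layer's DiffID
// and the chain ID of the layer below it.
func chainIDs(diffIDs []string) []string {
	ids := make([]string, len(diffIDs))
	for i, d := range diffIDs {
		if i == 0 {
			ids[i] = d
		} else {
			ids[i] = digest(ids[i-1] + " " + d)
		}
	}
	return ids
}

// imageID is a simplification (assumption): a hash over config and layers.
func imageID(config string, diffIDs []string) string {
	return digest(config + fmt.Sprint(diffIDs))
}

// oldV1IDs models the previous strategy: ChainIDs for every layer
// except the topmost one, which gets the imageID.
func oldV1IDs(config string, diffIDs []string) []string {
	ids := chainIDs(diffIDs)
	if len(ids) > 0 {
		ids[len(ids)-1] = imageID(config, diffIDs)
	}
	return ids
}

func main() {
	layers := []string{digest("busybox"), digest("very big file")}
	foo := oldV1IDs(`{"Cmd":"/foo"}`, layers)
	bar := oldV1IDs(`{"Cmd":"/bar"}`, layers)
	fmt.Println("busybox layer shared:", foo[0] == bar[0]) // true
	fmt.Println("big layer shared:", foo[1] == bar[1])     // false
}
```

In this model the big layer has identical contents in both images but ends up under two different v1 IDs, which is why it is uploaded and downloaded twice.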

This commit solves this problem by separating the layers from the image configuration information when pushing to a v1 registry. To do this, all layers of an image are pushed with their ChainIDs and a synthetic top-level layer is created with its contents set to the EmptyLayer, its config set to the image config, and its v1 ID set to the imageID. This will have the side effect of adding one layer.

To prevent new layers being piled on top of each other forever, the code checks if the topmost layer is already an empty layer and in that case it uses that for the image configuration.
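The new ID assignment, including the empty-layer check, might be sketched as follows. This is an illustrative model rather than the code in this commit: the `layer` and `v1Layer` types, `digest` and `newV1Layers` are hypothetical names, and it assumes that when the existing topmost layer is empty it is the one that receives the imageID and the config.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// layer is a hypothetical, simplified view of an image layer.
type layer struct {
	DiffID string
	Empty  bool // true if the contents equal the EmptyLayer
}

// v1Layer models what would be pushed to the v1 registry.
type v1Layer struct {
	ID        string
	HasConfig bool // false: dummy JSON with only a parent reference
}

// digest is a hypothetical stand-in for a content-addressable ID.
func digest(s string) string {
	return fmt.Sprintf("%x", sha256.Sum256([]byte(s)))
}

// newV1Layers gives every layer its ChainID and puts the image config
// on an empty top-level layer whose v1 ID is the imageID.
func newV1Layers(imageID string, layers []layer) []v1Layer {
	out := make([]v1Layer, 0, len(layers)+1)
	chain := ""
	for i, l := range layers {
		if i == 0 {
			chain = l.DiffID
		} else {
			chain = digest(chain + " " + l.DiffID)
		}
		out = append(out, v1Layer{ID: chain})
	}
	if n := len(layers); n > 0 && layers[n-1].Empty {
		// Topmost layer is already empty: reuse it for the config
		// instead of stacking another synthetic layer on top.
		out[n-1] = v1Layer{ID: imageID, HasConfig: true}
		return out
	}
	// Otherwise add the synthetic empty layer carrying the config.
	return append(out, v1Layer{ID: imageID, HasConfig: true})
}

func main() {
	big := []layer{{DiffID: digest("busybox")}, {DiffID: digest("very big file")}}
	foo := newV1Layers(digest("config /foo"), big)
	bar := newV1Layers(digest("config /bar"), big)
	fmt.Println(len(foo), foo[1].ID == bar[1].ID, foo[2].ID == bar[2].ID)
	// prints: 3 true false

	withEmpty := append(big, layer{DiffID: digest("empty"), Empty: true})
	fmt.Println(len(newV1Layers(digest("config"), withEmpty)))
	// prints: 3
}
```

In this model the big layer keeps the same v1 ID across both images, and only the small config-carrying layer differs. An image whose topmost layer is already empty does not grow on each push.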

Signed-off-by: Petros Angelatos <petrosagg@gmail.com>

zozo123

LGTM

petrosagg requested a review from zozo123 October 1, 2017 16:50

zozo123 approved these changes Oct 1, 2017

petrosagg merged commit 970c454 into 17.06-resin Oct 1, 2017

petrosagg deleted the fix-registry-v1-push branch October 1, 2017 18:44
