Layer relations/parent layer clarification #190

sgotti · 2016-08-02T12:29:41Z

Perhaps this is to some extent related to #39 (which was closed; #102 talks about something different), but, just to dispel any doubt, I'd like to be sure that there's no fixed relation between OCI image layers, i.e. that a layer (if not a base one) is not forced to always have the same parent layer.

To be clear, does the spec permit defining two images where the same upper layer has a different bottom layer in the chain?

Image A manifest:

{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.manifest.v1+json",
  "config": ...,
  "layers": [
    {
      "mediaType": "application/vnd.oci.image.serialization.rootfs.tar.gzip",
      "size": 32654,
      "digest": "sha256:e692418e4cbaf90ca69d05a66403747baa33ee08806650b51fab815ad7fc331f"
    },
    {
      "mediaType": "application/vnd.oci.image.serialization.rootfs.tar.gzip",
      "size": 73109,
      "digest": "sha256:ec4b8955958665577945c89419d1af06b5f7636b4ac3da7f12184802ad867736"
    }
  ]
}

Image B manifest:

{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.manifest.v1+json",
  "config": ...,
  "layers": [
    {
      "mediaType": "application/vnd.oci.image.serialization.rootfs.tar.gzip",
      "size": 16724,
      "digest": "sha256:3c3a4604a545cdc127456d94e421cd355bca5b528f4a9c1905b15da2eb4a4c6b"
    },
    {
      "mediaType": "application/vnd.oci.image.serialization.rootfs.tar.gzip",
      "size": 73109,
      "digest": "sha256:ec4b8955958665577945c89419d1af06b5f7636b4ac3da7f12184802ad867736"
    }
  ]
}

(For simplicity I'm skipping the image configs, since I hope the manifests will be enough.)
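To make the question concrete, here is a minimal sketch that checks which layers the two manifests above share and which layers sit directly beneath a given layer. The helper names (`layer_digests`, `shared_layers`, `parents_of`) are hypothetical and not part of any OCI tooling, and the manifests are trimmed to the fields the sketch needs.

```python
BASE_A = "sha256:e692418e4cbaf90ca69d05a66403747baa33ee08806650b51fab815ad7fc331f"
BASE_B = "sha256:3c3a4604a545cdc127456d94e421cd355bca5b528f4a9c1905b15da2eb4a4c6b"
TOP = "sha256:ec4b8955958665577945c89419d1af06b5f7636b4ac3da7f12184802ad867736"

# Trimmed copies of the two manifests; the elided "config" entries are omitted.
IMAGE_A = {"schemaVersion": 2, "layers": [{"digest": BASE_A}, {"digest": TOP}]}
IMAGE_B = {"schemaVersion": 2, "layers": [{"digest": BASE_B}, {"digest": TOP}]}


def layer_digests(manifest):
    """Return the layer digests of a manifest, bottom layer first."""
    return [layer["digest"] for layer in manifest["layers"]]


def shared_layers(a, b):
    """Digests present in both manifests, in the order they occur in `a`."""
    in_b = set(layer_digests(b))
    return [d for d in layer_digests(a) if d in in_b]


def parents_of(digest, manifests):
    """Set of digests found directly beneath `digest` (None for a base layer)."""
    parents = set()
    for manifest in manifests:
        digests = layer_digests(manifest)
        for i, d in enumerate(digests):
            if d == digest:
                parents.add(digests[i - 1] if i > 0 else None)
    return parents


print(shared_layers(IMAGE_A, IMAGE_B))
print(parents_of(TOP, [IMAGE_A, IMAGE_B]))
```

At the manifest level nothing ties the shared top layer to a single parent: the same digest simply appears above two different base digests.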

For example, someone might want to add an application on top of a set of common libraries (or just a base Linux distribution image) and then upgrade the base libraries without rebuilding the "application" layer.

I'm asking this because the OCI image spec uses the Docker v2.2 distribution manifest as its starting point, and also because of:

- reading this, which talks about a layer having just one parent;
- thinking about the Docker workflow, where you build an image (if not a base one) starting from a parent image;
- looking at the history section of the image serialization config object (https://github.com/opencontainers/image-spec/blob/master/serialization.md#image-json-description mediaType: application/vnd.oci.image.serialization.config.v1+json);
- thinking about how the Docker graph works (but I may be wrong).

My opinion is that there's no reason for such a constraint, and that nothing in the spec says that a layer has a relation with another layer or that a layer (if not a base one) will always have the same parent layer.

philips · 2016-08-10T17:00:46Z

I see no reason why this wouldn't be allowed.

philips · 2016-08-10T17:00:56Z

cc @opencontainers/image-spec-maintainers

stevvooe · 2016-08-10T20:59:28Z

This is more of a build time property than something we want to build into the specification. However, this requirement is clearly called out with the existence of a DiffID and ChainID. In fact, there are a number of security issues with treating a layer independently from the parent. Without these, your container system will be open to a number of exploits that can allow the injection of malicious code.

From a manifest perspective, this doesn't matter. The manifest just describes resources and provides a rough ordering for ideal resource fetching. The client shouldn't need to know anything about how they are assembled or even what the resources really are. It should just fetch the resources and dispatch them to a handler. It can be said that manifests are agnostic to the format.

The image configuration tells the story about how to assemble these into something usable. That includes the relationships between layers. If you want to switch out a layer, you'll have to fix up these identifiers to comply with these relationships such that they can be verified.
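As a rough illustration of why the identifiers need fixing up, the sketch below computes ChainIDs using the recursive definition in the image spec's configuration document: `ChainID(L0) = DiffID(L0)` and `ChainID(L0|...|Ln) = Digest(ChainID(L0|...|Ln-1) + " " + DiffID(Ln))`. The DiffIDs here are hypothetical: they are digests of stand-in byte strings, not of real uncompressed layer tars, and the function names are invented for this example.

```python
import hashlib


def digest(data: bytes) -> str:
    """SHA-256 digest in the 'sha256:<hex>' form used for content addresses."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def chain_ids(diff_ids):
    """Return the ChainID of every prefix of the layer stack, bottom first."""
    chain = []
    for diff_id in diff_ids:
        if not chain:
            # The ChainID of a single base layer is its DiffID.
            chain.append(diff_id)
        else:
            # Each further ChainID folds in the ChainID of everything beneath it.
            chain.append(digest(f"{chain[-1]} {diff_id}".encode("ascii")))
    return chain


# Hypothetical DiffIDs (assumption: stand-in content, not real layers).
base_a = digest(b"base layer A")
base_b = digest(b"base layer B")
app = digest(b"application layer")

chain_a = chain_ids([base_a, app])
chain_b = chain_ids([base_b, app])

print(chain_a[-1] == chain_b[-1])  # same application DiffID, different ChainID
```

The application layer keeps the same DiffID in both stacks, but its ChainID changes with the base, so a tool that swaps the base has to recompute these values for the result to verify.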

From the perspective of the specification, these are really two different images, which happen to share common resources. If you read through Creating an image filesystem changeset, you'll see why this interdependency is important. If the new layer wasn't built considering the resources in the old layer, it is easy to unintentionally expose extra or malicious data. In practice, there isn't a lot of expense to this, but it will result in a new layer.

From a high level, layers aren't the right place to share common resources for this style of application. That is not to say that containers built on the same layer can't share that layer. This is more to say that this style of composition needs to happen at the container runtime, where these relationships can be expressed through naming, rather than content address. Shoehorning this functionality in at this level is just going to lead to broken stuff. There are simply too many things that can go wrong when you switch out the base layer without rebuilding and testing the application layer.

That said, this style of composition fits in very well at the build level, where these components can be assembled, packaged and verified together, resulting in an immutable artifact. At that stage, names can be used to reference changing artifacts that reflect actual build-time dependencies. By forcing that to be done at build time, you centralize the update, leading to a more secure, more reliable assembly that relies on existing packaging systems that already solve these dependency problems for us.

sgotti · 2016-08-23T16:23:42Z

@stevvooe thanks for your detailed answer!

I can agree that this is not the correct way to do this, but I was just trying to find a simple example to explain the question 😄

> This is more of a build time property than something we want to build into the specification. However, this requirement is clearly called out with the existence of a DiffID and ChainID.

> If you want to switch out a layer, you'll have to fix up these identifiers to comply with these relationships such that they can be verified.

So, let's say that someone wants (ignoring all the warnings) to follow this road and create a build tool that generates the correct DiffID and ChainID (is this possible or am I missing something?). Since the spec doesn't block this, an OCI image implementation (local store, registry) should also handle this case (a layer with different parent layers).
But this, currently, if I'm not wrong, will cause issues in the docker graph drivers and on some registries that are assuming that a layer can have only the same parent layer.

I'm not sure where the line is. Is it the implementation that isn't image-spec compliant, or is the image spec unclear about how an implementation should manage layers?

BTW, the appc spec has the concept of dependencies between images (it doesn't have the layer concept), and, for the same top image, its dependencies may change (if not pinned by digest) since discovery is used to locate them. And, in the end, these images are rendered on disk (in a slightly more complex way, since it's a DAG and not just a chain) by extracting the images in the DAG in the correct order and applying whiteouts (PathWhiteList in the appc case) on them.

stevvooe · 2016-08-24T19:56:10Z

> So, let's say that someone wants (ignoring all the warnings) to follow this road and create a build tool that generates the correct DiffID and ChainID (is this possible or am I missing something?).

I'm considering this to be a "build time" operation.

From the perspective of the OCI specification, the result of this modification would be a separate image.

> But this, currently, if I'm not wrong, will cause issues in the docker graph drivers and on some registries that are assuming that a layer can have only the same parent layer.

No. I'm not sure if I'm making my point accurately. The issue is that a layer may have opaque files that mean nothing when applied to an arbitrary parent.
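A minimal sketch of that point, assuming a heavily simplified model where a filesystem and a changeset are both plain dictionaries of path to content, and a whiteout is an empty entry whose basename carries the `.wh.` prefix described in the layer spec. Opaque whiteouts (`.wh..wh..opq`) and real tar handling are not modeled, and `apply_layer` is a hypothetical helper.

```python
import posixpath

WHITEOUT_PREFIX = ".wh."


def apply_layer(rootfs, layer):
    """Apply a simplified changeset on top of `rootfs` and return the result."""
    result = dict(rootfs)
    # First pass: whiteouts delete paths that exist in the layers below.
    for path in layer:
        directory, name = posixpath.split(path)
        if name.startswith(WHITEOUT_PREFIX):
            target = posixpath.join(directory, name[len(WHITEOUT_PREFIX):])
            for existing in list(result):
                if existing == target or existing.startswith(target + "/"):
                    del result[existing]
    # Second pass: regular entries are added or replace what is below.
    for path, content in layer.items():
        if not posixpath.basename(path).startswith(WHITEOUT_PREFIX):
            result[path] = content
    return result


# Hypothetical layers: the application layer was built against base_a and
# deletes a file that only exists there.
base_a = {"etc/build-secret": "token", "usr/lib/libfoo.so": "v1"}
base_b = {"etc/credentials": "token", "usr/lib/libfoo.so": "v2"}
app = {"etc/.wh.build-secret": "", "opt/app/bin": "app"}

on_a = apply_layer(base_a, app)
on_b = apply_layer(base_b, app)

print(sorted(on_a))
print(sorted(on_b))
```

On `base_a` the whiteout removes the file it was written for. On `base_b` the same whiteout matches nothing, and the differently named sensitive file stays visible, which is the kind of unintended exposure a rebuild against the new base would catch.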

> BTW, the appc spec has the concept of dependencies between images (it doesn't have the layer concept), and, for the same top image, its dependencies may change (if not pinned by digest) since discovery is used to locate them. And, in the end, these images are rendered on disk (in a slightly more complex way, since it's a DAG and not just a chain) by extracting the images in the DAG in the correct order and applying whiteouts (PathWhiteList in the appc case) on them.

Yes, and this is effectively the same feature in docker and OCI. The difference is that we only point at the parent layer, not the image. The layer is just a tar file and the image is the configuration+layer parent chain.

vbatts · 2016-08-30T14:10:55Z

There ought to be no issues with pointing to an image rather than later,
but the reconciliation of the configuration of the parent (ignore?, merge?,
Something else?)

philips · 2016-08-30T16:40:03Z

@vbatts "rather than later". Having a hard time parsing your response.

wking · 2016-08-30T17:40:52Z

On Tue, Aug 30, 2016 at 09:40:09AM -0700, Brandon Philips wrote:

> @vbatts "rather than later". Having a hard time parsing your response.

I'm pretty sure he meant “rather than a layer”.

vbatts · 2016-08-30T19:30:27Z

s/later/layer/

stevvooe · 2016-08-30T20:52:31Z

@vbatts Theoretically, I agree. Could you show an example?

vbatts · 2016-09-07T19:03:08Z

trivial example, but referencing object sha256:702ad90f705365227e902b42d91dd1a40e48ca7f67a2f4b2fd052aaa4295cd95, which is provided by https://storage.googleapis.com/golang/go1.7.linux-amd64.tar.gz
By having a child layer be this reference, after applying the above archive, there is now /go/... in the resulting filesystem.

So an application/vnd.oci.image.manifest.v1+json object could look like:

{
    "annotations": null,
    "config": {
        "digest": "sha256:2b8fd9751c4c0f5dd266fcae00707e67a2545ef34f9a29354585f93dac906749",
        "mediaType": "application/vnd.oci.image.serialization.config.v1+json",
        "size": 1459
    },
    "layers": [
        {
            "digest": "sha256:702ad90f705365227e902b42d91dd1a40e48ca7f67a2f4b2fd052aaa4295cd95",
            "mediaType": "application/vnd.oci.image.layer.tar+gzip",
            "size": 81573766
        },
        {
            "digest": "sha256:8ddc19f16526912237dd8af81971d5e4dd0587907234be2b83e249518d5b673f",
            "mediaType": "application/vnd.oci.image.layer.tar+gzip",
            "size": 667590
        }
    ],
    "mediaType": "application/vnd.oci.image.manifest.v1+json",
    "schemaVersion": 2
}

vbatts · 2016-09-07T19:04:56Z

(or perhaps with application/tar+gzip mimetype, but it could be applied as application/vnd.oci.image.layer.tar+gzip)

stevvooe · 2016-09-07T19:42:36Z

@vbatts Wouldn't the example call for an application/vnd.oci.image.manifest.v1+json as one of the layers? The specification should already handle the case that you are talking about.

vbatts · 2016-09-07T19:52:14Z

@stevvooe i don't follow why a manifest would be one of the layers. Elaborate?

wking · 2016-09-07T20:04:15Z

On Wed, Sep 07, 2016 at 12:52:15PM -0700, Vincent Batts wrote:

> @stevvooe i don't follow why a manifest would be one of the
> layers. Elaborate?

My reading of 1 was that you were suggesting:

"layers": [
    {
        "digest": "sha256:abc…",
        "mediaType": "application/vnd.oci.image.manifest.v1+json",
        …
    },
    {
        "digest": "sha256:def…",
        "mediaType": "application/vnd.oci.image.layer.tar+gzip",
        …
    }
],

which would have the same effect as a manifest which replaced the
sha256:abc… layer with all the layers contained in the sha256:abc…
manifest (recursively if that manifest in turn referenced other
manifests).

You could also require image-authors to flatten the layers array out
and not allow application/vnd.oci.image.manifest.v1+json layer
entries, but that makes “I just want to stick something small on top
of the image you already trust” less obvious. Still, putting a
reference to sha256:abc… in annotations would accomplish the same
goal, and keep the layers spec simpler, so I don't feel strongly
either way.
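The flattening described above could be sketched roughly as follows. This is only an illustration of the idea under discussion, not something the spec defines: manifest entries in `layers` are hypothetical, `flatten_layers` is an invented helper, the blob store is a plain dictionary, and the digests are readable placeholders rather than real SHA-256 values.

```python
MANIFEST_TYPE = "application/vnd.oci.image.manifest.v1+json"
LAYER_TYPE = "application/vnd.oci.image.layer.tar+gzip"


def flatten_layers(manifest, blobs, _seen=()):
    """Expand manifest-typed entries in `layers` into the layers they reference."""
    flat = []
    for entry in manifest["layers"]:
        if entry["mediaType"] == MANIFEST_TYPE:
            digest = entry["digest"]
            # Defensive guard; content addressing should make cycles impractical.
            if digest in _seen:
                raise ValueError(f"manifest cycle at {digest}")
            flat.extend(flatten_layers(blobs[digest], blobs, _seen + (digest,)))
        else:
            flat.append(entry)
    return flat


# Hypothetical blob store: placeholder digest -> parsed manifest.
blobs = {
    "sha256:trusted-manifest": {
        "layers": [
            {"digest": "sha256:os-layer", "mediaType": LAYER_TYPE},
            {"digest": "sha256:libs-layer", "mediaType": LAYER_TYPE},
        ]
    }
}

# A manifest that sticks one small layer on top of the trusted image.
top = {
    "layers": [
        {"digest": "sha256:trusted-manifest", "mediaType": MANIFEST_TYPE},
        {"digest": "sha256:small-layer", "mediaType": LAYER_TYPE},
    ]
}

flat = [entry["digest"] for entry in flatten_layers(top, blobs)]
print(flat)
```

The result is the same ordered layer list an image author would have written out by hand, which is why requiring a flat `layers` array loses no expressiveness.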

stevvooe · 2016-09-07T21:36:02Z

@vbatts I'm not suggesting that, but that seemed to be the request here. My point, under that premise, is that it is odd to place an image in the layers and this should really be a build time fixup.

vbatts · 2016-09-13T19:54:17Z

Oh I see now. Yeah. Referencing an object that is a manifest with its own objects too. That seems like a valuable use-case.

But how would that be a build time fixup tho?

stevvooe · 2016-09-13T20:27:33Z

@vbatts My point is that you can't just swap these without fixing up the chain ids and diff ids to correlate. These are generally build time concerns.

Effectively, I'm saying this is already supported without referencing an image manifest as a layer. Adding this will just create another way to do the same thing without providing much value.

vbatts · 2016-10-06T17:38:34Z

@sgotti have we confirmed that this is a non-issue?

stevvooe · 2016-10-19T18:35:50Z

Closing after two weeks with no activity. Please re-open if there is more to discuss.

sgotti mentioned this issue Aug 5, 2016

rkt image rm doesn't remove the rendered image in treestore rkt/rkt#2890

Open

stevvooe closed this as completed Oct 19, 2016

wking mentioned this issue Oct 28, 2016

File Lineage Support Layered Media #424

Closed

wking mentioned this issue Dec 22, 2016

combinational image #508

Closed

xiekeyang mentioned this issue Jun 16, 2017

Allow layer changeset applied without running container #698

Closed
