-
Notifications
You must be signed in to change notification settings - Fork 670
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Define the data field #826
Conversation
Can you elaborate on why we're saying this should have a specific definition, such as an embedded reference of the content? It sounds like you're suggesting it should be duplicative of the blob content. As opposed to it being whatever value an author decides, related to the type of content ( I'd also suggest we need more guidance around how content authors should view layers/blobs, config and manifest values. |
Yes, it should match exactly the content being described by the descriptor.
Right, this is the purpose of the In theory, you could just put this in an annotation, but we want to be able to use this anywhere a descriptor is used, and we want to have strong semantics around how this should be interpreted.
I will add update the PR to make it clear that this is not arbitrary data, but precisely the data being referenced by the descriptor. |
I don't see the value of duplicating the content referenced in the blob. It sounds like you're trying to short circuit the lookups of blobs. While I agree, we shouldn't force clients to make too many round trips, I don't see how duplicating the content is a valid requirement. |
This isn't a requirement, it's an OPTIONAL field that artifact creators can use to avoid roundtrips for small content.
We don't necessarily need to duplicate the content. If non-distributable layers had been implemented in a reusable way (see #791 (comment) and #791 (comment)), we could take advantage of that functionality to only embed the data and never upload it as a blob. This is a separate problem that I'd like to fix eventually.
This is exactly what I'm proposing, so I don't understand the objection. Can you point to the specific language you're objecting to? |
Feedback on the call was portability concerns for registries that can't or won't store manifests over a certain size. I've added some language around that. |
specs-go/v1/descriptor.go
Outdated
@@ -35,6 +35,9 @@ type Descriptor struct { | |||
// Annotations contains arbitrary metadata relating to the targeted content. | |||
Annotations map[string]string `json:"annotations,omitempty"` | |||
|
|||
// Data is an embedding of the targeted content. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might want a little bit more here about how its encoded (automatically through json package, etc.) in base64. A fetch can be avoided if the data field is present.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pushed another commit with more godoc, please take another look.
I've been looking for a way to optimize the usage of empty config fields, right now I have to push zero-byte blobs to make it work. This improvement would fit the bill for me, I don't mind encoding the field to mean a zero-byte config with the corresponding base64-encoded zero-byte blob in the data field. However, this made me wonder: what's the proper encoding for zero bytes in base64? Apparently it's an empty string, if this old answer from stack overflow is to be believed. Will an empty string in the data field potentially cause issues with some implementations? Should the spec clearly state that it should be supported? |
Strong +1 as well. This would improve the performance of github.com/sigstore/cosign considerably! |
Great question.
We may just want to add some language to distribution-spec or image-spec for this special case? It seems like clients should just not upload anything that is zero bytes? And registries shouldn't expect anything that's zero bytes? Or we could have special considerations for registries that |
@jonjohnsonjr I don't think we want to add too much if/else checks in everybody's implementation unless they want to, such that you can have a correct implementation without necessarily having to implement all of the optimizations that can be done. Rather than use special values for the digest field, let's just leave it exactly like it is: the digest of the data. The sha256 sum of zero bytes is "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855", which should be enough to tell you that it's empty if you check for this constant, but there's still the size field that will say 0 bytes anyway, and can serve as a much more reliable indicator that you have a zero-byte blob. I say your proposal for the data field is sufficient for the improvement I am looking for, as long as implementations correctly accept empty blobs. They already do when you push a blob with digest sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 anyway, we just need to make sure they accept it through the data field with the base64-encoded zero-byte blob that would result in an empty string (non-null, but empty). |
Is there some specific language that you think would help with this? When fetching, clients should be able to discern the difference a nil Do we want to say something like this?
|
@jonjohnsonjr for the sake of consistency, don't necessarily add support for nil data, but really just specify that a zero-length data field MUST be supported, and that base64 encoding of zero bytes is an empty string. Just so people think about not adding a check to consider an empty (non-null, but empty) as an error condition when it is not. If you can push zero-byte blobs, you should be able to "push" it inside the data field as well by correctly encoding it as a normal blob, except this blob is empty. |
@jonjohnsonjr just to clarify, the way I understand it, if you pass the data field that corresponds to a zero-byte blob (empty string in base64 encoding) along with size 0 and the correct digest, then you don't exactly skip pushing the blob, you only did it in a much more efficient way, and that's really the only thing that matters here. |
I see, this is interesting. We could consider a push with inline data to be a valid "blob" upload. Registries could allow you to later fetch that content directly as a blob, or they could assume clients would never request that content because it was only ever uploaded in an embedded form. |
@jonjohnsonjr hehe, who would have guessed that this changed was anything but innocent? I'm glad I explained my understanding of what it meant more in depth, because clearly we didn't see it the same way. I saw it as an "inline blob push", which would give exactly the same result as pushing the blobs out-of-band, but it would make the push much faster. Clients could easily skip pulling zero-byte blobs, but nothing would prevent them from pulling them. Now the real question is: if you push a blob inline, should it remain encoded inline in the manifest after? I'm still new to the specs, but my understanding is that the manifests is the only mutable part of the registry: blobs are immutable by definition because they are content addressed (the name is the digest, and the content needs to match the digest, so immutable). In current implementations, are registry server obligated to send back exactly the same manifest that was pushed by a client? By this I mean: if you push your JSON manifest encoded with tabs, can the registry server re-encode it using spaces, or make any modifications that shouldn't affect the meaning of the manifest contents? Since there's no hash for the manifest, in theory it would be possible. Now I wonder if an inline push should result in storing a manifest in which the inline blobs are still there or removed? In order words: inline push, out-of-band pulls? Or inline push = inline pull? This may create some precedent for a small amount of data duplication if there are no strict rules to apply here. |
@jonjohnsonjr @SteveLasker so after last night's twitter discussion, I now understand that manifests are hashed and immutable, where the tags are the only mutable part of the system. This means that a manifest with inline blobs needs to be stored as-is, and cannot be modified by the server. Does this mean that some clients may decide to make a large image index manifest with all of the corresponding image manifests embedded into it? I'm fine with using larger manifests as it definitely saves a lot of extra network calls (less latency), but I just thought I should point it out. Is there a size limit for manifests, and if so, how would clients know about it and handle it gracefully? |
Yup! How large that manifest can be is basically a registry decision. Most registries already have some kind of limit here, but the exact size would be up the implementation to decide. |
There's not a specific size limit, but most registries will reject requests over a certain size. Luckily for us, this is not a new problem. You will run into this today if you try to push an image with a huge number of layers or with a lot of annotations. |
@dlorenc @jonjohnsonjr is there a part of the spec that would cover configuration discovery for a registry? something like .well-known/oci-registry-configuration that returns a JSON with all sorts of details about the registry, like the maximum manifest size, for instance. As there are more and more extensions that could be supported by a registry, how will clients correctly discover what can be used with registry XYZ? |
I don't believe so. The closest we've gotten to adding something like this is with a proposal for extensions: opencontainers/distribution-spec#111 There has also been a lot of flexibility in implementation historically, so I don't think it's even possible to know all the kinds of limitations that exist. They tend to be based on storage decisions and can change over time... This would be something interesting to document somehow, maybe as part of the distribution conformance tests? |
Having a way to include some data in the manifest can definitely help. I think it's fair to say all large registries cache manifests, as users tend to ping registries to death, asking for changes to a tag, to know if they should pull again. I'm not saying it's a good pattern, rather the reality of what users do. To this end, if we don't cap this, in the spec, users will put what they consider "small content" in the manifest. But, what defines small? (256 bytes, 1k, 1mb, 10mb, 1gb). The point of a spec is to drive consistency across implementations, so users can promote content within and across registries. Not specifying a specific value for something we know will "get abused" is a guardrail I'd ask us to qualify, so our users have consistent results. It would help to identify that canonical use cases, and come to an agreement for what defines small for this version of the spec.
This is the one I'm not clear on. Are you saying the The questions from @awakecoding surface the potential confusion on how this could be used. This |
Docker Hub's rate limits may have helped fix tooling that was doing too many pulls on the manifests. Users can still do lots of polling on the registry to see if anything has changed, but they should do that with a head request instead of a get. Which hopefully means a cache could be made that only tracks the current digest rather than the full manifest. |
@SteveLasker @sudo-bmitch HEAD requests on the manifests are obviously the best way to check if locally cached manifests match the remote manifest and avoid pulling the contents when not needed. I noticed that both the HTTP Etag and Docker-Content-Digest headers are currently set to the digest of the manifest, following the same digest format used for the blobs: https://github.com/opencontainers/distribution-spec/blob/main/spec.md#pulling-manifests I was wondering if there are plans for using a header that doesn't include Docker in the name for the OCI spec? Etag are great, but they have one major limitation: the value is opaque. Even if you decide to use a digest as the Etag value, there's no guarantee that all implementations will use it that way. In another project I have define my own Etag variant using a multibase-encoded multihash as the value. We don't need to use the same, the current Digest string format used for the blobs is good enough. I assume Docker-Content-Digest is good for that, except maybe it still refers too much to Docker in the name? If we can somehow define that the Etag-like header always contains a digest-based value, then you can always reconstruct the value by rehashing the cached manifest file. This avoids the need for storing associative Etag values that are not digest-based and heavily simplifies the implementation complexity. It also makes it easier to store the manifests in the blob storage mechanism. |
As described, this would help simplify and improve performance of several tools I maintain, including http://github.com/tektoncd/chains and https://github.com/sigstore/cosign. Big +1 from me! |
@SteveLasker just to clarify, I do not object to the current proposal, and we should take the Etag discussion to another thread. My only remaining concern is how to deal nicely with manifest size limits, but again, we can also start a new discussion after this one. Is there anything that prevents merging the current proposal as-is at this point? Sorry for hijacking the thread a little bit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reiterating my LGTM here.
While I do understand the concerns, most seem to be at the registry-level. We can definitely recommend size limits on the manifest as a whole in distribution-spec, which should provide back pressure on the use of the data
field.
There might be uses that we haven't thought through. For example, this enables packing an entire image up in a single json blob, typically reserved for the image layout. Whether that is a "good" idea or not isn't that important but the concept of data
on a descriptor internally sound.
As far as direct immediate benefit is concerns, we can eliminate a round trip to get the config immediately by embedding it in the manifest (https://github.com/opencontainers/image-spec/blob/main/specs-go/v1/manifest.go#L25).
@cyphar PTAL 😄 |
I'm mulling on this. Really. Since this is core descriptor definition is pulled in across image config, manifest, and index, then it means that potentially:
This is a cascading effect, which could be argued as a benefit or a drawback. It's duplicating data stored on the registry for the sake of reducing round trips (which arguably http/2 should reduce to minimal concern). The drawback on a registry just setting a size limit would likely fail in an ugly way. And what if a registry just silently dropped this |
I'd consider it a benefit (@sudo-bmitch needs this, IIUC).
We're already duplicating tons of data that's stored in the registry because we're using tarballs for layers. I'll also say it's not really our place to say whether or not folks should be duplicating data. If there's a cost to the registry to store slightly larger manifests, the registry can pass that cost onto their customers.
The benefits aren't constricted to just http -- this can reduce roundtrips to disk as well. What aspect of http2 helps here? Server push? I can kind of see that, but relying on that makes registries a little less dumb. Also, given that registries usually redirect to blob storage, I don't think http2 can do much :/
I think this is a distribution-spec concern that we can solve in opencontainers/distribution-spec#293
I don't think this is a real concern given that clients expect to be able to pull manifests by digest. This kind of thing could only happen if a registry isn't storing/serving the raw manifest bytes that were uploaded, which would immediately blow up the first time a client used different whitespace/indentation/sorting from the registry. |
I think this can be an advantage if we start creating deeply nested small objects. E.g. we may submit a request for all signatures on an image, which returns a list of descriptors. That list could contain the referenced manifest, and even the embedded signature if it's small enough, turning 3 round trips into 1. When it turns into a drawback, we have the fall back option of not including the data field when we build and push the parent objects. If you have a multi-platform set of manifests, each embedding the config as data along with other fields, then the index manifest could avoid embedding the data field since it may make that index too large.
I think failing the manifest push if it's over that registry's size limit is the best of the bad options, and motivation to clients to avoid abusing the field. It's also a problem we'll have even without standardizing this field (it could even be worse). Modifying the manifest in a way that changes the digest should never be allowed by the registry. (Do we do that anywhere other than falling back from schema2 to schema1 for ancient docker clients?) Clients are also motivated to avoid abusing this since it's faster to pull a separate blob beyond a threshold. The base64 encoding adds overhead (roughly 33%) so you need to be sure that bandwidth overhead is less than the sum of the normal the round trip latency overhead plus http request/response bandwidth overhead. For a similar reason, if I did this for multi-platform manifest, I'd probably only do it for the linux/amd64 descriptor since the others are pulled so rarely those data fields would be unneeded overhead 95% of the time. |
On 20/08/21 14:23 -0700, jonjohnsonjr wrote:
> This is a cascading effect, which could be argued as a benefit or a drawback.
I'd consider it a benefit ***@***.*** needs this, IIUC).
Fair, but imagine the workflow of pushing many small chunks of a file
system. As if I used restic or casync to chunk up a rootfs.
No _one_ of these hits the limit, for each descriptor, but now the image
manifest is enormous.
It's clever, but not tunable/discoverable by clients. They would just
have to go through the whole push of objects, then get a fail at the end
as that rolls up. Then the server has stuff to cleanup.
It would be simpler to begin with to isolate a place that the descriptor data structure is used, that is okay for the `data` field to be allowed.
Just like how the image-index takes the base definition of a descriptor
and extends it.
(Note to future-self: the descriptor really seems to be more of a
distribution-spec property than _image_ specific 🤷)
We're already duplicating _tons_ of data that's stored in the registry because we're using tarballs for layers. I'll also say it's not really our place to say whether or not folks should be duplicating data. If there's a cost to the registry to store slightly larger manifests, the registry can pass that cost onto their customers.
The human in me cringes at just passing the cost on to customers as a
spec definition.
What aspect of http2 helps here? Server push? I can kind of see that, but relying on that makes registries a little less dumb. Also, given that registries usually redirect to blob storage, I don't think http2 can do much :/
I was thinking of the active socket, without the renegotiation time of
multiple round-trips. Would need someone more hands-on that area than
myself to confirm that thought/hope.
> And what if a registry just silently dropped this data field from each digest? that would result in a wholly different digest for the resulting container. 😬
I don't think this is a real concern given that clients expect to be able to pull manifests by digest. This kind of thing could only happen if a registry isn't storing/serving the raw manifest bytes that were uploaded, which would immediately blow up the first time a client used different whitespace/indentation/sorting from the registry.
but signatures?
|
I would expect a client that is sophisticated enough to do this would realize that it doesn't make sense to arbitrarily inline chunks. I could imagine inlining e.g. the casync chunk index, but all the chunks?
What do you mean? I guess I can imagine a really naive implementation to just inline anything under some threshold, but I don't see that happening, at least not in widely used clients. As long as clients understand how to read this field, I'm happy. I would expect the inlining of data to be a very intentional thing.
I don't think that's a great comparison. Platforms only make sense if the targeted content is platform-specific, e.g. in an image index. The We would eventually want all clients to understand how the
I'm not suggesting we define this in the spec, but that if registries have problems with supporting this, they can discourage it via errors and/or capitalism.
Not if you consider the image layout.
I'm not sure what you mean here. I think we might be talking past each other, so forgive me for being super explicit: Everything that a client uploads to a registry is content-addressable. If a registry were to change anything about a blob or manifest that a client uploads, that registry is broken. I should get back from the registry exactly what I uploaded to it. I don't see any opportunity for a modern registry to silently drop a field without modern clients throwing a fit. There are two notable exceptions to this, which I think everyone regrets in hindsight:
|
haha this is feeling more like a mailing-list correspondance
On 23/08/21 12:48 -0700, jonjohnsonjr wrote:
I would expect a client that is sophisticated enough to do this would realize that it doesn't make sense to arbitrarily inline chunks. I could imagine inlining e.g. the casync chunk index, but all the chunks?
Again, fair, but this is the kind of guidance and SHOULD to help sculpt
sane implementations.
What do you mean? I guess I can imagine a really naive implementation to just inline anything under some threshold, but I don't see that happening, at least not in widely used clients. As long as clients understand how to _read_ this field, I'm happy. I would expect the inlining of data to be a very intentional thing.
Discoverable, as in "does this registry support _pushing_ inlined data?"
or "what's this registry's size limit?"
Like the feature flag discussion, maybe.
I'm not suggesting we define this in the spec, but that if registries have problems with supporting this, they can discourage it via errors and/or capitalism.
ETOOCAPITALISM, i'm game
Not if you consider the [image layout](https://github.com/opencontainers/image-spec/blob/main/image-layout.md).
right fair. I mean that it's kind of a generic structure reused across
OCI specs, not exclusive to image-spec.
Everything that a client uploads to a registry is content-addressable. If a registry were to change anything about a blob or manifest that a client uploads, that registry is broken. I should get back from the registry _exactly_ what I uploaded to it. I don't see any opportunity for a modern registry to silently drop a field without modern clients throwing a fit.
right. Good.
in agreement there.
|
@jonboulle @cyphar any strong feelings here? |
to summarize my current gut feeling on this, since last discussion splintered into hairs: I'm not convinced that this inlined data actually buys us anything |
Is there anything concrete that would convince you? Since this is already effectively widely supported by production registries (#826 (comment)), we should be able to demonstrate that a client that takes advantage of If the concern is what impact accepting/storing larger manifests would have on registries, I don't think we can assess that broadly from outside those implementations. But any concerns that might be raised from that would probably be addressed by opencontainers/distribution-spec#260, right? If myregistry.biz doesn't want to accept manifests over X MB, I can enforce that today, and should. |
The concerns of stability across registries is the primary concern as captured above |
@jonjohnsonjr we may want to tag this as a hairball and close it for now 😬 |
I think it's mostly blocked on active maintainers. |
I was asked my opinion a few times, but only just got around to reading this whole thread (sorry, it's very long). Conceptually I have no issue with making it possible to inline the contents of a descriptor, with the caveats that:
But I will put forward a tentative LGTM, though I could be convinced either way. I personally think (and I suspect this is going to be a somewhat inflammatory comment) that we seem to be giving a bit too much consideration to distribution-spec implementations when discussing image-spec changes. Our job is to define the specification, and while input from registries can be very informative and we should take it on board when making decisions, I don't think "well, registries currently can't handle XYZ so we should come up with workaround ABC which has downsides DEF on our side which mean XYZ really is better, but that's the cost of doing business because we simply cannot inconvenience the registries" is a reasonable way of writing and maintaining a specification. If a proposed feature is solving a clear problem, then the fact that registries may be somewhat inconvenienced is not by itself an argument to reject the proposal. The examples given with registry incompatibility certainly would frustrate users (and I agree that it's not ideal) but this is a potential incompatibility created by the registries -- nowhere in the image-spec have we sanctioned arbitrary size limits and thus if registries want to add them, they need to make sure they aren't disrupting their own customers' usage of their registry. If a registry was created tomorrow which only allowed you to upload images that have a power-of-two size, we would not consider adjusting the image-spec to only produce power-of-two blobs to avoid incompatibility. Likewise, there are several other changes we may wish to make to the manifest in the future (restic-like chunking, adding a |
thanks for that aleksa |
I wanted to add a note from the ACR side since it was brought up on the call. This PR reduces the number of HTTP requests. Regarding the caching point that @vbatts mentioned, the team has been addressing this in anticipation of this PR. Regarding the conformance test I think we agreed that registries need not validate the field and it us up to the consumer to validate it. Overall LGTM. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having heard from Azure ACR and AWS ECR, this has consensus. The primary hold-back has been ECR not accepting unknown fields, but Michael Brown gave thumbs up.
This should contain an embedded representation of the referenced
content, which is useful for avoiding extra hops to access small pieces
of content.
Fixes #825
Closes #675