Skip to content
This repository has been archived by the owner on Sep 23, 2022. It is now read-only.

Add proposal F #52

Merged
merged 2 commits into from
Jun 6, 2022
Merged

Add proposal F #52

merged 2 commits into from
Jun 6, 2022

Conversation

silvin-lubecki
Copy link
Contributor

@silvin-lubecki silvin-lubecki commented Jun 4, 2022

This proposal tries to follow up with all the comments made on the issue #50

Signed-off-by: Silvin Lubecki <silvin.lubecki@docker.com>
Copy link
Contributor

@sudo-bmitch sudo-bmitch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comments are non-blocking. We can merge and iterate. Race conditions and existing runtime support are the big questions for me, since we can't count on ETags with existing registries.

A new concern we didn't consider before is scalability, since this has limits on the size of the index manifest, and runtimes may limit the annotation size.

Example of a reference OCI descriptor, pointing to a SBOM referencing an alpine image on a specific platform:
```json
{
"mediaType": "application/vnd.cncf.oras.artifact.manifest.v1+json",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may want to pick a media type known to OCI here.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean from this list or that one needs to be defined for SBOMs specifically?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That list looks right. For a manifest, anything other than a docker image, docker manifest list, OCI image, or OCI index is going to be rejected by most registries.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed 47a0daa


#### Well-known annotations
A **well-known** annotations section is added to the [Annotations](https://github.com/opencontainers/image-spec/blob/main/annotations.md) section.
Those annotations **SHOULD** be added to the referenced image descriptor.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A question I'll have when assessing the proposal is what happens when the annotations don't match.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean if the annotations are not from the "well-known annotations" list ?
Then maybe I should add:
"An encountered annotation that is unknown to the implementation MUST be ignored." ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed 47a0daa

> If multiple manifests match a client or runtime's requirements, the first matching entry SHOULD be used.

Example:
```json
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
```json
```jsonc

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed 47a0daa

> The distribution-spec **SHOULD** solve the race condition occurring when [multiple parties modify an OCI Index in parallel](https://github.com/opencontainers/distribution-spec/pull/251).
Querying for a reference involves pulling either existing tags pointing to an index that includes the artifact and image, or pulling new tags that use the digest of the referenced image in the tag value.

> Registries are expected to handle nested OCI Index correctly.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will need testing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added Testing is needed to verify that a registry supports nested OCI Index. in the Backwards Compatibility.2 section.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed 47a0daa

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@silvin-lubecki - What is the recommendation when the index size is more than 4MB. Implementations currently reject manifests that are too large.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is up to registry owners to decide. 4 MB is really big for an object that you expect people to fetch very often.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was discussed in opencontainers/distribution-spec#260 and officially clarified in opencontainers/distribution-spec#293 😅


### Registry HTTP API

> The distribution-spec **SHOULD** solve the race condition occurring when [multiple parties modify an OCI Index in parallel](https://github.com/opencontainers/distribution-spec/pull/251).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For existing registries that don't have this feature yet, are we accepting the race condition?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we need to. Depending on the time horizon for fixing this, we could provide guidance for how clients can work around the race.

I would worry that the guidance would become the standard though 😬

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could say that, if the response doesn't include an ETag, that the client SHOULD re-pull the manifest and check that the contents are what it expects, and if it doesn't, it SHOULD go through the pull-update-push dance again, until the update lands racelessly. Or, client's MAY consider the case an error and make users re-push until the race doesn't happen, possibly losing information along the way.

This seems like both a.) enough to ensure the race doesn't practically happen, and b.) a slow and annoying enough protocol that it still encourages registries to support ETags so it isn't necessary. If a client wants to assume the registry will support ETags they can, and just fail loudly if it doesn't.

This has the possible negative side effect of pushing the updated manifest multiple times, each time it may lose the race and need to retry. Again, this is more reason a registry should prefer not to need this, and just support ETags.

Ultimately there's not a lot we can do to make the protocol bullet-proof, except just promote ETags everywhere.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could say that, if the response doesn't include an ETag, that the client SHOULD re-pull the manifest and check that the contents are what it expects, and if it doesn't, it SHOULD go through the pull-update-push dance again, until the update lands racelessly. Or, client's MAY consider the case an error and make users re-push until the race doesn't happen, possibly losing information along the way.

I agree with this, but I wonder if it should be detailed in this proposal. I think it's more a distribution-spec issue 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depending on how long it takes to push various blobs, the client could have cached a manifests for a long while (in computer time) and overwrite a manifest that was pushed and verified by another client. In other words, there are lots of races, and we'll probably need to spend some effort specifying the client requirements to minimize (but not prevent) these races.

We also need to factor in that some tooling may see an artifact get pushed and attempt to intentionally delete that artifact, and we'd need a way to differentiate that from a race condition.

1. As an artifact producer, I want to update an existing artifact with a newer artifact.
- Yes: New artifacts are associated by pushing the artifact and the unique tag referencing the target manifest.
1. As an artifact producer, I want to push multiple artifacts concurrently (possibly from separate systems) without encountering race conditions or other errors.
- Yes: Each artifact may be pushed separately with a unique artifact manifest and tag that references the target manifest.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weren't we getting away from unique tags with this proposal? If multiple tags exist, do all manifests need to be pulled, and how does the client know which one is the newest?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that I understand the use case here. Is it intended to tease out that index modifications are racy?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I should have clarified, multiple digest tags. I thought you were pushing for a single predictable digest tag, and not a different digest tag for each version of the extended index.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, indeed, it's a copy/paste I should have resolved 😅
Changing to:
No: If two manifests with the same digest are pushed to the same registry path, a race condition will occur. However, this problem exists in current registries.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed 47a0daa

- E.g. `registry.example.org/project-d:sha256-0000000000000000000000000000000000000000000000000000000000000000.0404040404040404`
- `<alg>`: the digest algorithm
- `<ref>`: the referenced digest (limit of 64 characters)
- `<hash>`: hash of this artifact (limit of 16 characters)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the artifact hash still needed with this proposal? I thought that was going away to avoid the tag listing.

Copy link

@chris-crone chris-crone Jun 4, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's needed when you would like to add metadata to an existing object.

As the proposal is that you append metadata to the OCI index, you do not need to list tags. You only need to have your unsigned existing object's digest and then you can compute the tag (using the above formula) for the index with metadata.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that mixing up the reference digest with the artifact hash? The original index you want to extend is the first digest. The digest of the new index with the modified fields is the second hash, and that isn't known by consumers in advance so they need to use a tag list to find them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Distribution spec doesn't seem to have a way to pull/push index manifests. Would the new OCI index be a nested index?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Distribution spec doesn't seem to have a way to pull/push index manifests." I'm confused, distribution-spec allows this. The nested index concern is with clients that assume an index is only 1 level and can find their platform by scanning the index without recursion.

My question above is getting into how many tags will there be pointing to index manifests for a given "thing". My original understanding is there's:
original tag -> original index or extended index
digest tag -> extended index

And the extended index includes an entry for the original index, and could have entries for each of the original index entries, flattening the search done by clients. That would mean there's only one digest tag for each referenced digest, pointing to the extended index, negating the need for the additional <hash> entry.

Copy link

@chris-crone chris-crone Jun 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nishakm

Distribution spec doesn't seem to have a way to pull/push index manifests

I'm not a distribution spec expert but I believe they're the same as other manifests

There's a useful diagram here to see what's happening.

@sudo-bmitch

That would mean there's only one digest tag for each referenced digest, pointing to the extended index, negating the need for the additional entry.

Yes, I believe that you're right. I didn't read it closely enough the first time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sudo-bmitch yes totally, removing it 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed 47a0daa

- `<ref>`: the referenced digest (limit of 64 characters)
- `<hash>`: hash of this artifact (limit of 16 characters)

If the referenced image is an OCI Index, the manifests **SHOULD** be inserted first, for backward compatibility with runtimes.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
If the referenced image is an OCI Index, the manifests **SHOULD** be inserted first, for backward compatibility with runtimes.
If the referenced image is an OCI Index, the manifests **MAY** be inserted first, for backward compatibility with runtimes.

I would change this to MAY as it's up to the people creating the artifacts if they want to provide backward compatibility or not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we be prescriptive here, so we can maximize the backward compatibility ? Also I don't see good reasons not to do it 🤔

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I don't see good reasons not to do it

It adds complexity and duplication which is not ideal. Will we still want this in 1-2 years time?

Shouldn't we be prescriptive here, so we can maximize the backward compatibility ?

By putting "MAY", we're making the how of backward compatibility well known but we allow tool builders to decide if they need it for their tooling.

I'd be curious to see what others think here. I won't die on this hill :)

},
// signatures added to the linux/arm64
"annotations": {
"dev.cosignproject.cosign/signature.e59879c": "...",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify, the working group mission statement is:

Propose how to describe and query relationships between objects stored in an OCI registry.

@sudo-bmitch @nishakm @jdolitsky @imjasonh I believe that exactly how images are signed or SBOMs are stored is out of scope for this working group. Is that correct?

I raise this as I think we should be clear that we're using illustrative examples to solve for the working group's mission, not trying to define signing or SBOM for artifacts across the industry.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, the goal is to define how additional data is associated to an existing image. We don't want to get into signing formats. It might make sense to discuss this not as signatures, but an optimization for smaller data types.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it helps, we could replace dev.cosignproject... with "org.favorite.icecream": "rocky-road" or something, to make it clear it's a demonstration of a value that's expected to be small, and not a specific signing implementation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I tried to keep the proposal as generic as I could, those signatures only appear in the examples. Changing them with what @imjasonh suggested 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed 47a0daa

"os": "linux"
},
"annotations": {
"dev.cosignproject.cosign/signature.e59879c": "...",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the hex suffixes here and org.opencontainers.signatures.index ? Or is this intentionally vague as the role of the WG is to not actually describe these keys?

Should we define/recommend a key structure for signatures. Eg. sign.<vendor>/<ftype>.<uniq>

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a carry over from this comment: #50 (comment)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it is, but I'm removing it for more generic annotation examples.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed 47a0daa

#### Pre-defined annotations

Reference annotations are added to the [Annotations](https://github.com/opencontainers/image-spec/blob/main/annotations.md) section.
* `org.opencontainers.reference.digest`: digest referencing an image. The digest **SHOULD** be referenced as OCI Descriptor in the root OCI index **OR** in the nested OCI index to avoid a weak reference.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be worded as "SHOULD reference an OCI Descriptor"?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens when an artifact/image's reference annotation points to a nested index which references the same artifact/image? Is there any way a client can resolve this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens when an artifact/image's reference annotation points to a nested index which references the same artifact/image? Is there any way a client can resolve this?

What would be the use case ? I don't really get the workflow that would produce it 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be worded as "SHOULD reference an OCI Descriptor"?

My intent was to enforce the image producers to make sure that if they reference a digest with this annotation, they also reference it using an OCI Descriptor, otherwise it's only a weak pointer and nothing prevents the registry from garbage collecting the referenced digest.

- E.g. `registry.example.org/project-d:sha256-0000000000000000000000000000000000000000000000000000000000000000.0404040404040404`
- `<alg>`: the digest algorithm
- `<ref>`: the referenced digest (limit of 64 characters)
- `<hash>`: hash of this artifact (limit of 16 characters)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Distribution spec doesn't seem to have a way to pull/push index manifests. Would the new OCI index be a nested index?

Copy link
Member

@jdolitsky jdolitsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fantastic. Thank you.

Could we also add a row to the table of proposals in the README? https://github.com/opencontainers/wg-reference-types#proposals

- Removing references to cosign in the examples
- Use jsonc
- Remove tag hash
- Fixed copy paste in requirements

Signed-off-by: Silvin Lubecki <silvin.lubecki@docker.com>
@silvin-lubecki
Copy link
Contributor Author

Could we also add a row to the table of proposals in the README? https://github.com/opencontainers/wg-reference-types#proposals

Fixed 47a0daa

Copy link
Member

@jdolitsky jdolitsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets put it in and iterate from there!

@jdolitsky jdolitsky merged commit 74a04fd into opencontainers:main Jun 6, 2022
@jdolitsky
Copy link
Member

For any unresolved topics from this PR, please add here: #53

Thanks!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants