oci scheme changes the mediaTypes #1692

Closed
developer-guy opened this issue Jun 27, 2022 · 6 comments

Comments

@developer-guy

We (w/@Dentrax) noticed that when copying a Docker-typed image to a directory using the oci: scheme, the image's mediaTypes change from vnd.docker.. to vnd.oci.., which in turn changes the image digest. With the crane tool, the mediaTypes remain the same for the same operation. So we wanted to open an issue to discuss which behavior is correct or makes more sense.

To reproduce the issue, we can use the alpine:3.16 image:

$ skopeo inspect docker://alpine:3.16 --raw | jq '.manifests | map(select(.platform.architecture == "amd64" and .platform.os == "linux"))' | jq -r '.[0].digest'
sha256:4ff3ca91275773af45cb4b0834e12b7eb47d1c18f770a0b151381cd227f4c253
$ skopeo copy --override-os linux  docker://docker.io/library/alpine:3.16 oci:oci-layout
$ skopeo copy --override-os linux oci:oci-layout docker://devopps/alpine:3.16
$ skopeo inspect docker://devopps/alpine:3.16 | jq -r '.Digest'
sha256:1db22e12238c94042b930c0e3559a8b283473b989d621f2c145ebe72829cef25 

@mtrmac @imjasonh @jonjohnsonjr

@vrothberg
Member

@developer-guy is the assumption that converting between OCI and Docker format preserves the digest?

The digests will change since the media types are different. A Docker manifest does not comply with the OCI image spec, so it must be changed during conversion.
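
To illustrate the point above: a manifest digest is the SHA-256 of the manifest's exact bytes, so rewriting the mediaType alone is enough to produce a different digest. The following is a minimal sketch, assuming GNU coreutils `sha256sum` is available; the two manifests are hypothetical, heavily trimmed stand-ins, not real alpine manifests.

```shell
#!/bin/sh
# Hypothetical, trimmed manifests: the only difference is the mediaType.
docker_manifest='{"schemaVersion":2,"mediaType":"application/vnd.docker.distribution.manifest.v2+json"}'
oci_manifest='{"schemaVersion":2,"mediaType":"application/vnd.oci.image.manifest.v1+json"}'

# Compute "sha256:<hex>" over the exact bytes of the first argument.
digest() {
  printf 'sha256:%s\n' "$(printf '%s' "$1" | sha256sum | cut -d' ' -f1)"
}

docker_digest=$(digest "$docker_manifest")
oci_digest=$(digest "$oci_manifest")

# The two lines printed here differ, because the hashed bytes differ.
echo "docker: $docker_digest"
echo "oci:    $oci_digest"
```

In a real image the conversion also changes the media types of the config and layer descriptors inside the manifest, so the effect is the same but on a larger document.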

@mtrmac
Contributor

mtrmac commented Jun 27, 2022

Yes, the OCI image spec (which the oci: and oci-archive: aim to implement) only supports OCI-formatted images: compare https://github.com/opencontainers/image-spec/blob/main/image-index.md#image-index-property-descriptions . Storing something else is not strictly prohibited but it is not interoperable (“An encountered mediaType that is unknown to the implementation MUST be ignored.”)

If you don’t care about OCI compliance and interoperability, use something else; either the c/image-private dir: format (which doesn’t quite have a format stability promise, to be fair), or maybe a temporary local registry server run from a container.

mtrmac closed this as not planned on Jun 27, 2022
@jonjohnsonjr
Contributor

the OCI image spec (which the oci: and oci-archive: aim to implement) only supports OCI-formatted images

I think there's enough wiggle room in the spec that you could read this either way. I don't see anything that forbids an OCI image layout from containing a docker image (e.g. see this example which has a non-OCI entry in index.json), and most of the wording in the index/image specs leaves room for backward compatibility. I wouldn't argue that skopeo's behavior is necessarily incorrect, but I don't think it would be wrong not to convert them to OCI (especially if you care about preserving the content-addressable bits).

@mtrmac
Contributor

mtrmac commented Jun 27, 2022

I agree it’s possible to store images that way.

I do think that the default behavior needs to be to convert, so that they can be read back by another implementation with high confidence. It might make sense to store unconverted images with an opt-in.

The current implementation of c/image transports has the concept of “supported” manifest formats, where an image in a format supported by a destination is stored there without a format conversion (while oci: returns only the OCI format as supported, triggering a conversion if necessary). We’d probably need a concept of “supported opt-in only”, so that a v2s2 image is never automatically stored as v2s2 in oci:, but users could explicitly opt in via skopeo copy --format v2s2 to have the image stored in OCI using a v2s2 format. That’s plausible (and eventually we might want to do something similar to stop automatically converting to v2s1 when pushing to registries[1]).
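
The selection rule described above can be sketched as a small decision function. This is a hypothetical model only, not c/image's actual code or API: the function names, the argument order, and the space-separated format lists are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical model of the "supported" vs "supported opt-in only" idea.

# in_list ITEM "SPACE SEPARATED LIST": succeeds if ITEM appears in the list.
in_list() {
  case " $2 " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# pick_format SOURCE REQUESTED SUPPORTED OPT_IN
#   SOURCE    format of the source manifest
#   REQUESTED format explicitly requested by the user, or "" for none
#   SUPPORTED formats the destination accepts automatically
#   OPT_IN    formats the destination accepts only on explicit request
pick_format() {
  pf_source=$1 pf_requested=$2 pf_supported=$3 pf_opt_in=$4
  if [ -n "$pf_requested" ]; then
    # An explicit request may name a supported or an opt-in-only format.
    if in_list "$pf_requested" "$pf_supported $pf_opt_in"; then
      echo "$pf_requested"
    else
      echo "format $pf_requested not accepted by destination" >&2
      return 1
    fi
  elif in_list "$pf_source" "$pf_supported"; then
    # Source format is supported: store it without conversion.
    echo "$pf_source"
  else
    # Otherwise convert to the destination's first supported format.
    echo "${pf_supported%% *}"
  fi
}

pick_format v2s2 ""   "oci" "v2s2"   # prints oci  (automatic conversion)
pick_format v2s2 v2s2 "oci" "v2s2"   # prints v2s2 (explicit opt-in)
```

Under this model the default for an oci: destination stays "convert to OCI", and the unconverted v2s2 manifest is only kept when the user asks for it.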

Looking a bit more at use cases, the extra valuable feature of OCI unavailable in dir: is the ability to store multiple images in a single directory, sharing storage. That’s true at a medium scale, of >1 image but probably not hundreds; for hundreds, a real registry (with individual repos that can be updated concurrently, without contention/races on the index file, while still allowing data reuse across the whole set of images) is probably better. So… there’s some value to supporting non-OCI formats in OCI, but it’s fairly limited. #1237 would hypothetically be another way to address the major use case.


[1] Surprisingly, from time to time v2s1 happens to be the only format that works, typically when OCI and v2s2 are rejected due to c/image implementation bugs or strict registry validation. It’s not always 100% clear whether converting, with things transparently mostly working, or cleanly failing is the better behavior.

@Dentrax

Dentrax commented Jun 27, 2022

@developer-guy is the assumption that converting between OCI and Docker format preserves the digest?

@vrothberg Since skopeo mutates all mediaTypes (because we're exporting as oci), the digest changes for sure. In the current use case, I wasn't able to trust the official image since its digest changed. But it is also obviously still the official image, since we did not mutate anything except media types.

# digest of docker://foo1
sha256:4ff3ca91275773af45cb4b0834e12b7eb47d1c18f770a0b151381cd227f4c253
$ skopeo copy docker://foo1 oci:foo2
$ skopeo copy oci:foo2 docker://foo2
# new digest of docker://foo2
sha256:1db22e12238c94042b930c0e3559a8b283473b989d621f2c145ebe72829cef25 

One workaround is to traverse all JSONs and replace all media types to vnd.docker and push the upstream. But it doesn't seem like a good solution in the first place, and it sounds wrong to me.

What's the best and right way to copy the image by respecting the digest? upstream -> local -> upstream

@mtrmac
Contributor

mtrmac commented Jun 27, 2022

One workaround is to traverse all JSONs and replace all media types to vnd.docker and push the upstream.

That’s not going to work in general; the digest detects format modification of any kind, including change in whitespace. There is no relationship between the whitespace of the input manifest and of the manifest produced by Skopeo when converting, so modifying Skopeo’s output that way is not typically going to reproduce the original manifest.
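
The whitespace sensitivity mentioned above is easy to see with a small sketch, again assuming GNU coreutils `sha256sum`. The two JSON strings are hypothetical fragments that parse to the same value but are serialized differently.

```shell
#!/bin/sh
# Two serializations of the same JSON value; only whitespace differs.
compact='{"schemaVersion":2,"layers":[]}'
spaced='{ "schemaVersion": 2, "layers": [] }'

# The digest covers raw bytes, not the parsed JSON value.
manifest_digest() {
  printf 'sha256:%s\n' "$(printf '%s' "$1" | sha256sum | cut -d' ' -f1)"
}

compact_digest=$(manifest_digest "$compact")
spaced_digest=$(manifest_digest "$spaced")

# Semantically equal documents, different digests.
echo "compact: $compact_digest"
echo "spaced:  $spaced_digest"
```

This is why editing the media types back by hand would only reproduce the original digest if the result happened to match the original manifest byte for byte.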


What's the best and right way to copy the image by respecting the digest? upstream -> local -> upstream

  • dir:, for a single image or a small number.
  • For a large set, run a registry in a container with a volume for the registry’s storage location, copy images to that registry, move the directory backing the volume, run a registry in a container again, and copy images from that registry.

Sure, the latter recommendation is a bit of a hassle … the thing is, a production-level registry implements most of the desirable features already.

In both cases you should probably use skopeo {copy,sync} --preserve-digests to get failures instead of digest changes.
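
For the single-image case, the dir: round trip with --preserve-digests might look like the sketch below. The helper name `roundtrip_via_dir` and its arguments are hypothetical, the example values in the comments come from the reproduction earlier in this thread, and the exact behavior has not been verified here against a live registry.

```shell
#!/bin/sh
# Hypothetical helper: copy SRC to a local dir: directory and on to DEST,
# failing instead of converting if a manifest would have to change.
roundtrip_via_dir() {
  rt_src=$1      # e.g. docker://docker.io/library/alpine:3.16
  rt_workdir=$2  # local directory used by the dir: transport
  rt_dest=$3     # e.g. docker://devopps/alpine:3.16

  # --preserve-digests asks skopeo to fail rather than alter the manifest.
  skopeo copy --preserve-digests --override-os linux "$rt_src" "dir:$rt_workdir" &&
    skopeo copy --preserve-digests "dir:$rt_workdir" "$rt_dest" &&
    # Print the digest at the destination for comparison with the source.
    skopeo inspect "$rt_dest" | jq -r '.Digest'
}

# Usage (requires skopeo, jq, and push access to the destination):
#   roundtrip_via_dir docker://docker.io/library/alpine:3.16 alpine-dir docker://devopps/alpine:3.16
```

If the printed digest matches the per-platform digest from the source manifest list, the image was carried through unchanged.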
