Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

copy/multiple: priority of instanceCopyCopy must be higher than instanceCopyClone #2059

Merged
merged 1 commit into from
Jul 27, 2023

Conversation

flouthoc
Copy link
Contributor

When copying multiple images i.e instanceCopyClone and no image exists in registry in such case if clones are prepared and copied first then original copies will reuse blobs from the clone which is unexpected since argument requireCompressionFormatMatch is by default false for instanceCopyCopy case.

Problem described in following PR is easily reproduceable when working with tools such as buildah, because without this PR buildah will push clones first instead of originals and later on originals will reuse blobs from clones.

Reproducer with empty registry

./buildah manifest create localhost:5000/list:latest
./buildah build -t test .
./buildah manifest add localhost:5000/list:latest test
# Push with replication for `zstd`
./buildah manifest push --tls-verify=false --all --add-compression zstd localhost:5000/list:latest

When copying multiple images i.e `instanceCopyClone` and no image
exists in registry in such case if clones are prepared and
copied first then original copies will reuse blobs from the clone which
is unexpected since argument `requireCompressionFormatMatch` is by
default false for `instanceCopyCopy` case.

Problem described in following PR is easily reproduceable when working
with tools such as buildah, because without this PR buildah will push
`clones` first instead of originals and later on `originals` will reuse
blobs from `clones`.

Signed-off-by: Aditya R <arajan@redhat.com>
@flouthoc
Copy link
Contributor Author

@mtrmac A small bug which I found while working with buildah, I have explained issue in PR description.

Copy link
Collaborator

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.


One thing I’ve been idly wondering is whether we shouldn’t have a gzip annotation as well. That would allow us to determine whether an image is Zstd-only / gzip-only, or mixed. (Alternatively, we can just load the instance manifest and run the MIME type heuristic… and it might be possible to structure the code so that this has no extra cost.)

If we could differentiate single-algorithm images from mixed images, we could set requireCompressionFormatMatch for instanceCopyCopy if Ensure…Variants is set. (OTOH I don’t know what we would do for mixed instances on input.)

I think such architecture changes are not welcome right at this moment, but I wanted to just mention this idea in case other use cases pop up, or to see if there is some other better approach I couldn’t think of.

@mtrmac
Copy link
Collaborator

mtrmac commented Jul 27, 2023

Thanks!

@mtrmac mtrmac merged commit da78992 into containers:main Jul 27, 2023
@flouthoc
Copy link
Contributor Author

flouthoc commented Jul 27, 2023

@mtrmac I agree with your comment above, I've found a more scalable solution for this case, WDYT about this flouthoc@ab9066c ?

This will correctly ignore blobs for instanceCopyClone even if they are found in any reusable candidates agnostic of order.

@mtrmac
Copy link
Collaborator

mtrmac commented Jul 27, 2023

  • Consider an input image with one instance, containing a mixed set of layers with some future algo1+algo2 , with Ensure…{algo1, algo2, [algo3]}. The way we currently handle mixed images, that would issue a Copy for the algo1+algo2 image [and a Clone for algo3] — but if we set require…Match for the mixed algo1+algo2 image, that’s going to strip one of the two algorithms.
    • Conceptually that’s a concern for mixed gzip/Zstd images already, not just future algorithms, except that annotationsToCompressionAlgorithmNames currently assumes that an image is never mixed.
  • As is, this doesn’t work for a zstd image with Ensure…: { Gzip } ; nor for a Zstd image with Ensure…: { Zstd }. I think that’s sort of a simpler version of the above problem.
  • Less important details:
    • I think it would be more predictable for uses if ifCloneOp (with some other name?) were set if the caller sets Ensure…Exist, regardless of whether we actually do a clone.
    • The implementation can set fields in copySingleOpts… and prepare… can do all of that (and potentially be unit-tested), instead of making the harder-to-test execution loop more complex

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants