Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Compute an uncompressed digest for chunked layers #2155

Draft
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

mtrmac
Copy link
Collaborator

@mtrmac mtrmac commented Oct 30, 2024

… to ensure we can always use traditional layer IDs and image IDs, and use the uncompressed digests to avoid zstd:chunked view ambiguity, in c/image.

Absolutely untested.

Copy link
Contributor

openshift-ci bot commented Oct 30, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mtrmac

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

mtrmac added a commit to mtrmac/image that referenced this pull request Oct 30, 2024
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
mtrmac added a commit to mtrmac/libpod that referenced this pull request Oct 30, 2024
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
mtrmac added a commit to mtrmac/libpod that referenced this pull request Oct 30, 2024
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
mtrmac added a commit to mtrmac/libpod that referenced this pull request Oct 30, 2024
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
pkg/chunked/storage_linux.go Outdated Show resolved Hide resolved
pkg/chunked/storage_linux.go Outdated Show resolved Hide resolved
}
defer fg.Close()
digester := digest.Canonical.Digester()
if err := asm.WriteOutputTarStream(fg, metadata, digester.Hash()); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the expensive part where we're basically re-synthesizing the tarball from the unpacked files and the tar split-data just to get the checksum, right?

I know this is just a PoC but it seems very much worth a comment at the top of this conditional, something like:

// If we don't yet have the uncompressed digest, compute it now. This is needed
// for zstd:chunked for example to ensure callers identifying images by this
// have the data.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the expensive part where we're basically re-synthesizing the tarball from the unpacked files and the tar split-data just to get the checksum, right?

Yes.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment added, discussing more of the goals, costs and tradeoffs.

}
}

dirFD2, err := unix.Dup(dirFD)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't have O_CLOEXEC set

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This Dup is no longer necessary.

}
filename = path
}
pathFD, err := securejoin.OpenatInRoot(fg.rootDir, filename)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(A little surprising securejoin doesn't have a high level securejoin.OpenReader or something)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(We have an userns.secureOpen internally with two other callers — centralizing that, and proposing that upstream, does sound attractive.)

@@ -1734,3 +1754,63 @@ func validateChunkChecksum(chunk *internal.FileMetadata, root, path string, offs

return digester.Digest() == digest
}

func newStagedFileGetter(dirFD int, differOpts *graphdriver.DifferOptions, entriesWithOriginalNames []fileMetadata) (*stagedFileGetter, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know this codebase well but it seems surprising to me that we wouldn't have an equivalent of this function somewhere else, there has to already be code that's serializing zstd:chunked to tar in another place which has to do the same thing?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a getter in layers.go which doesn’t support the composefs digest mapping (in that case we fall back to a different expensive way to generate the diff, IIRC).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do agree that having those FileGetter implementations scattered around the codebase suggest that a better design might be possible. OTOH the path/digest mapping is expensive to maintain in a ready-to-use form.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OTOH the path/digest mapping is expensive to maintain in a ready-to-use form.

On the composefs side that's basically what the composefs blob is (plus all inline metadata) and it's pretty cheap to read back a deserialized abstract tree from a composefs blob, and a big bonus is that because fsverity is enabled on that file, validation is automatic.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

… yes, that’s a great point. There already is a special overlayFileGetter with composefs support, but it uses a temporary mount instead of reading the data directly.

It seems likely that reading the data in user-space would be better, and that could be reused here.

But that’s, also, not the focus of this PR.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cc: @giuseppe

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's also possible to deserialize a composefs in userspace via composefs-info dump; for the use case of turning "composefs into tar" that can be done without mounting it at a kernel level if desired. We have Rust code wrapping that in our crates. I suspect what would be helpful is to try to create a more extensive Go library for composefs separate from this project that fleshes things out like this.

Basically a cool thing about composefs is that it can be natively kernel mounted and provide full verity and things like that, but also because it's just metadata you can think of it much like a tar-split that can be handled in userspace too and efficiently merged etc.

We will start validating them.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
- Use a map of struct{} to save on storage
- Track the items by path, not by the digest; that's one byte more
  per entry, but it will allow us to abstract the path logic.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
…igest

... to centralize the composefs backing file layout design.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
…sed digest

This will allow c/image to validate the uncompressed digest against the config's
RootFS.DiffID value (ensuring that the layer's contents are the same when pulled
via TOC and traditionally); and the uncompressed digest will be used as a layer ID,
ensuring users see the traditional layer and image IDs they are used to.

This doesn't work for chunked layers without a tar-split.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Copy link
Collaborator Author

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated:

  • Introduced composefs.RegularFilePathForValidatedDigest(d) to centralize the path mapping logic
  • Avoided the Dup() of the top-level directory file handle
  • Added more comments.

@giuseppe RFC. Still mostly untested, but closer to the final form, I think.

Comment on lines +1719 to +1721
// Also, layers without a tar-split (estargz layers and old zstd:chunked layers) can't produce an UncompressedDigest that
// matches the expected RootFS.DiffID; they will need to use a special TOC-based layer ID, and we need to somehow (??)
// ensure unambiguity - either by always falling back to full pulls, or by never falling back.
Copy link
Collaborator Author

@mtrmac mtrmac Nov 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@giuseppe @cgwalters note this.

zstd:chunked layers have tar-split in the current format since b5413c2 , i.e. c/storage 1.54 == Podman 5.1.0.

I’m not 100% sure what to do here. I’m leaning towards outright refusing to partially pull estargz and old chunked layers (falling back to full pull), that is “clearly correct” and always possible. (Users who insist on always using partial pulls, e.g. for composefs, can rebuild their images.) Alternatively, we could always do a partial pull and refuse a fallback, but then we would need to break pulling those layers on other graph drivers (or, eventually, implement partial pulls for those graph drivers) — and we still would have the signing ambiguity.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants