Verification may fail for in-toto statements with multiple subjects using different hash algorithms #360

Open
kommendorkapten opened this issue Dec 18, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@kommendorkapten
Member

Description

When verifying that an in-toto statement matches an artifact, we start by choosing the "strongest" hash algorithm, looking only at the first subject:

if _, ok := statement.Subject[0].Digest[alg]; ok {

We then compute the artifact's digest using that algorithm and compare it against the digests in the in-toto statement. This fails if a statement has multiple subjects that reference different artifacts using different digest algorithms. For example:

The statement has two subjects (an ordered list in the statement):

  1. foo: sha512
  2. bar: sha256

If we verify against the artifact bar, verification fails because we pick sha512, but that is not the algorithm specified for bar in the statement.
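
The failure mode can be reproduced with a minimal, self-contained sketch. Everything here is hypothetical and simplified for illustration: the `Subject` type, the `preferred` ordering, and the helpers `pickFromFirstSubject`, `digestHex`, and `matchesAny` are assumptions and not the library's actual types or functions.

```go
package main

import (
	"crypto/sha256"
	"crypto/sha512"
	"encoding/hex"
	"fmt"
)

// Subject is a simplified, hypothetical stand-in for an in-toto subject.
type Subject struct {
	Name   string
	Digest map[string]string // algorithm name -> hex digest
}

// preferred is an assumed strongest-first ordering of algorithms.
var preferred = []string{"sha512", "sha256"}

// digestHex hashes data with the named algorithm ("" if unsupported).
func digestHex(alg string, data []byte) string {
	switch alg {
	case "sha512":
		sum := sha512.Sum512(data)
		return hex.EncodeToString(sum[:])
	case "sha256":
		sum := sha256.Sum256(data)
		return hex.EncodeToString(sum[:])
	}
	return ""
}

// pickFromFirstSubject mirrors the behavior described above: only the
// first subject is consulted when choosing the algorithm.
func pickFromFirstSubject(subjects []Subject) string {
	if len(subjects) == 0 {
		return ""
	}
	for _, alg := range preferred {
		if _, ok := subjects[0].Digest[alg]; ok {
			return alg
		}
	}
	return ""
}

// matchesAny reports whether any subject lists digest under alg.
func matchesAny(subjects []Subject, alg, digest string) bool {
	for _, s := range subjects {
		if d, ok := s.Digest[alg]; ok && digest != "" && d == digest {
			return true
		}
	}
	return false
}

func main() {
	foo, bar := []byte("contents of foo"), []byte("contents of bar")
	subjects := []Subject{
		{Name: "foo", Digest: map[string]string{"sha512": digestHex("sha512", foo)}},
		{Name: "bar", Digest: map[string]string{"sha256": digestHex("sha256", bar)}},
	}
	alg := pickFromFirstSubject(subjects)
	fmt.Println("algorithm chosen:", alg)
	fmt.Println("bar verifies:", matchesAny(subjects, alg, digestHex(alg, bar)))
	fmt.Println("bar verifies with sha256:", matchesAny(subjects, "sha256", digestHex("sha256", bar)))
}
```

In this sketch, bar only verifies when it is hashed with the algorithm its own subject entry lists, which is the mismatch described above.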

Version

@kommendorkapten kommendorkapten added the bug Something isn't working label Dec 18, 2024
@codysoyland
Member

I remember writing this! I landed on this solution as the simplest one that should cover the vast majority of multi-subject attestations. However, it does assume that every subject in the attestation uses the same digest algorithm(s).

If we would like to support multi-subject attestations that use different digest algorithms per subject, we would need to hash the input io.Reader multiple times, once per algorithm. The simplest way to do that would be to copy the bytes from the io.Reader to a buffer and use that as the input to multiple hashers. That can be done in small blocks to avoid using too much memory for large artifacts. In any case, this would increase the complexity of this section of code.

I felt at the time that it was unlikely that somebody would produce a multiple-subject attestation with different algorithms per subject, but I suppose that could happen. Is this something that is important to support?

@codysoyland
Member

I got a bit nerd-sniped and wrote a tool that can compute multiple hashes at once.

I'm still not sure if it's worth the added complexity, but this proves it can be done in a memory-efficient way: io.Copy defaults to a 32KB block size, so the multihasher does not need to buffer the whole file.
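
For readers following along, here is a rough sketch of what such a multi-hasher could look like. It is not the tool mentioned above. It assumes the standard library's `io.MultiWriter` is used to fan each chunk read by `io.Copy` out to several hashers, and the names `newHasher`, `multiHash`, and `digestsOf` are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"crypto/sha512"
	"encoding/hex"
	"fmt"
	"hash"
	"io"
	"strings"
)

// newHasher maps an algorithm name to a hasher (assumed name set).
func newHasher(alg string) (hash.Hash, error) {
	switch alg {
	case "sha256":
		return sha256.New(), nil
	case "sha384":
		return sha512.New384(), nil
	case "sha512":
		return sha512.New(), nil
	}
	return nil, fmt.Errorf("unsupported digest algorithm %q", alg)
}

// multiHash reads r exactly once and returns one hex digest per
// requested algorithm. Duplicate algorithm names are ignored.
func multiHash(r io.Reader, algs ...string) (map[string]string, error) {
	hashers := make(map[string]hash.Hash, len(algs))
	var writers []io.Writer
	for _, alg := range algs {
		if _, dup := hashers[alg]; dup {
			continue
		}
		h, err := newHasher(alg)
		if err != nil {
			return nil, err
		}
		hashers[alg] = h
		writers = append(writers, h)
	}
	// Each chunk that io.Copy reads is written to every hasher, so the
	// whole artifact never has to be held in memory.
	if _, err := io.Copy(io.MultiWriter(writers...), r); err != nil {
		return nil, err
	}
	digests := make(map[string]string, len(hashers))
	for alg, h := range hashers {
		digests[alg] = hex.EncodeToString(h.Sum(nil))
	}
	return digests, nil
}

// digestsOf is a convenience wrapper for in-memory data.
func digestsOf(data string, algs ...string) (map[string]string, error) {
	return multiHash(strings.NewReader(data), algs...)
}

func main() {
	digests, err := digestsOf("hello", "sha256", "sha512")
	if err != nil {
		panic(err)
	}
	fmt.Println("sha256:", digests["sha256"])
	fmt.Println("sha512:", digests["sha512"])
}
```

The cost of this approach is one extra hash computation per additional algorithm, while the input is still read only once.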

@kommendorkapten
Member Author

I can see this becoming an issue in the future, though still as an edge case. I would vote to implement the multi hasher: it's fast, so computing one or two more hashes shouldn't be an issue. We could even start by iterating over the digests to build the set of hash algorithms listed, filter that set against an allow list, and then perform the matching. This means that in most cases we will still only be hashing with one algorithm.
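
A minimal sketch of that idea, under stated assumptions: the `Subject` type, the `allowed` set, and the functions `algorithmsToCompute` and `matchesAny` are all hypothetical names for illustration, and treating a match on any single listed algorithm as sufficient is an assumed policy rather than a decided one.

```go
package main

import (
	"fmt"
	"sort"
)

// Subject is a simplified, hypothetical stand-in for an in-toto subject.
type Subject struct {
	Name   string
	Digest map[string]string // algorithm name -> hex digest
}

// allowed is an assumed allow list of acceptable digest algorithms.
var allowed = map[string]bool{"sha256": true, "sha384": true, "sha512": true}

// algorithmsToCompute collects every allowed algorithm listed by any
// subject, deduplicated and sorted for stable output.
func algorithmsToCompute(subjects []Subject) []string {
	seen := make(map[string]bool)
	var algs []string
	for _, s := range subjects {
		for alg := range s.Digest {
			if allowed[alg] && !seen[alg] {
				seen[alg] = true
				algs = append(algs, alg)
			}
		}
	}
	sort.Strings(algs)
	return algs
}

// matchesAny reports whether any subject lists one of the computed
// artifact digests (computed maps algorithm name -> hex digest).
func matchesAny(subjects []Subject, computed map[string]string) bool {
	for _, s := range subjects {
		for alg, want := range s.Digest {
			if got, ok := computed[alg]; ok && got == want {
				return true
			}
		}
	}
	return false
}

func main() {
	subjects := []Subject{
		{Name: "foo", Digest: map[string]string{"sha512": "aaaa"}},
		{Name: "bar", Digest: map[string]string{"sha256": "bbbb"}},
		{Name: "baz", Digest: map[string]string{"md5": "cccc", "sha256": "dddd"}},
	}
	// md5 is filtered out by the allow list; sha256 appears only once.
	fmt.Println("algorithms to compute:", algorithmsToCompute(subjects))

	// Pretend these are the artifact's digests for the algorithms above.
	computed := map[string]string{"sha256": "bbbb", "sha512": "ffff"}
	fmt.Println("artifact matches a subject:", matchesAny(subjects, computed))
}
```

When every subject uses the same single algorithm, `algorithmsToCompute` returns one entry, so the common case still hashes the artifact only once.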
