Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: Provide an easier way to retrieve SBOM from In-Toto attestation #2307

Open
ChristianCiach opened this issue Oct 4, 2022 · 15 comments
Labels
enhancement New feature or request

Comments

@ChristianCiach
Copy link
Contributor

ChristianCiach commented Oct 4, 2022

Since #1756 cosign issues a warning when using cosign attach sbom and cosign download sbom, heavily encouraging users to use cosign attest and cosign download attestation instead:

WARNING: Downloading SBOMs this way does not ensure its authenticity. If you want to ensure a tamper-proof SBOM, download it using 'cosign download attestation <image uri>' or verify its signature.

While I don't really like this change, because all our sboms are signed and there is no need to scare the users like that, we are trying to migrate to use said attestations.

Unfortunately, there doesn't seem to be an easy way to retrieve the original SBOM document from an attestation. The cosign download sbom could be used to retrieve the original SBOM (and nothing else), but this is a lot harder when using attestations:

$ cosign verify-attestation --key cosign.pub --type cyclonedx my-registry/my-image:1.0.0 | jq '.payload | @base64d | fromjson | .predicate.Data'

This is the easiest way I could come up. This also requires the user to have additional tools available (jq in this case).

I think there should be a command to retrieve the original SBOM (producing the exact same output) as it is already possible with cosign download sbom. Currently, SBOMs attached as attestation-predicates feel like second-class citizens.

@ChristianCiach ChristianCiach added the enhancement New feature or request label Oct 4, 2022
@ChristianCiach
Copy link
Contributor Author

ChristianCiach commented Oct 4, 2022

Slight off-topic: I think there should be a way to avoid the warnings when using cosign download sbom and cosign attach sbom. Maybe like this: We could add a --key option to both commands. If present, cosign download sbom automatically verifies the signature of the downloaded sbom-image against the given public key and cosign attach bom automatically signs the attached sbom using the given private key. This would be consistent with the rest of the CLI commands.

Edit: I've created the issue #2528 to continue the discussion I've started in comment #2307 (comment).

This issue should continue to focus on making it easier to extract an SBOM from an in-toto attestation.

@ChaosInTheCRD
Copy link
Contributor

In relation to your first question, do you think that a flag such as --view-predicate would suffice here? This flag would ensure that if called it can strip out and decode the predicate from everything wrapped around it.

@ChaosInTheCRD
Copy link
Contributor

on the second question - in order to sign the SBOMs, have you up until now been using cosign sign-blob?

@ChristianCiach
Copy link
Contributor Author

ChristianCiach commented Oct 6, 2022

do you think that a flag such as --view-predicate

Yes, that would surely suffice.

in order to sign the SBOMs, have you up until now been using cosign sign-blob?

No, we are calling these to commands directly in succession:

cosign attach sbom ...
cosign sign --attachment sbom ...

The latter command seems to be little known, since I've never seen it mentioned anywhere. That's why I created a PR: #2308

I think it would be nice if we could combine these commands into cosign attach sbom --key ... to attach and sign the sbom in one swoop and suppress the annoying warning this way.

@znewman01
Copy link
Contributor

#2526 points out that have docs such that when you follow them, Cosign emits warnings: https://docs.sigstore.dev/cosign/other_types/#sboms-software-bill-of-materials

@ChristianCiach
Copy link
Contributor Author

ChristianCiach commented Feb 3, 2023

After some investigation yesterday, I think this issue is even more important than I thought.

Just a few months ago @otms61 created a PR for Trivy so that Trivy now natively supports scanning cyclonedx-predicates embedded in attestations: aquasecurity/trivy#2652

This PR has some shortcomings:

  • It only supports cyclonedx, no other formats like spdx.
  • It only supports the non-standard predicate-flavour created by Cosign, as described in Why is an SBOM attestation predicate wrapped in CosignPredicate? #2126 . Trivy will not accept predicates that follow the In-Toto specification (which states that the predicate payload should be the direct child of the predicate attribute, not wrapped under Data.
  • If the attestation contains multiple predicates of the same type, it looks like Trivy will only scan the first one.

Why I am telling you this here instead of creating an issue over at Trivy? Well, my point is that it should not be the burden of Trivy to support all kinds of "container" formats that SBOMs could be embedded in. If cosign would provide a way to easily retrieve the original SBOM from an attestation predicate, then aquasecurity/trivy#2652 would be completely obsolete and @otms61 surely wouldn't have seen a need to even create the PR.

I propose a new option for the cosign verify-attestation --type <type> command, maybe something like:

cosign verify-attestation --type <type> --decode

If the attestation contains more than one predicate of the given type, cosign could just concatenate the decoded predicates (see jsonl format). What do you think?

@lcarva
Copy link
Contributor

lcarva commented Sep 25, 2023

I can take on adding the --decode flag. I think we just need to iron out some details.

Here's my understanding. When using this flag, the output of the two commands below should be almost the same:

cosign verify-attestation ... --decode
cosign verify-attestation ... | jq '.payload | @base64d | fromjson | .predicate'

The description suggests the jq expression to end in .predicate.Data but that won't work with all attestation types I'm aware of. For example, I have an image from a side project that has a SLSA Provenance and an SPDX SBOM attestation. They can be retrieved like this:

cosign verify-attestation quay.io/lucarval/festoji:latest \
  --type spdx \
  --certificate-github-workflow-repository lcarva/festoji \
  --certificate-identity 'https://github.com/lcarva/festoji/.github/workflows/package.yaml@refs/heads/master' \
  --certificate-oidc-issuer 'https://token.actions.githubusercontent.com'

cosign verify-attestation quay.io/lucarval/festoji:latest \
  --type slsaprovenance \
  --certificate-github-workflow-repository lcarva/festoji \
  --certificate-identity 'https://github.com/slsa-framework/slsa-github-generator/.github/workflows/generator_container_slsa3.yml@refs/tags/v1.7.0' \
  --certificate-oidc-issuer 'https://token.actions.githubusercontent.com'

Neither one of those would work with the suggested .predicate.Data expression.

@ChristianCiach, is using .predicate instead of .predicate.Data a problem for your use case?

@ChristianCiach
Copy link
Contributor Author

ChristianCiach commented Sep 25, 2023

@lcarva First of all, thank you!

@ChristianCiach, is using .predicate instead of .predicate.Data a problem for your use case?

It is my understanding that .predicate is now correct in all cases, since the nesting under .Data was mistake that has been fixed in #2718

@ChristianCiach
Copy link
Contributor Author

ChristianCiach commented Sep 26, 2023

It just occurred to me that my proposed solution (--decode to decode the payload and concat the resulting documents) may be a bit too naive.

I am not an expert when it comes to the in-toto format, but as far as I understand it, there can be multiple predicates for the same type. For example, when you do cosign verify-attestation quay.io/lucarval/festoji:latest --type spdx, the output may contain multiple spdx-documents.

In #2307 (comment) I proposed to just concat the JSON documents, using one line for each document (see jsonl format). This works nicely if the decoded payload is actually JSON, but it is my understanding that the decoded payload can be everything, including binary blobs. Of course, concatting binary output is obviously a bad idea..

So what should be do? Options:

  • Only support --decode for JSON-like payloads. This also means that we have to remove any linebreaks from the decoded payload. This seems too specialized. I actually don't like that.
  • Change --decode to --decode-first or maybe even --decode=0 (or --decode=1 and so on) to only decode a single payload and only print this to stdout. I think I prefer this option.

Edit:

  • Or just crash with an error if there is more than one document for the given --type.

@lcarva
Copy link
Contributor

lcarva commented Sep 27, 2023

Good points!

I have a suggestion that might help.

Given that a certain predicate type is expected to have a certain structure, it would be very unusual to have a mix of data types when filtering by a particular predicate type.

And that, AFAICT, it's not possible to use cosign verify-attestation to handle more than one predicate type in the same command execution.

And that we mostly care about JSON data for this particular feature.

And that multiple JSON documents can be easily represented with JSONL. (One JSON document per line.)

Then, let's introduce a new output type called json-predicate (or decoded-json) to trigger the behavior we get today with jq '.payload | @base64d | fromjson | .predicate'

If someone uses --output=json-predicate with a predicate that is not in JSON format, then an error is expected. In the future, we could add support for things like xml-predicate, for example.

Thoughts?

@ChristianCiach
Copy link
Contributor Author

ChristianCiach commented Sep 27, 2023

That would be perfectly fine for me, since I am only interested in cyclonedx-json predicates.

That being said, I don't feel it is particularly smart to specialise in specific predicate types. Your solution seems a bit too custom-tailored to my specific use-case.

That being said, cosign already is specialised for specific predicate types (since there is special support for certain --type values). Speaking of --type, doesn't it make --output=json-predicate redundant? If I specify --type cyclonedx, then "json" should be implied, right? It doesn't make sense to specify any different --output format if json is already guaranteed by the given --type.

@ChristianCiach
Copy link
Contributor Author

ChristianCiach commented Sep 27, 2023

It looks like I've underestimated the complexity of this feature request. I am perfectly comfortable with continuing to work out the details before we implement some solution that turns out to be a bad idea.

@lcarva
Copy link
Contributor

lcarva commented Sep 27, 2023

It also seems like my assumption is incorrect. CycloneDX appears to use the same predicate type for both JSON and XML. So my suggestion may be naive. It also means that inferring the data format from the predicate type is not a viable solution.

To be fair, I do not know of many instances where the predicate is not JSON. My perception is that this is the pseudo-standard format for predicates, but admittedly I don't have a large sample size to give this statement any merit. But! My suggestion is not specifically targeting your use case; it's also targeting mine. Having to use jq is not the best user experience.

Having said all that, I still think my suggestion is reasonable and should work for most cases. That's ok because it's optional - the behavior is not being imposed on anyone. We can always improve it over time as other use cases become clearer.

I would be interested in also hearing what others think in case @ChristianCiach and I just happen to have a narrow view of the problem.

@ChristianCiach
Copy link
Contributor Author

Thanks for your insight.

Maybe it's wise to just do nothing for now. When cosign releases its next version and users are facing the deprecation warnings for the sbom attachments, I expect we will see more users asking for ways to produce the same output as cosign download sbom did before.

@ChaosInTheCRD
Copy link
Contributor

Hey all, great to see that this issue is still being crunched through in my absence!

I am pretty interested in how this issue shapes up looking towards a proposed implementation. I will probably at some point be looking to do similar in the witness project! Everything you have both discussed is very valid and I think it's important to consider as close to every possible edge case before implementing something. I do think however that the biggest use case for a feature like this is the "I am learning / debugging and I just want to view the predicate". If that can be done without creating situations where people feel upon using it that it is dysfunctional due to a lack of support of their specific scenario, I think it would be a worthwhile feature 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

4 participants