-
Notifications
You must be signed in to change notification settings - Fork 565
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: When verifying, only use data after it is considered acceptable #2482
Conversation
6716343
to
b16b8b6
Compare
Codecov Report
@@ Coverage Diff @@
## main #2482 +/- ##
==========================================
+ Coverage 30.14% 30.33% +0.18%
==========================================
Files 139 139
Lines 8595 8624 +29
==========================================
+ Hits 2591 2616 +25
- Misses 5629 5632 +3
- Partials 375 376 +1
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey, thanks for this! Very good idea overall and I'm strongly in favor.
I'd like even better to enforce this at the type system layer, rather than simply via naming. You might be able to do some cool encapsulation tricks. (Don't make it possible to instantiate a Foo
struct from outside the module; instead, you must create an UnverifiedFoo
and then turn it into a Foo
.)
But in the shorter term—is there any chance you could break this up into 2 or 3 smaller PRs? I find once I'm reviewing more than 100–200 lines of code my eyes gloss over and I find it difficult to actually make sure the changes are correct.
+1, for PRs that both refactor and modify behavior, we should separate those into separate PRs, especially for critical parts of the code base like verify. Very minimal test changes for refactors, with more unit tests for behavior changes.
Is this correct? We default to the current time when checking expiration time. If no bundle or timestamp is provided, we use the current time. There should be no case when cert expiration is not checked, but let me know if you found one.
This sounds like a good change! Note that if both a bundle and 3161 time stamp are present, we want to check the times for both. Maybe we make this configurable in the future, but for now, I’d like to continue checking both, because we do expect both to be valid. |
YAY! Thank you -- I'll give a start reviewing soon. Also, I'm catching up on some lost context over the holiday week so apologies if I'm behind
We introduced a few as we were making the cosign 2.0 changes, but I ran out of time before the holidays to clean them up. Looking at main right now, you could get around if, let's say, you did |
Go doesn’t have a strong enough visibility system for that, other than using separate subpackages — and I think that would spread the core verification logic over too many separate files, making the end resulting logic hard to follow. (That’s especially the case for certificates, which really have several mostly-independent correctness criteria, like the various subject restrictions that can be configured.) There’s certainly much more that could be done (e.g. maybe the
Sure, the set of commits should be correct at every point, so It can split in arbitrary places. |
b16b8b6
to
923c9ef
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is looking great! i've just reviewing the verify main command.
pkg/cosign/verify.go
Outdated
return false, fmt.Errorf("offline verification failed") | ||
} | ||
|
||
if co.RekorClient != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this check was always very confusing to me. It seems like if you aren't skipping TLOG verification, then if there's no rekor client, somehow you can still pass without checking the tlog.
do you think maybe it should be an error that co.RekorClient is nil when we have to do tlog verification and never got a bundle?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now, the API is such that external callers can call VerifyImageSignature
and get (bundleVerified = false, err = nil)
. So this PR doesn’t change that, it’s a large enough set of changes already.
I agree that it seems dangerous and undesirable; IMHO external users of pkg/cosign
should just provide a policy, and not be relied upon to correctly enforce other conditions on top. Conceptually, I think “change the API philosophy” and “change the implementation philosophy” are fairly separate conversations — OTOH, admittedly, the question of the right timestamp to use for certificate expiry means the two can’t separated that cleanly.
@@ -154,7 +154,7 @@ func (c *VerifyCommand) Exec(ctx context.Context, images []string) (err error) { | |||
if err != nil { | |||
return fmt.Errorf("getting Fulcio roots: %w", err) | |||
} | |||
co.IntermediateCerts, err = fulcio.GetIntermediates() | |||
co.UntrustedIntermediateCerts, err = fulcio.GetIntermediates() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we keep the same name, IntermediateCerts
?
I would prefer we don't used "Untrusted" in the name. This term comes from openssl from what I've seen, and I personally don't think it's correct to only use intermediate certificates for chain building ("untrusted"). Intermediates should be trusted, or at least come from a trusted source.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one is confusing. These particular ones are coming from a trusted source, so I can agree to see it named the same. The ones found on the registry through CertChain are untrusted intermediates
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, see co.UntrustedIntermediateCerts = untrustedPool
. So, in that sense, I think changing the name of CheckOpts
to UntrustedIntermediateCerts
has been a success exactly because the trusted nature of those coming from fulcio.GetIntermediates
is insufficient to make the field trusted.
(I’m also not sure what a “trusted intermediate” means. Is that a root of trust, i.e. a non-intermediate root? Or an intermediate that we still require to be correctly signed by a root of trust, i.e. something where we don’t actually require any trust?)
Actually, what I really think is that the caller-provided policy should be immutable and should not be modified to hold single-instance data within verify.go
at all. Then we might even more precisely differentiate between
- completely-untrusted signature-carried data
- externally Fulcio-provided data (acceptable if the connection to Fulcio is trusted)
- completely locally-provided 100%-trusted policy
and potentially make different decisions based on those details. That would be a different PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For a core library function, I think we should not be opinionated if the certificates are trusted or untrusted. For example, we don't know where the root certificate came from either when the checkopts is populated. I don't see a reason to distinguish between untrusted and trusted in the CheckOpts. Internally, we can name variables differently based on their source.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should not be opinionated if the certificates are trusted or untrusted.
I… don’t know what that would mean.
The verification code must be certain about which data is a part of the root of trust that is, by necessity, assumed valid, and which data is part of the external attack surface that is, by default, suspect. At the very least these two categories are 100% distinct. There might be even more categories in between (e.g. the intermediate certificates might be more trusted than the underlying signatures, but they are still potentially a part of the attack surface).
(And then the API should help the caller of the API express its assumptions clearly, and to help expose any incorrect assumptions the caller might be making.
From that point of view, calling the certificates “untrusted” is good for the caller, because it tells the caller the caller doesn’t need to make much effort at all to ensure validity of the data. “A trusted system is one whose failure would break a security policy”.)
I don’t care at all about the individual naming of the options. If you don’t want the field to be named Untrusted
, and to instead have a CheckOpts.{TrustedSigVerifier,TrustedRootCerts}
, sure, that might be even more explicit. If you think the distinction between trusted and untrusted data is sufficient to express in comments and doesn’t need to be in field names, that’s not my preference but I can live with that just fine.
But I read the above as suggesting that it doesn’t make a difference whether the data is trusted or not, and I just don’t know what to do with that. Probably I’m completely misunderstanding your point.
My 2cents: I see multiple types of refactoring in this PR - Name changes, reordering, etc. Given the criticality of this package, I think we should scope down the PR even more. I see some of the commits are behavior changes too - These definitely should be in their own PR, especially since I don't see tests that accompany them (which suggests we don't have sufficient testing in Cosign, which we don't :) ). |
Right, I have rebased this first to have a baseline from which I can extract a smaller subset. Then I intend to keep this PR to be a catch-all that merges branches of the smaller subsets, keeping a record of what is merged/being reviewed / outstanding (and dropping the rejected parts). If/as time allows. |
@mtrmac Do you think it'd be possible to have a PR just with the behavior changes first? We wanted to cut a release candidate soon for cosign 2.0, and one of the behavior changes in particular is a blocker for release imo:
This should result in a verification failure. Reproduced with fyi @priyawadhwa |
#2504: as it happens, I was planning to add that one on top of the previous request. If you want to urgently backport the other change, feel free to; I’m afraid I’m done for the week. |
923c9ef
to
594d1e6
Compare
Summary This is a subset of #2482 as requested in #2482 (comment) (a bit larger than that): User-observable: verify RFC 3161 timestamps, and Rekor data, first, before processing certificates. Conceptually we need to do that because certificate processing requires a timestamp (although that’s not yet what this PR finishes doing). This can affect user-observable error messages, which can now complain about untrusted signed Rekor data instead of untrusted signed certificates. Behavior change: If there is a certificate, always verify it against some timestamp (in particular, the current time, if there is neither a RFC 3161 timestamp nor any Rekor data. (This is more than was asked — I included it because the current behavior is such a surprise for callers, and because always verifying actually makes the code simpler. I can split that into some future PR, and add an explicit “are we dealing with Rekor” condition instead, to preserve the current behavior.) Code consolidation: verifyInternal now has one place (although not yet the correct place) to do certificate expiry checks, instead of that happening in VerifyRFC3161Timestamp See #2482 for previous discussion, and the general rationale of only using data after it is verified (which is not what this PR really does, but it’s a prerequisite). Release Note cosign verify-blob --cert-chain … --certificate … --skip-tlog-verify now requires the leaf certificate to not be expired. Signed-off-by: Miloslav Trmač <mitr@redhat.com> Co-authored-by: Hayden B <hblauzvern@google.com>
We've merged the initial PR. Note that there have been a lot of changes in verify.go, so you'll likely have conflicts. |
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
... to track the trust status of the Rekor bundle Signed-off-by: Miloslav Trmač <mitr@redhat.com>
And validate the public key first, the signature second, just for consistency with the flow of trust (although that really doesn't mattere here, we want all of (key, signature, payload) to match). Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
…61 timestamp Reorganize the certificate expiry checks, so that we check against the accepted timestamps if any, and only fall back to current time if there is no data. Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Use certWithUnverifiedExpiry , which we already have, and which we care about. Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Finally, only create the verifier based on an actually acceptable certificate, instead of creating it first and then hoping not to forget to validate preconditions. Signed-off-by: Miloslav Trmač <mitr@redhat.com>
... to document the individual stages. Signed-off-by: Miloslav Trmač <mitr@redhat.com>
594d1e6
to
deebfc5
Compare
Sure, rebased. Do you have a preference for a subsection to propose next? Some options:
|
If you want quick reviews, we can start with the smallest, reordering VerifyBundle, then var renames? |
This PR is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 10 days. |
This PR was closed because it has been stalled for 10 days with no activity. |
Summary
pkg/cosign.verifyInternal
keeps growing conditions (A) at the end of the function, long after already relying (B) on data that is constrained by these conditions. It effectively makes the outcome of B indeterminate until A happens; that’s hard to track, and A might not happen at all if the function returns early. This makes me nervous: personally for me, it makes following the operation and rules of the code much harder than I think is necessary.So, this is a move somewhat towards, but not completely, the verification organization suggested in #1648 (comment).
Most data in
verifyInternal
, and closest utility functions, is now stored with variables nameduntrusted…
. Only after all checks on that data are done, it may be stored in a variable namedacceptable…
.Testing relies on existing unit tests; no new tests were added.
See individual commit messages for details. Most of the commits are trivial variable renames.
Behavior changes:
Basic summary of the ordering changes:
VerifyBundle
now first ensures that the entry was actually stored on Rekor, and only afterwards uses it to see whether it matches the supposedly-logged signature.Release Note
Certificate expiry checking was made a bit more accurate.
Documentation
N/A.