Proposal: Moving away from SETs #1566
Comments
@woodruffw @FiloSottile @kommendorkapten @bdehamer @AlCutter @bobcallaway @znewman01: I'd appreciate your feedback on this proposal, and please tag anyone else you think might be interested in the topic.
Thanks for the tag! I think this is a great idea, and resolves a big chunk of ambiguity on the client implementing side (i.e. "why do we verify both the SET and inclusion proof, and what's the difference between verifying the two in
As a clarifying point: when we're talking about verifying the inclusion proof, we're talking about both the Merkle inclusion proof and the associated checkpoint signature, right? My understanding is that both are necessary to produce a proof that's as convincing (in an offline setting) as the SET, since without the checkpoint signature the Merkle proof is "just" a non-cryptographic self-consistency proof.
Agreed --
This would be my personal preference, but I recognize that might cause more compatibility heartburn than is worth it 🙂. Another possible option here is to add some normative language saying that bundles without inclusion proofs
Correct, the inclusion proof must be accompanied by a checkpoint. The bundle format requires this. For some extra info, one thing I glossed over at the end of this issue is whether this checkpoint is the "latest" checkpoint (created immediately after entry inclusion) or a "stable" checkpoint (periodic checkpoint that is co-signed by witnesses). I discussed that more in a doc on witnessing.
Let's see what sigstore-js and sigstore-java are doing currently. I think sigstore-js is storing the proof too (https://github.com/sigstore/sigstore-js/blob/0f9915027ea1d1c0d05ffd4948d8f01a1f3f2c99/packages/client/src/types/sigstore/serialized.ts#L33-L40) - Is this correct @bdehamer? Java also looks to be storing it - https://github.com/sigstore/sigstore-java/blob/8f4794c6cfc8571d16243f393e343a25cccd7470/sigstore-java/src/main/java/dev/sigstore/bundle/BundleFactoryInternal.java#L129
I think this would be a reasonable trade-off, though one downside is the privacy aspect of revealing entries of interest to the log operator.
Great to see progress here, nice work @haydentherapper!
Yes, this has to be clearly defined. Based on earlier discussions, I believe that a periodic checkpoint would be preferable, as it makes it simpler for monitors to gossip about the state. And this makes me wonder: if the checkpoint is signed at time T, and the inclusion proof is made at time T + 1, it's not offline verifiable, right? Or is the idea that the checkpoint will be created and signed at time T + 1 (i.e., the time of inclusion in the log), but Rekor would "publish" stable checkpoints periodically for consumption by monitors, to make it easy to gossip (that is, every inclusion generates a checkpoint, but monitors only verify using the "published" ones)? If so, was there an idea on how this "publish" mechanism would happen? (I haven't read the doc on witnessing yet.)
$ curl -s https://registry.npmjs.org/-/npm/v1/attestations/sigstore@1.6.0 | jq | grep inclusionProof
"inclusionProof": null,
"inclusionProof": null,
Canonicalization
Something that I don't think this issue addresses is the problem we are facing with canonicalization of entries. The entry that is added to the Merkle tree is the JSON canonicalized representation, which has proven hard to implement (it is non-deterministic for certain types), and that's why the bundle contains the canonicalized entry (which we want to get away from). See this doc and this issue for earlier discussions. If we make a big change like this, I would suggest also making sure that we improve/simplify the leaf creation prior to inserting it into the log so it's deterministic.
Overall, this change SGTM. My main concern is that it should be compatible with future protections against split-view attacks. Sounds like you're keeping that in mind! But maybe fleshing out the "stronger witness verification" plan and how it interacts with this change would assuage some of the concerns. No need for a full doc at this point, just:
Also make sure we update the Client Spec (which it sounds like you're already thinking about). Sounds like we're figuring out the migration story. I think I'd like to see a table for each implementation indicating what they're producing and what they're consuming at the moment. Finally, would we ever need to go back to SETs? There are real performance/consistency reasons that CT chooses to use SCTs and have a long merge time.
I'll get
We now publish a stable/periodic checkpoint, which is produced every 5 minutes over the current size of the tree. You can access it by adding
With the end goal for this work being detecting split-view attacks AND doing this offline, there are a few paths we can take:
I am quite bullish on the second approach, if that's not evident. :) First approach is nice as a proof of concept though, it'll likely be what I implement first. Hopefully that answers your questions @znewman01 and @kommendorkapten. Just to summarize:
Ack, that'll be the next thing to do if we're in agreement on making inclusion proofs mandatory.
AI for myself: I'll create this and confirm it with -js/-python/-java.
It's a good question. In practice, most logs are admitting entries into the log in seconds [citation needed, it was in a paper I can't recall]. Sigsum has taken the stance of never using promises and simply saying that signers must wait until an entry is admitted. So far, Rekor has not had issues with the approach of synchronous upload. It'd be a good exercise to do some load testing though.
Thank you @bdehamer!
Just wanted to swing by and give a big +1 for moving away from "inclusion promises" to "inclusion proofs"! Hayden asked me to write up some of the background on SCTs, which I did in this work-in-progress documentation for the Claimant Model in trillian#2980. Hopefully it provides a little more background here. Consider that CT was one of the first deployments of verifiable logs in an issuance workflow, and we've been running these logs for over 10 years now. The reasoning behind design choices from back then is valid to consider now, but we've learned a lot and improved a lot over that decade, and there now exist log operators that are willing to provide a stricter SLA for integration.
Fun fact: The original design for CT did not include SCTs; you were asked to submit the entry to the log, and you'd receive either a checkpoint+inclusion proof, or a "come back in a bit and try again" response. SCTs were later added as a concession to (some) CAs who argued that any potential delay to their issuance pipeline was a threat to their business - back then, at least, the issuance pipeline was quite strongly serialised in those CAs, so adding, say, 15s to each issuance reduced the number of certs they could sell in a day. In the original CT log implementation(s) it was common (although also a fairly arbitrary choice) for logs to integrate and publish on an hourly cadence. There are definitely performance gains to be had from batching tree integrations, but I'm pretty sure you'll get "good enough" performance with a much shorter batching interval - this is essentially what Trillian-based CT logs are doing nowadays with batches of O(seconds) and CA issuance output at around 70 new entries/s on average. Here endeth the history lecture :)
* Require inclusion proofs, make promises optional
The log always generates inclusion proofs, so we will make it a requirement that clients verify the proof. Promises will be deprecated over time, but for now, we'll make them optional.
Fixes #82
Ref sigstore/rekor#1566
Signed-off-by: Hayden Blauzvern <hblauzvern@google.com>
* Bump version
Signed-off-by: Hayden Blauzvern <hblauzvern@google.com>
* Update client verification requirements for promises
Signed-off-by: Hayden Blauzvern <hblauzvern@google.com>
Closing this issue, as the client specs have been updated to mandate inclusion proofs. One change I've made is that I've stopped referring to SETs as "promises". They're very similar, but the differences are that a) Sigstore's ecosystem still requires SETs for verifying short-lived certificates in lieu of signed timestamps, and b) they're signed over a log index, meaning that the log must have uploaded the entry. More accurately, an SET is a signed commitment from the log, but without a proof.
For posterity's sake, mindersec/minder#2120 (comment) also documents the differences between SETs and proofs in a concise way.
Background
Certificate Transparency (CT) (RFC 6962) defines a "Signed Certificate Timestamp" (SCT), a "promise to incorporate the certificate in the Merkle Tree within a fixed amount of time" [RFC]. SCTs are useful for two reasons:
However, SCTs are only a promise, not a proof. Unless the client verifies that the promise has been fulfilled and the certificate has been integrated into the log, the log can choose to not fulfill the promise. Verifying the promise and checking inclusion in the log requires an online lookup. This lets the log learn entries of interest to the client, which could reveal browser history. See "SoK: SCT Auditing in Certificate Transparency" for an overview of various proposals to audit promises while preserving privacy.
Rekor chose to implement promises too, as "Signed Entry Timestamps" (SETs). SETs can be verified offline and would let Rekor implement batch processing. However, unlike CT implementations, Rekor chose to process entries immediately and block before returning a response to the client. This means the response contains not only an SET but also an inclusion proof. Inclusion proofs can be verified offline, so there is no concern that verification will require online lookups through a centralized log.
Unfortunately, clients can choose to ignore the inclusion proof and verify only the SET. This might be because a client simply needs to verify a digital signature with an SET. Additionally, support for verifying inclusion proofs differs by language.
Note that either a signed inclusion timestamp or an RFC3161 signed timestamp is needed to verify short-lived Fulcio certificates, so we cannot get rid of the SET unless a verifiable timestamp is required.
Plan
a. This is a breaking change for clients, but since Rekor already returns an inclusion proof, it should be straightforward to require the proof in newly generated bundles.
a. Note that time should not come from the checkpoint. Clients may choose to bundle a stable checkpoint (where the checkpoint is updated periodically rather than on each entry upload), and this checkpoint could be outside of the validity window of a short-lived certificate.
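As a rough illustration of the note above, the timestamp checked against the certificate should be the entry's integrated time (the value covered by the SET), not anything derived from a bundled checkpoint. The sketch below assumes the integrated time is available as Unix seconds and that the certificate's validity bounds are timezone-aware datetimes; the function and parameter names are hypothetical and not taken from any Sigstore client.

```python
from datetime import datetime, timedelta, timezone


def integrated_time_within_validity(
    integrated_time: int,
    not_before: datetime,
    not_after: datetime,
) -> bool:
    """Return True if the log's integrated time falls inside the
    certificate validity window (bounds inclusive).

    Assumption: integrated_time is Unix seconds, and both bounds are
    timezone-aware datetimes.
    """
    observed = datetime.fromtimestamp(integrated_time, tz=timezone.utc)
    return not_before <= observed <= not_after


# Example with a made-up 10-minute validity window.
issued = datetime(2023, 7, 1, 12, 0, 0, tzinfo=timezone.utc)
expires = issued + timedelta(minutes=10)
entry_time = int((issued + timedelta(minutes=1)).timestamp())
stable_checkpoint_time = int((issued + timedelta(minutes=14)).timestamp())

ok_entry = integrated_time_within_validity(entry_time, issued, expires)
ok_checkpoint = integrated_time_within_validity(stable_checkpoint_time, issued, expires)
```

In this made-up example the entry was integrated one minute after issuance, so the check passes, while a stable checkpoint produced a few minutes later would fall outside the window. That is why the checkpoint is not a usable time source here.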
In a future V2 iteration of Rekor, we will drop the SET entirely and only return inclusion proofs. This has a few consequences:
Additional Motivation
Privacy
There are a couple of ways that a client can reveal to the log what they're interested in:
SETs provide privacy since verification is entirely offline, at the cost of only being a promise. Inclusion proofs can also be verified offline, but on their own they provide no protection against split-view attacks; that protection requires consensus from a set of witnesses.
Distributing the inclusion proof rather than the SET gives the Believer (defined in the Claimant Model) anonymity from the log, since an inclusion proof does not have to be requested.
To mitigate the privacy concern with consistency proofs, clients should only request consistency proofs between stable checkpoints. These checkpoints batch a set of entries so that a Believer does not reveal a specific entry they're interested in. This provides k-anonymity.
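A minimal sketch of that idea, assuming the client already holds a sorted list of tree sizes at which stable checkpoints were published (how that list is obtained is out of scope here, and all names are hypothetical): the client picks the smallest stable checkpoint that covers its entry, so any consistency proof it later requests is expressed in terms of stable sizes rather than the entry's own index.

```python
from bisect import bisect_right
from typing import Optional, Sequence, Tuple


def covering_stable_checkpoint(
    entry_index: int, stable_sizes: Sequence[int]
) -> Optional[Tuple[int, int]]:
    """Return (previous_size, covering_size) for the smallest stable
    checkpoint whose tree includes the entry, or None if no stable
    checkpoint covers it yet.

    A tree of size s contains leaf indices 0..s-1, so the entry is
    covered by the first stable size strictly greater than its index.
    Assumption: stable_sizes is sorted in ascending order.
    """
    pos = bisect_right(stable_sizes, entry_index)
    if pos == len(stable_sizes):
        return None
    previous = stable_sizes[pos - 1] if pos > 0 else 0
    return previous, stable_sizes[pos]


def anonymity_set_size(entry_index: int, stable_sizes: Sequence[int]) -> int:
    """Number of entries batched into the covering checkpoint interval
    (the k in k-anonymity); 0 if the entry is not yet covered."""
    interval = covering_stable_checkpoint(entry_index, stable_sizes)
    if interval is None:
        return 0
    previous, covering = interval
    return covering - previous
```

For example, with stable checkpoints at sizes 100, 250 and 400, an entry at index 120 is covered by the checkpoint of size 250 and shares that interval with 150 entries in total.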
Stronger Offline Verification
Verifying only an SET provides no guarantee that the log includes the entry. Verifying an inclusion proof is stronger, but there is no protection against split-view attacks.
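For readers unfamiliar with what verifying an inclusion proof involves, here is a minimal sketch of the Merkle audit path check using the RFC 6962 hashing scheme (SHA-256 with 0x00 and 0x01 domain-separation prefixes) and the verification steps described in RFC 9162. It is an illustration rather than any Sigstore client's implementation, and it only covers the Merkle part: as discussed in the comments, the root hash must come from a checkpoint whose signature has been verified separately.

```python
import hashlib
from typing import Sequence


def leaf_hash(data: bytes) -> bytes:
    """RFC 6962 leaf hash: SHA-256(0x00 || data)."""
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    """RFC 6962 interior node hash: SHA-256(0x01 || left || right)."""
    return hashlib.sha256(b"\x01" + left + right).digest()


def verify_inclusion(
    leaf_index: int,
    tree_size: int,
    leaf: bytes,
    proof: Sequence[bytes],
    root: bytes,
) -> bool:
    """Check that `leaf` (already a leaf hash) sits at `leaf_index` in a
    tree of `tree_size` leaves whose root hash is `root`."""
    if leaf_index < 0 or leaf_index >= tree_size:
        return False
    fn, sn = leaf_index, tree_size - 1
    r = leaf
    for p in proof:
        if sn == 0:
            return False  # proof is longer than the path to the root
        if fn & 1 or fn == sn:
            # Current node is a right child: sibling goes on the left.
            r = node_hash(p, r)
            if not fn & 1:
                # Skip levels where this node has no sibling.
                while fn != 0 and not fn & 1:
                    fn >>= 1
                    sn >>= 1
        else:
            # Current node is a left child: sibling goes on the right.
            r = node_hash(r, p)
        fn >>= 1
        sn >>= 1
    return sn == 0 and r == root


# Tiny three-leaf tree built by hand to exercise the check.
leaves = [leaf_hash(b"a"), leaf_hash(b"b"), leaf_hash(b"c")]
left_subtree = node_hash(leaves[0], leaves[1])
example_root = node_hash(left_subtree, leaves[2])
```

With this tree, the proof for the third leaf is just the hash of the left subtree, and the proof for the first leaf is the second leaf's hash followed by the third leaf's hash. A proof checked against the wrong index or a truncated path is rejected.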
One approach would be to bundle an inclusion proof with a set of witnessed checkpoints. This could be implemented in various ways. A client could wait until witnesses processed the last stable checkpoint. The tradeoff is that clients must wait O(minutes) before distributing their signed artifact. Another option would be to rely on artifact distributors (package repositories like npm, PyPI, Maven) to verify consistency and bundle witnessed checkpoints.
A follow-up document will go into more details on stronger offline verification.