Proposal: Moving away from SETs #1566

Closed
haydentherapper opened this issue Jun 26, 2023 · 11 comments
Labels: enhancement (New feature or request)


@haydentherapper (Contributor)

Background

Certificate Transparency (CT) (RFC 6962) defines a "Signed Certificate Timestamp" (SCT), a "promise to incorporate the certificate in the Merkle Tree within a fixed amount of time" [RFC]. SCTs are useful for two reasons:

  • By returning an SCT rather than an inclusion proof, a log can batch-process entries and does not have to block the CA from issuing a certificate.
  • SCTs can be verified offline without interacting with the log. Note that inclusion proofs can also be verified offline by bundling the inclusion proof with a checkpoint and leaf node.

However, SCTs are only a promise, not a proof: unless the client verifies that the promise has been fulfilled and the certificate has been incorporated into the log, the log can choose not to fulfill it. Verifying the promise and checking inclusion in the log requires an online lookup, which lets the log learn which entries a client is interested in and could reveal browsing history. See "SoK: SCT Auditing in Certificate Transparency" for an overview of various proposals to audit promises while preserving privacy.

Rekor chose to implement promises too, called "Signed Entry Timestamps" (SETs). SETs could be verified offline and would let Rekor implement batch processing. However, unlike CT implementations, Rekor chose to process entries immediately and block before returning a response to the client, so the response contains not only an SET but also an inclusion proof. Inclusion proofs can be verified offline, so there is no concern that verification will require online lookups against a centralized log.
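
For concreteness, offline verification of an inclusion proof is just a Merkle audit-path computation: hash the entry as a leaf, fold in the proof hashes according to the leaf index, and compare the result against the root hash in the signed checkpoint (whose signature must also be verified). A minimal sketch of that computation, assuming RFC 6962/RFC 9162-style SHA-256 hashing with 0x00/0x01 domain-separation prefixes; the function names are illustrative, not any particular client's API:

    import hashlib

    def leaf_hash(entry: bytes) -> bytes:
        # RFC 6962 leaf hash: H(0x00 || entry)
        return hashlib.sha256(b"\x00" + entry).digest()

    def node_hash(left: bytes, right: bytes) -> bytes:
        # RFC 6962 interior-node hash: H(0x01 || left || right)
        return hashlib.sha256(b"\x01" + left + right).digest()

    def verify_inclusion(entry: bytes, index: int, tree_size: int,
                         proof: list, root: bytes) -> bool:
        # Recompute the root from the leaf and audit path; no log lookup needed.
        if index >= tree_size:
            return False
        fn, sn = index, tree_size - 1
        r = leaf_hash(entry)
        for p in proof:
            if sn == 0:
                return False  # proof is longer than the path to the root
            if fn & 1 or fn == sn:
                r = node_hash(p, r)
                if not fn & 1:
                    while not fn & 1 and fn != 0:
                        fn >>= 1
                        sn >>= 1
            else:
                r = node_hash(r, p)
            fn >>= 1
            sn >>= 1
        return sn == 0 and r == root

In practice the root hash and tree size come from the checkpoint bundled alongside the proof.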

Unfortunately, clients can choose to ignore the inclusion proof and verify only the SET. This might be because verifying an SET is simply a digital signature check. Additionally, support for verifying inclusion proofs differs by client language.

Note that either a signed inclusion timestamp (the SET) or an RFC 3161 signed timestamp is needed to verify short-lived Fulcio certificates, so we cannot get rid of the SET unless a verifiable timestamp is required instead.

Plan

  1. Update the protobuf bundle specification to require an inclusion proof.
    a. This is a breaking change for clients, but since Rekor already returns an inclusion proof, it should be straightforward to require the proof in newly generated bundles.
  2. Clients will need to handle bundles without an inclusion proof, either by being permissive for some time or by dropping support for bundles without inclusion proofs (see the sketch after this list).
  3. Time will still come from the SET unless an RFC 3161 signed timestamp is provided.
    a. Note that time should not come from the checkpoint. Clients may choose to bundle a stable checkpoint (where the checkpoint is updated periodically rather than on each entry upload), and the checkpoint's timestamp could fall outside the validity window of a short-lived certificate.
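
As a rough illustration of item 2, a client's verification policy during the migration window might look like the sketch below. The type and function names are hypothetical, not the actual protobuf-specs or client APIs; the point is only the ordering of checks: prefer the inclusion proof, optionally fall back to the SET for older bundles, otherwise reject.

    from dataclasses import dataclass
    from typing import Callable, Optional

    @dataclass
    class TlogEntry:                          # hypothetical stand-in for a bundle's log entry
        inclusion_proof: Optional[object]     # proof hashes + checkpoint, when present
        inclusion_promise: Optional[object]   # the SET, when present

    def verify_tlog_entry(entry: TlogEntry,
                          verify_proof: Callable[[object], None],
                          verify_set: Callable[[object], None],
                          permissive: bool = True) -> None:
        # Prefer the inclusion proof; it is mandatory for newly generated bundles.
        if entry.inclusion_proof is not None:
            verify_proof(entry.inclusion_proof)
            return
        # During the deprecation window, optionally accept SET-only bundles.
        if permissive and entry.inclusion_promise is not None:
            verify_set(entry.inclusion_promise)
            return
        raise ValueError("bundle has no inclusion proof; refusing to verify")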

In a future V2 iteration of Rekor, we will drop the SET entirely and only return inclusion proofs. This has a few consequences:

  • It will be required that clients provide signed timestamps rather than trust Rekor's timestamps. This is already the direction that we wanted to go since Rekor's internal clock is not externally verifiable.
  • Rekor will not be able to batch process a large number of entries since it will block until the entry is included in the log.

Additional Motivation

Privacy

There are a couple ways that a client can reveal to the log what they're interested in:

  • Requesting an inclusion proof, which directly reveals what entry the client is interested in
  • Requesting a consistency proof. With the current design of Rekor there is effectively a 1:1 mapping between an entry and the latest checkpoint, so there is no anonymity when a client requests a consistency proof: even if they already hold an inclusion proof, they reveal the checkpoint (or log size) associated with it.

SETs provide privacy since verification is entirely offline, at the cost of only being a promise. Inclusion proofs can also be verified offline, but on their own they provide no protection against split-view attacks; detecting a split view requires consensus from a set of witnesses.

Distributing the inclusion proof rather than the SET gives the Believer (as defined in the Claimant Model) anonymity from the log, since an inclusion proof does not have to be requested.

To mitigate the privacy concern with consistency proofs, clients should only request consistency proofs between stable checkpoints. These checkpoints batch a set of entries so that a Believer does not reveal a specific entry they're interested in. This provides k-anonymity.

[Diagram: checkpoint-verify-kanon — k-anonymity when verifying against stable checkpoints]
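
Mechanically, checking that two stable checkpoints are consistent is the standard RFC 6962/RFC 9162 consistency-proof computation: recompute both the old and new root hashes from the proof and compare them against the two checkpoints. A minimal sketch (illustrative only; real clients would use their library's implementation and would also verify the checkpoint signatures):

    import hashlib

    def node_hash(left: bytes, right: bytes) -> bytes:
        # RFC 6962 interior-node hash: H(0x01 || left || right)
        return hashlib.sha256(b"\x01" + left + right).digest()

    def verify_consistency(old_size: int, new_size: int,
                           old_root: bytes, new_root: bytes,
                           proof: list) -> bool:
        # Check that the tree of new_size is an append-only extension of the
        # tree of old_size, following the RFC 9162 algorithm.
        if old_size == new_size:
            return old_root == new_root and not proof
        if old_size == 0 or old_size > new_size or not proof:
            return False  # the empty-tree case is not handled in this sketch
        path = list(proof)
        if (old_size & (old_size - 1)) == 0:  # old size is an exact power of two
            path = [old_root] + path
        fn, sn = old_size - 1, new_size - 1
        while fn & 1:
            fn >>= 1
            sn >>= 1
        fr = sr = path[0]
        for c in path[1:]:
            if sn == 0:
                return False
            if fn & 1 or fn == sn:
                fr = node_hash(c, fr)
                sr = node_hash(c, sr)
                if not fn & 1:
                    while not fn & 1 and fn != 0:
                        fn >>= 1
                        sn >>= 1
            else:
                sr = node_hash(sr, c)
            fn >>= 1
            sn >>= 1
        return fr == old_root and sr == new_root and sn == 0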

Stronger Offline Verification

Verifying only an SET provides no guarantee that the log includes the entry. Verifying an inclusion proof is stronger, but there is no protection against split-view attacks.

One approach would be to bundle an inclusion proof with a set of witnessed checkpoints. This could be implemented in various ways. A client could wait until witnesses have processed the last stable checkpoint. The tradeoff is that clients must wait O(minutes) before distributing their signed artifact. Another option would be to rely on artifact distributors (package repositories like npm, PyPI, Maven) to verify consistency and bundle witnessed checkpoints.

A follow-up document will go into more details on stronger offline verification.

haydentherapper added the enhancement (New feature or request) label on Jun 26, 2023
@haydentherapper (Contributor, Author) commented Jun 26, 2023

@woodruffw @FiloSottile @kommendorkapten @bdehamer @AlCutter @bobcallaway @znewman01: I'd appreciate your feedback on this proposal, and please tag anyone else you think might be interested in the topic.

@woodruffw (Member)

Thanks for the tag! I think this is a great idea, and it resolves a big chunk of ambiguity on the client implementation side (i.e. "why do we verify both the SET and the inclusion proof, and what's the difference between verifying the two in {online,offline} contexts?").

As a clarifying point: when we're talking about verifying the inclusion proof, we're talking about both the Merkle inclusion proof and the associated checkpoint signature, right? My understanding is that both are necessary to produce a proof that's as convincing (in an offline setting) as the SET, since without the checkpoint signature the Merkle proof is "just" a non-cryptographic self-consistency proof.

This is a breaking change for clients, but since Rekor already returns an inclusion proof, it should be straightforward to require the proof in newly generated bundles.

Agreed -- sigstore-python is already including the inclusion proof in its generated bundles, as well as opportunistically using any bundle-included proof. It shouldn't be challenging for other clients to do the same.

or dropping support for bundles without inclusion proofs.

This would be my personal preference, but I recognize that might cause more compatibility heartburn than is worth it 🙂.

Another possible option here is to add some normative language saying that bundles without inclusion proofs MUST be handled by doing an online inclusion proof lookup, with an additional failure mode if the user explicitly requests offline verification. That would minimize breakage except for in cases where a user explicitly asks for something unsupported, which seems pretty reasonable to me.

@haydentherapper (Contributor, Author) commented Jun 26, 2023

As a clarifying point: when we're talking about verifying the inclusion proof, we're talking about both the Merkle inclusion proof and the associated checkpoint signature, right? My understanding is that both are necessary to produce a proof that's as convincing (in an offline setting) as the SET, since without the checkpoint signature the Merkle proof is "just" a non-cryptographic self-consistency proof.

Correct, the inclusion proof must be accompanied by a checkpoint. The bundle format requires this.
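
For reference, the checkpoint is a small signed text note; a minimal sketch of pulling the root hash out of one, assuming the transparency-dev checkpoint format (an origin line, a decimal tree size, and a base64 root hash, with signatures after a blank line):

    import base64

    def parse_checkpoint_body(body: str):
        # Assumed checkpoint body layout: origin, decimal tree size,
        # base64-encoded root hash; extension lines may follow.
        lines = body.splitlines()
        origin = lines[0]
        tree_size = int(lines[1])
        root_hash = base64.b64decode(lines[2])
        return origin, tree_size, root_hash

The root hash recovered this way is what the Merkle inclusion-proof computation must reproduce, and the log's signature over the body is what makes the proof convincing offline.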

For some extra info, one thing I glossed over at the end of this issue is whether this checkpoint is the "latest" checkpoint (created immediately after entry inclusion) or a "stable" checkpoint (periodic checkpoint that is co-signed by witnesses). I discussed that more in a doc on witnessing.

This would be my personal preference, but I recognize that might cause more compatibility heartburn than is worth it 🙂.

Let's see what sigstore-js and sigstore-java are doing currently. I think sigstore-js is storing the proof too (https://github.com/sigstore/sigstore-js/blob/0f9915027ea1d1c0d05ffd4948d8f01a1f3f2c99/packages/client/src/types/sigstore/serialized.ts#L33-L40) - Is this correct @bdehamer? Java also looks to be storing it - https://github.com/sigstore/sigstore-java/blob/8f4794c6cfc8571d16243f393e343a25cccd7470/sigstore-java/src/main/java/dev/sigstore/bundle/BundleFactoryInternal.java#L129

Another possible option here is to add some normative language saying that bundles without inclusion proofs MUST be handled by doing an online inclusion proof lookup, with an additional failure mode if the user explicitly requests offline verification. That would minimize breakage except for in cases where a user explicitly asks for something unsupported, which seems pretty reasonable to me.

I think this would be a reasonable trade-off, though one downside is the privacy aspect of revealing entries of interest to the log operator.

@kommendorkapten (Member)

Great to see progress here, nice work @haydentherapper!

For some extra info, one thing I glossed over at the end of this issue is whether this checkpoint is the "latest" checkpoint (created immediately after entry inclusion) or a "stable" checkpoint (periodic checkpoint that is co-signed by witnesses).

Yes, this has to be clearly defined. Based on earlier discussions, I believe that a periodic checkpoint would be preferable, as it makes it simpler for monitors to gossip about the state. And this makes me wonder: if the checkpoint is signed at time T, and the inclusion proof is made at time T + 1, it's not offline verifiable, right? Or is the idea that the checkpoint will be created and signed at time T + 1 (i.e. the time of inclusion in the log), but Rekor would "publish" stable checkpoints periodically for consumption by monitors, to make it easy to gossip (that is, all inclusions generate a checkpoint, but monitors only verify using the "published" ones)? If so, was there an idea on how this "publish" mechanism would happen? (I haven't read the doc on witnessing yet.)

Let's see what sigstore-js ... Is this correct @bdehamer

Currently, no:

$  curl -s https://registry.npmjs.org/-/npm/v1/attestations/sigstore@1.6.0 | jq | grep inclusionProof
              "inclusionProof": null,
              "inclusionProof": null,

Canonicalization

Something that I don't think this issue addresses is the problem we are facing with canonicalization of entries. The entry added to the Merkle tree is the JSON-canonicalized representation, which has proven hard to implement (non-deterministic for certain types); that's why the bundle contains the canonicalized entry (which we want to get away from). See this doc and this issue for earlier discussions.

If we perform a big change like this, I would suggest also making sure that we improve/simplify the leaf creation prior to inserting it into the log so it's deterministic.
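
As a small illustration of why this matters (a toy sketch, not Rekor's actual canonicalization code): the leaf is a hash over the serialized entry bytes, so two clients that serialize the "same" logical entry even slightly differently will compute different leaf hashes and fail offline inclusion verification.

    import hashlib
    import json

    entry = {"apiVersion": "0.0.1", "kind": "hashedrekord", "spec": {"data": "abc"}}

    # Two byte-level serializations of the same logical entry.
    compact = json.dumps(entry, separators=(",", ":"), sort_keys=True).encode()
    spaced = json.dumps(entry, sort_keys=True).encode()  # default separators add spaces

    def leaf_hash(body: bytes) -> bytes:
        # RFC 6962 leaf hash over the serialized entry bytes
        return hashlib.sha256(b"\x00" + body).digest()

    print(leaf_hash(compact) == leaf_hash(spaced))  # False: different bytes, different leaf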

@znewman01

Overall, this change SGTM.


My main concern is that it should be compatible with future protections against split-view attacks. Sounds like you're keeping that in mind! But maybe fleshing out the "stronger witness verification" plan and how it interacts with this change would assuage some of the concerns. No need for a full doc at this point, just:

  1. argue that the inclusion proof in the bundle is forwards-compatible with witness verification
  2. describe client workflows to check against an old signed head, update the signed head, and check against the current signed head with witness verification

Also make sure we update the Client Spec (which it sounds like you're already thinking about).


Sounds like we're figuring out the migration story. I think I'd like to see a table for each implementation indicating what they're producing and what they're consuming at the moment.


Finally, would we ever need to go back to SETs? There are real performance/consistency reasons that CT chooses to use SCTs and have a long merge time.

@bdehamer

I'll get sigstore-js updated before our next release so that we start producing bundles with the inclusion proof (I think I omitted this only cause it was optional in the protobuf spec and we hadn't yet implemented verification of the proof).

@haydentherapper (Contributor, Author) commented Jun 29, 2023

was there an idea on how this "publish" mechanism would happen

We now publish a stable/periodic checkpoint, which is produced every 5 minutes over the current size of the tree. You can access it by adding stable=true to rekor.sigstore.dev/api/v1/log.
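
For example, something like the following should fetch it (assuming the flag is passed as a query parameter, as described above):

    import urllib.request

    # Fetch the periodically published ("stable") checkpoint info rather than
    # the latest per-entry one.
    url = "https://rekor.sigstore.dev/api/v1/log?stable=true"
    with urllib.request.urlopen(url) as resp:
        print(resp.read().decode())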

if the checkpoint is signed at time T, and the inclusion proof is made at time T + 1, it's not offline verifiable, right
argue that the inclusion proof in the bundle is forwards-compatible with witness verification
describe client workflows to check against an old signed head, update the signed head, and check against the current signed head with witness verification

With the end goal for this work being detecting split-view attacks AND doing this offline, there are a few paths we can take:

  • Signers are responsible for gathering the co-signed checkpoints, which means signers would have to wait until a) a periodic checkpoint is published, and b) witnesses co-sign and distribute checkpoints. The inclusion proof would need to be computed at the tree size of the periodic checkpoint (Trillian has an API for this; we just don't expose it in Rekor). Otherwise, like you said @kommendorkapten, you'd also need a consistency proof between the checkpoint at T and the inclusion proof at T+1.
    • Note that you don't need to include any consistency proofs alongside the witnessed checkpoints because we trust witnesses to have computed consistency proofs.
  • Signers only publish an inclusion proof and checkpoint (the latest, not the periodic one), and we make package repositories responsible for detecting split-view attacks. There must be some set of entities in an ecosystem that monitor for split-view attacks, but it's not a requirement that every consumer of a bundle do the same calculation. See the witnessing doc for more info. This approach has a lot of benefits:
    • What happens if the signer's set of trusted witnesses differs from the consumer's set of trusted witnesses? This is not a concern if the witnessed checkpoints aren't distributed alongside the inclusion proof.
    • Privacy - If a consumer ever has to query a log for an inclusion proof, they reveal the entry of interest to the log. If they ever use a non-periodic checkpoint, they also reveal the entry (or maybe ~5 entries) of interest, because there's more or less a 1-1 mapping. Even if they use the periodic checkpoint, that's k-anonymity at best (the diagram above). And if not the log, then they reveal entries of interest to the witnesses. Package repositories querying the log and witnesses/distributors effectively act as an anonymizing proxy.
    • Package repos are well-poised to alert the ecosystem if a split-view attack happens.
    • Package repos can tier the verification status: no proof is untrusted, an inclusion proof from the latest checkpoint is tier 1, and a proof over a witnessed checkpoint is tier 2 (see the sketch after this list).
    • Package repos could also distribute a bundle with included witnessed checkpoints, but I haven't thought through if this is really needed. It helps to minimize trust in the repos to prove the repos actually are verifying consistency.
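
A rough sketch of what that tiering could look like on the repository side (the field and function names here are entirely hypothetical, and actual signature verification is elided):

    from enum import Enum

    class Tier(Enum):
        UNTRUSTED = 0  # no inclusion proof at all
        TIER_1 = 1     # inclusion proof against the latest (unwitnessed) checkpoint
        TIER_2 = 2     # inclusion proof against a witnessed stable checkpoint

    def classify(bundle, trusted_witnesses: set, threshold: int) -> Tier:
        # Hypothetical bundle shape: bundle.inclusion_proof carries the proof and
        # its checkpoint, and checkpoint.witness_signatures names the witnesses
        # whose co-signatures have already been cryptographically verified.
        proof = getattr(bundle, "inclusion_proof", None)
        if proof is None:
            return Tier.UNTRUSTED
        cosigners = set(getattr(proof.checkpoint, "witness_signatures", [])) & trusted_witnesses
        return Tier.TIER_2 if len(cosigners) >= threshold else Tier.TIER_1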

I am quite bullish on the second approach, if that's not evident. :) The first approach is nice as a proof of concept, though; it'll likely be what I implement first. Hopefully that answers your questions @znewman01 and @kommendorkapten. Just to summarize:

  • In all cases, inclusion proofs/checkpoints are a part of the bundle.
  • Consistency proofs are not a part of the bundle, but witnessed checkpoints would be.

Also make sure we update the Client Spec (which it sounds like you're already thinking about).

Ack, that'll be the next thing to do if we're in agreement on making inclusion proofs mandatory.

Sounds like we're figuring out the migration story. I think I'd like to see a table for each implementation indicating what they're producing and what they're consuming at the moment.

Action item for myself: I'll create this and confirm it with -js/-python/-java.

Finally, would we ever need to go back to SETs? There are real performance/consistency reasons that CT chooses to use SCTs and have a long merge time.

It's a good question. In practice, most logs are admitting entries into the log in seconds [citation needed, it was in a paper I can't recall]. Sigsum has taken the stance of never using promises and simply saying that signers must wait until an entry is admitted. So far, Rekor has not had issues with the approach of synchronous upload. It'd be a good exercise to do some load testing though.

I'll get sigstore-js updated before our next release so that we start producing bundles with the inclusion proof (I think I omitted this only cause it was optional in the protobuf spec and we hadn't yet implemented verification of the proof).

Thank you @bdehamer!

@mhutchinson

Just wanted to swing by and give a big +1 for moving away from "inclusion promises" to "inclusion proofs"! Hayden asked me to write up some of the background on SCTs, which I did in this work-in-progress documentation for the Claimant Model in trillian#2980. Hopefully it provides a little more background here. Consider that CT was one of the first deployments of verifiable logs in an issuance workflow, and we've been running these logs for over 10 years now. The reasoning behind design choices from back then is valid to consider now, but we've learned a lot and improved a lot over that decade, and there now exist log operators that are willing to provide a stricter SLA for integration.

@AlCutter

Finally, would we ever need to go back to SETs? There are real performance/consistency reasons that CT chooses to use SCTs and have a long merge time.

Fun fact: The original design for CT did not include SCTs; you were asked to submit the entry to the log, and you'd receive either a checkpoint+inclusion proof, or a "come back in a bit and try again" response.

SCTs were later added as a concession to (some) CAs who argued that any potential delay to their issuance pipeline was a threat to their business - back then, at least, the issuance pipeline was quite strongly serialised in those CAs, so adding, say, 15s to each issuance reduced the number of certs they could sell in a day.

In the original CT log implementation(s) it was common (although also a fairly arbitrary choice) for logs to integrate and publish on an hourly cadence. There are definitely performance gains to be had from batching tree integrations, but I'm pretty sure you'll get "good enough" performance with a much shorter batching interval - this is essentially what Trillian-based CT logs are doing nowadays with batches of O(seconds) and CA issuance output at around 70 new entries/s on average.

Here endeth the history lecture :)

haydentherapper added a commit to haydentherapper/protobuf-specs that referenced this issue Jun 30, 2023
The log always generates inclusion proofs, so we will make it a
requirement that clients verify the proof. Promises will be deprecated
over time, but for now, we'll make them optional.

Fixes sigstore#82
Ref sigstore/rekor#1566

Signed-off-by: Hayden Blauzvern <hblauzvern@google.com>
kommendorkapten pushed a commit to sigstore/protobuf-specs that referenced this issue Jul 3, 2023

* Require inclusion proofs, make promises optional
* Bump version
* Update client verification requirements for promises

Signed-off-by: Hayden Blauzvern <hblauzvern@google.com>
@haydentherapper (Contributor, Author)

Closing this issue, as the client specs have been updated to mandate inclusion proofs.

One change I've made is that I've stopped referring to SETs as "promises". They're very similar, but the differences are that a) Sigstore's ecosystem still requires SETs for verifying short-lived certificates in lieu of signed timestamps, and b) they're signed over a log index, meaning the log must have already added the entry. More accurately, an SET is a signed commitment from the log, but without a proof.

@haydentherapper (Contributor, Author) commented Feb 16, 2024

For posterity's sake, mindersec/minder#2120 (comment) also documents the differences between SETs and proofs in a concise way.
