Add TAP for keyid flexibility #112

Merged (10 commits, Apr 21, 2020)

candidate-keyid-tap.md: 275 additions & 0 deletions
* TAP: TBD
* Title: Improving keyid flexibility
* Version: 1.0.0
* Last-Modified: 30-03-2020
* Author: Marina Moore
* Status: Draft
* Content-Type: markdown
* Created: 18-03-2020
* TUF-Version: 1.1.0
* Post-History: 25-03-2020

# Abstract

Keyids are used in TUF metadata as shorthand references to identify keys. They
are used in place of keys in metadata to assign keys to roles and to identify
them in signature headers. The TUF specification requires that every keyid used
in TUF metadata be calculated using a SHA2-256 hash of the public key it
represents. This algorithm is used elsewhere in the TUF specification and so
provides an existing method for calculating unique keyids. Yet, such a rigid
requirement does not allow for the deprecation of SHA2-256. A security flaw in
SHA2-256 may be discovered, so TUF implementers may choose to deprecate this
algorithm. If SHA2-256 is deprecated in TUF, it should no longer be used to
calculate keyids. Therefore TUF should allow more flexibility in how keyids are
determined. To this end, this TAP proposes a change to the TUF specification
that would remove the requirement that all keyids be calculated using SHA2-256.
Instead, the specification will allow metadata owners to use any method for
calculating keyids as long as each one is unique within the metadata file in
which it is defined to ensure a fast lookup of trusted signing keys. This
change will allow for the deprecation of SHA2-256 and will give metadata owners
flexibility in how they determine keyids.


# Motivation

Currently, the TUF specification requires that keyids be the SHA2-256 hash
of the public key they represent. This algorithm ensures that keyids are unique
within a metadata file (and indeed, throughout the implementation) and creates a
short, space-saving representation. SHA2-256 also offers a number of secure
hashing properties, though these are not necessary for this purpose. In this
case SHA2-256 is simply a way to calculate a unique identifier employing an
algorithm that is already in use by the system.

The specification sets the following requirements for keyid calculation:
1. "The KEYID of a key is the hexdigest of the SHA-256 hash of the canonical JSON form of the key."
2. "Clients MUST calculate each KEYID to verify this is correct for the associated key."
3. "Clients MUST ensure that for any KEYID...only one unique key has that KEYID."
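
For concreteness, the current rule can be sketched as follows. This is an
illustrative sketch, not the reference implementation: the example key is
hypothetical, and canonical JSON is approximated here with sorted keys and no
whitespace (a real implementation would use a proper canonical JSON encoder).

```python
import hashlib
import json

def compute_keyid(public_key: dict) -> str:
    """Hex digest of the SHA-256 hash of the (approximated) canonical
    JSON form of the public key, per the current specification wording."""
    canonical = json.dumps(public_key, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Hypothetical ed25519 key in a TUF-style key format.
key = {
    "keytype": "ed25519",
    "scheme": "ed25519",
    "keyval": {"public": "edcd0a32a07dce33f7c7873aaffbff36d20ea307"},
}
keyid = compute_keyid(key)  # 64 hex characters, stable across dict orderings
```

Because the encoding sorts keys, any client can recompute the same keyid from
the key's contents, which is what makes requirement 2 checkable.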

## Problems with this requirement
Mandating that keyids be calculated using SHA2-256 has created a number of issues
for some implementations, such as:
* Lack of consistency in implementations that use other hash algorithms for
calculating file hashes and would prefer not to introduce SHA2-256 for this one
instance. For example, the PEP 458 implementation (https://python.zulipchat.com/#narrow/stream/223926-pep458-implementation)
will use the BLAKE2 hashing algorithm throughout the implementation.

> **Review comment (Member):** Minor nit: I'd either point directly to that
> conversation https://python.zulipchat.com/#narrow/stream/223926-pep458-implementation/topic/Timeline/near/188666946
> (given that the link is permanent-ish, which I don't know), or leave the link
> out altogether.

* Incompatibility with some smart cards and PGP implementations that have their
own way of calculating keyids.
* Inability to adapt if SHA2-256 should be deprecated. In such a case, metadata
owners may decide that maintaining a deprecated algorithm for use in keyid
calculation does not make sense.
* Space constraints that call for identifiers even shorter than a SHA2-256
hash, such as a simple index.

In these and other cases, TUF should provide a metadata file owner with the
flexibility to use keyids that are not calculated using SHA2-256.

# Rationale

TUF uses keyids as shorthand references to identify which keys are trusted to
sign each metadata file. As they eliminate the need to list the full key every
time, they take up less space in metadata signatures than the actual signing
key, reducing bandwidth usage and download times.

The most important quality of keyids used in TUF is their uniqueness. To be
effective identifiers, all keyids defined within a metadata file must be unique.
For example, a root file that delegates trust to root, snapshot, timestamp, and
top-level targets should provide unique keyids for each key trusted to sign
metadata for these roles. By doing so, a client may check metadata signatures
in O(1) time by looking up the proper key for verification.
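
The O(1) lookup can be sketched as follows; the metadata layout here (a flat
keyid-to-key dictionary plus per-role keyid lists) and the keyids themselves
are simplifying assumptions, not the exact TUF wire format.

```python
# Keys and role assignments as a delegating (e.g. root) metadata file
# might define them. Both structures come from signed metadata.
trusted_keys = {
    "keyid-root-1": {"keytype": "ed25519", "keyval": {"public": "aa..."}},
    "keyid-snap-1": {"keytype": "ed25519", "keyval": {"public": "bb..."}},
}
role_keyids = {"root": ["keyid-root-1"], "snapshot": ["keyid-snap-1"]}

def candidate_key(role: str, signature: dict):
    """Return the key trusted for `role` matching this signature's
    keyid, or None if the keyid is not authorized for the role."""
    keyid = signature["keyid"]
    if keyid not in role_keyids[role]:
        return None
    return trusted_keys[keyid]  # O(1) dictionary lookup, no key scan
```

A client would then run the actual cryptographic verification only against the
returned key, rather than trying every trusted key in turn.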

Failing to provide unique keyids can have consequences for both functionality
and security. These are a few attacks that are possible when keyids are not unique:
* **Invalid signature verification**: A client may look up the wrong key to use
in signature verification, leading to an invalid-signature error even if the
signature was made with the correct key.
* **Keyid collision**: If root metadata listed the same keyid K for different
snapshot and root keys, an attacker with access to the snapshot key would also
be able to sign valid root metadata. Using the snapshot key to sign root
metadata, the attacker could then list the signature in the header with K. A
client verifying the signature of this root metadata file would use K to
look up a key trusted to sign root, would find the snapshot key, and would
continue the update with the malicious root metadata. To prevent this
privilege escalation attack, metadata file owners should ensure that
every keyid is associated with a single key in each metadata file.

One attack that does not need to be considered is a hash collision. Though an
attacker who can find a collision against the function used to calculate a
keyid could identify another key that hashes to the same keyid, the client
will only use keys that are listed in the metadata. The attacker cannot place
a malicious key into the metadata without the metadata signing key, so a hash
collision cannot be used to maliciously sign files.

# Specification

With just a few minor changes to the current TUF specification, we can
remove the requirement that keyids be calculated using SHA2-256. First, the
specification wording should be updated to allow the metadata owner to calculate
keyids using any method that produces a unique identifier within the metadata
file. This means replacing requirements 1 and 2 above with a description of
required keyid properties, i.e. “The KEYID is an identifier for the key that is
determined by the metadata owner and MUST be unique within the root or
delegating targets metadata file.” Once this keyid is determined by the metadata
> **Review comment:**
> > The KEYID is an identifier for the key that is determined by the metadata owner and MUST be unique within the root or delegating targets metadata file.
>
> Now that there wouldn't be a cryptographically derived keyid for each key, it's now possible to list the same key multiple times, each with a different keyid. Can we instead impose another rule that the key must also be unique? Without this rule, an attacker could duplicate a signature to reduce the effective threshold of keys needed to validate the metadata. We do this in go-tuf and rust-tuf to avoid double counting keys due to our (temporary) support of python-tuf's keyid_hash_algorithms field.

> **Review comment (Member):** This is a good point. If there are multiple delegations to the same KEYID as part of a threshold, is this allowed? If I delegate to both Alice and Bob, who both delegate to Charlie, is Charlie's approval enough? (I think I know the answer in both cases and this is mostly tangential to this issue, but I think we should discuss whether TUF is doing the right thing.)

> **Review comment (Contributor Author):** I added a description of this in the scenarios below. As currently written, in this situation Charlie's approval would be enough. I think that this should be sufficient to trust the metadata, as a threshold of signatures using unique keys must be reached to obtain Charlie's approval.

> **Review comment (Member):** I have less practical experience with tuf than the other folks in this discussion, but I would be surprised if Charlie's approval were enough. My understanding of the specification is that thresholds are a defence against key compromise. In the scenario described, only a single key has to be compromised in order to meet a threshold that is > 1.

> **Review comment (Member):** This sounds like the "Deadly Diamond of TUF" 😜
>
> Jokes aside, I also think that Charlie's approval is enough. But in reality it required Justin's, Alice's and Bob's approval to even end up in a situation where it only requires Charlie's approval.
>
> I agree that it doesn't sound ideal. So either:
>
> * Justin reconsiders delegating trust to Alice and Bob, seeing that they void the threshold,
> * or he at least does not allow them to further delegate trust (IIUC, this is what the "terminating" flag in delegations is for).
> * Alternatively, we could add a feature to the TUF specification to disable such signature threshold regression in delegation graphs. But I think the above tools are enough.
>
> Do others agree? @JustinCappos, you really got me curious about your answers to these cases.

> **Review comment (Member):**
> > requirement: Justin requires two roles in agreement about a given target
> > branch search a): Justin asks Alice, Alice asks Charlie, Charlie has target, back up to Justin
> > branch search b): Justin asks Bob, Bob asks Charlie, Charlie has target, back up to Justin
> > success: Justin has two roles in agreement and releases the target
> >
> > Shouldn't it be enough to, along with Charlie's knowledge about the target, also pass up the public key and let Justin only release the target if two roles with different keys were in agreement?
>
> The current algorithm returns with the first match. So suppose that Alice (or Bob) also delegated trust to Daniela, who also approved the target. You should approve it in this case, but the current algorithm would not do so. You'd need to match all parties that agree. Then you'd need to be sure that Alice and Bob each have at least one person that was not yet chosen whom they could select to fulfill this need. Also, what if Alice and Bob have threshold delegations to Charlie and Daniela in this case? It's just a lot messier to deal with, in my view.

> **Review comment (Contributor Author):** I think that this issue pertains more to the existing delegation resolution algorithm than to the keyid calculation, so I propose we open a new issue to discuss it further. As @lukpueh mentioned, TAP 3 is not currently included in the specification, so this issue does not exist with the current specification.

> **Review comment (Contributor Author):** On a closer look at this issue, it appears that the visited flag ensures that Charlie's role will not be visited more than once in the DFS as described in 4.4.1 of the client workflow. Does this solve the issue @JustinCappos @lukpueh?

> **Review comment (Member):** I don't believe this solves it. I don't think it would handle the case above with Daniela, for example.
>
> I do think it would be fine to move this elsewhere, but the most important question is what the system should actually do. We could code it to work however we like, but we should strive to do what a user would expect first and foremost.

> **Review comment (Contributor Author):** Ok, I created a new issue to continue this discussion.

> **Review comment (Member):**
> > within the root or delegating targets metadata file
>
> Just for clarity, can we generalize this early on?
>
> A role that authorizes the signing keys (by keyid) for another role and lists its public keys == a role delegating trust to another role == a delegating role == root or delegating targets role (including top-level targets).
>
> You do generalize it in the next line ...
>
> > will be listed in the delegating metadata file
>
> ... where, I think, you are referring to root and delegating targets, right?
>
> Anyway, it's not a big thing. I just want to make sure everyone is on the same page.

owner using their chosen method, it will be listed in the delegating metadata
file and in all signatures that use the corresponding key. When parsing metadata
signatures, the client would use the keyid(s) listed in the signature header to
find the key(s) that are trusted for that role in the delegating metadata. This
should be described in the specification by replacing requirement 3 above with
“Clients MUST use the keyids from the delegating role to look up trusted signing
keys to verify signatures.” To ensure that clients do not use the same key
to verify a signature more than once, they must additionally check that all keys
applied to a signature threshold are unique. So the specification should
additionally require that “Clients MUST use each key only once during a given
signature verification.” All metadata definitions would remain the same, but
> **Review comment (Member):** So @mnm678 and I discussed this quite a bit, and I seem to remember that we agreed that a client-side key or keyid uniqueness check within a delegation only prevents scenarios where the repository owner already screwed up, i.e. uses the same keyid for different keys, or different keyids for one key.
>
> Or am I missing something?
>
> If not, I wonder if we really want to make client verification more complex because the client can't trust repositories to do their job right. I mean, the client also doesn't check for faux pas like the delegation scenario Justin described above, or for a repository using the same key to sign root and timestamp.

> **Review comment (Contributor Author):** There are a couple of different uniqueness requirements that the keyid/key system needs. I agree that where possible we want to simplify the client workflow by trusting information found in signed metadata files. The situations in which uniqueness is needed include:
>
> * Keyids provide a mapping that can be used to determine which key is trusted. As @lukpueh and I discussed in the above link, this mapping is provided in signed metadata, and so can be trusted by the client without additional verification.
> * Duplicate keys cannot be applied to the same threshold. If an attacker is able to append a duplicate signature, the client should only apply this signature to the threshold once. In this case, I think that the client can do an O(1) check (e.g. by adding each key to a set) to ensure uniqueness. Previously this was done by ensuring the keyid was only used once, but using the key instead seems like a less error-prone way to make this check (and it solves some current issues).
> * In a multi-role delegation, each role's approval is only used once. This is the situation discussed in the comments above. Like the keyid/key mapping, this situation can only come up if the delegation tree allows two roles to delegate to the same role. Afaik there is no way for an attacker without a threshold of keys to create this situation, so it should be left up to the repository manager to occasionally audit the delegation tree to ensure it makes sense.

> **Review comment (Member):** Thanks for the extra clarification, @mnm678! This makes sense to me. Regarding the third item, I left some thoughts in the discussion above.

> **Review comment (Member):** How would we keep track of duplicate public keys? I mean, what if there are the same public keys, but listed in different formats? (I know we use only PEM right now, but humor me.)

> **Review comment (Contributor Author):** I would guess that if multiple formats are supported, they would need to be converted to a consistent format for signature verification, so the uniqueness check would take place after any conversion.
>
> Are there any use cases in which multiple formats would be used but could not be converted? In that case we might need to have some more discussion.

> **Review comment (Member):** But once again (and thanks for pointing this out, @SantiagoTorres), if the attacker does not control the metadata that lists the public keys, then they cannot make different representations of the same key each count towards the threshold. And if the attacker does control the metadata, they can add any key they want and uniqueness checks on the client are moot.
>
> This means that we really only need to de-duplicate on the one key representation in the delegating metadata. Unless, and this brings me back to my initial comment above, we don't trust the delegating role.
>
> Sorry for adding to the confusion with my comment above, I was briefly confused too. Maybe we should re-hash the threat model for this document.

> **Review comment (Member):** Yes, I was more interested in someone (e.g., the delegating role) trying to deliberately fool python-tuf into accepting what are actually duplicate keys by using different public key formats which describe the same public key parameters.

> **Review comment (Member):** Right, but this delegating role could also just change the threshold? My impression is that this is necessary because we want to avoid footguns, but it's not a very exploitable attack vector.

> **Review comment (Member):** Sure, but it's still something to think about, especially if the client-side checks are not expensive.

> **Review comment (Contributor Author):** The unique key check already exists to ensure that an attacker cannot append duplicate signatures (although this requirement is not explicitly stated in the specification). The main difference here is requiring that the client do this uniqueness check on the key instead of the keyid. If all goes correctly, using the key does not make any difference, but if the metadata accidentally has duplicate keys with different keyids, using keyids to verify uniqueness could lead to a key being applied to a threshold more than once.

the client’s verification process would track keyids within each metadata file
instead of globally.

In order for TUF clients to adhere to these specification changes, they may have
to change the way they store and process keyids. Clients will use the keyids
from a metadata file only for the delegations defined in that metadata file. So
if a targets metadata file T delegates to A and B, the client should verify the
signatures of A and B using the trusted keyids from T. When verifying
signatures, clients should try all signatures that match their trusted keyid(s):
if T trusts keyid K to sign A’s metadata, the client should check all
signatures in A that list a keyid of K. This means that if another metadata file
M delegates to A, it may use the same keyid with a different key.
However, clients must ensure that duplicate keys are not applied to the same
signature threshold. To do so, they must additionally keep track of the keys
used to verify each signature. Once the signatures for A and B have been checked,
the client no longer needs to store the keyid mapping listed in T. During the
preorder depth-first search of targets metadata, the keyids from each targets
metadata file should be used only in that stage of the depth-first search.
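
A minimal sketch of this per-file scoping and unique-key bookkeeping, under an
assumed simplified metadata layout (the `keys`, `delegations`, and
`signatures` fields are stand-ins for the real format, and `verify_sig` is a
caller-supplied cryptographic check):

```python
def verify_delegation(delegating_meta: dict, role: str, signed_meta: dict,
                      verify_sig) -> bool:
    """Check whether `signed_meta` for `role` carries a threshold of
    valid signatures, using only the keyids the delegating file defines."""
    delegation = delegating_meta["delegations"][role]
    keys = delegating_meta["keys"]   # keyid -> key, local to this file
    used_keys = set()                # one vote per unique key, not per keyid
    for sig in signed_meta["signatures"]:
        if sig["keyid"] not in delegation["keyids"]:
            continue                 # keyid not trusted by this delegation
        key = keys[sig["keyid"]]
        key_material = key["keyval"]["public"]
        if key_material in used_keys:
            continue                 # the same key may only count once
        if verify_sig(key, sig, signed_meta["signed"]):
            used_keys.add(key_material)
    return len(used_keys) >= delegation["threshold"]
```

Because the keyid map lives entirely inside `delegating_meta`, nothing about it
survives past this call, matching the per-stage scoping described above.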

These changes to the specification allow the repository to use any scheme to
assign keyids (not just SHA2-256) without needing to communicate it to clients.
Because this scheme is independent of the client implementation, root and
targets metadata may use different methods to determine keyids, especially if
they are managed by different people (i.e. TAP 5). In addition, the repository
may update the scheme at any time to deprecate a hash algorithm or change to a
different keyid calculation method.

## Keyid Deprecation
> **Review comment (Member):** I don't really understand why this is needed anymore. Just generate a new file with whatever new KEYID scheme you want. All consumers didn't understand how you got the KEYIDs anyways, right?

> **Review comment (Contributor Author):** The process is pretty straightforward, but the metadata owner should ensure that the keyids remain unique throughout any transition. This might not need a whole section, but I think that it's important to note.

With the proposed specification changes, the method used to determine keyids
is not only more flexible, but it may be deprecated using the following process
for each key D and keyid K in the root or delegating targets metadata file:
* The owner of the metadata file determines a new keyid L for D using the new method.
* In the next version of the metadata file, the metadata owner replaces K with L
in the keyid definition for D.
* Any files previously signed by D should list L as the keyid instead of K.
These files do not need to be re-signed, as only the signature header is updated.

Once this process is complete, the metadata owner is using a new method to
determine the keyids used by that metadata file.

As keyid deprecation is executed, it is important that keyids within each
metadata file remain unique. Metadata owners should only publish metadata that
contains a unique keyid-to-key mapping.
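
The rotation step can be sketched as follows, again over a simplified,
hypothetical metadata layout. Note that only keyid listings and signature
headers change; the signature values themselves never do, so no re-signing is
required.

```python
def rotate_keyid(delegating_meta: dict, signed_files: list,
                 old: str, new: str) -> None:
    """Replace keyid `old` with `new` everywhere it appears, without
    touching any signature values."""
    # Re-list the key under its new keyid in the delegating metadata.
    delegating_meta["keys"][new] = delegating_meta["keys"].pop(old)
    for delegation in delegating_meta["delegations"].values():
        delegation["keyids"] = [new if k == old else k
                                for k in delegation["keyids"]]
    # Update only the signature headers of previously signed files;
    # the "sig" values stay valid because the key itself is unchanged.
    for metadata_file in signed_files:
        for sig in metadata_file["signatures"]:
            if sig["keyid"] == old:
                sig["keyid"] = new
```

Running this once per key completes the migration to the new keyid scheme,
provided the new keyids remain unique within the file.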

## Implications for complex delegation trees
> **Review comment:** You touch upon this in the specification, but it might be helpful to also include an example here for the case where the targets metadata file A delegates to C using the key D and the keyid K, and another targets metadata file B delegates to C using the key E and the keyid K. From what I understand, this should result in C being signed with:
>
> ```
> {
>     "sigs": [
>         {
>             "keyid": "K",
>             "sig": "1234..."
>         },
>         {
>             "keyid": "K",
>             "sig": "abcd..."
>         },
>         ...
>     ],
>     ...
> }
> ```
>
> Both signatures will be checked if the delegation search ever ends up at A or B.

> **Review comment (Contributor Author):** Added, thanks for the suggestion!

Although keyids need to be unique within each metadata file, they do not need to
be unique across delegated roles. It is possible for different keyids to
represent the same key in different metadata files, even if both metadata files
delegate to the same role. Consider two delegated targets metadata files A and B
that delegate to the same targets metadata file C. If A delegates to C using
key D with keyid K, and B delegates to C using key D with keyid L, the signature
header of C will contain the following:
```
{
  "sigs": [
    {
      "keyid": "K",
      "sig": "abcd..."
    },
    {
      "keyid": "L",
      "sig": "abcd..."
    },
    ...
  ],
  ...
}
```
These delegations can be processed during the preorder depth-first search of
targets metadata as follows:
* When the search reaches A, it will look for a signature with a keyid of K in C.
If it finds and validates this signature, the search will continue if a
threshold of signatures has not yet been reached.
* When the search reaches B, it will look for a signature with a keyid of L in C.
If it finds and validates this signature, the search will continue if a
threshold of signatures has not yet been reached.

Once the search is complete, if a threshold of signatures has been reached, the
metadata in C will be used to continue the update process. Therefore, K and L
may be used as keyids for D in different metadata files. As clients store keyids
only for use in the current delegation, this does not require a change to the
client process described in this document.

In this case, the same key D is used to verify C twice, once when A delegates to
C and once when B delegates to C. As the key is only used once during each
delegation, this does not violate the client verification of key uniqueness
described in this TAP. If the keyids L and K were both used in the same
delegation (say A delegating to C), then these signatures would only contribute
a single valid signature to the threshold due to the client verification.

It is also possible for the same keyid to represent different keys in different
metadata files. Consider a targets metadata file A that delegates to C with key
D and keyid K and a targets metadata file B that delegates to C with key E and
keyid K. The signature header of C will contain the following:
```
{
  "sigs": [
    {
      "keyid": "K",
      "sig": "abcd..."
    },
    {
      "keyid": "K",
      "sig": "efgh..."
    },
    ...
  ],
  ...
}
```

During the depth-first search of targets metadata, the client will process these
delegations as follows:
* When the search reaches A, it will look for a signature with a keyid of K in
C. It will discover two signatures with the given keyid and will check each
using the key D. If either passes verification, the search will continue
if a threshold of signatures has not yet been reached.
* When the search reaches B, it will look for a signature with a keyid of K in
C. It will discover two signatures with the given keyid and will check each
using the key E. If either passes verification, the search will continue
if a threshold of signatures has not yet been reached.

Using this process, the same keyid may be used across metadata files without
being associated with the same key.

In both of these cases, the client must ensure that signatures using the same
key are not applied to the same threshold. To do so, clients must keep track
of the keys used during signature verification to ensure that there is a
threshold of unique keys that have signed the metadata.
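
This unique-key check can be sketched as follows; the data shapes are
hypothetical, and `verified` stands in for actual cryptographic signature
verification. The point is that votes are deduplicated on key material, not on
keyid, so two keyids resolving to the same key contribute a single vote.

```python
def unique_key_votes(trusted: dict, signatures: list, verified) -> int:
    """Count threshold votes by unique verifying key, not by keyid."""
    seen = set()
    for sig in signatures:
        key = trusted.get(sig["keyid"])
        if key is None or not verified(key, sig):
            continue
        seen.add(key["keyval"]["public"])  # dedupe on the key material
    return len(seen)

# Same key D listed under two different keyids, as in the A/B example above.
trusted = {
    "K": {"keyval": {"public": "D"}},
    "L": {"keyval": {"public": "D"}},
}
sigs = [{"keyid": "K", "sig": "abcd..."}, {"keyid": "L", "sig": "abcd..."}]
votes = unique_key_votes(trusted, sigs, lambda key, sig: True)  # one vote
```

Had the check been keyed on keyid instead, the same signature would have
counted twice toward the threshold.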

# Security Analysis

TUF clients only trust keys that are defined in signed metadata files. For this
reason, the method of calculating keyids does not allow an attacker to add
new trusted keys to the system. However, a bad keyid scheme could allow a
privilege escalation in which the client verifies one metadata file with a
key from a role not trusted to sign that metadata file. This proposal prevents
privilege escalation attacks by requiring that metadata owners use unique keyids
within each metadata file, as described in the rationale.

# Backwards Compatibility

Metadata files that are generated using SHA2-256 will be compatible with clients
that implement this change. However, clients that continue to check that
keyids are generated using SHA2-256 will not be compatible with metadata that
uses a different method for calculating keyids.

For backwards compatibility, metadata owners may choose to continue to use
SHA2-256 to calculate keyids.

# Augmented Reference Implementation

TODO
> **Review comment (Member):** This needs to be finished, ofc 🙂


# Copyright

This document has been placed in the public domain.