-
Notifications
You must be signed in to change notification settings - Fork 270
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update specification #433
Update specification #433
Conversation
Remove note about priority tag schemes and our investigation of several of these schemes
We previously used a full rolename when delegations resembled a tree
@trishankkarthik Can you please review this pull request for correctness? There is more work to be done on the specification, but I'd like to update it gradually. Smaller pull requests is also a goal. I will begin by adding missing features introduced by Diplomat, Mercury, etc., and will later edit the entire thing to assist adopters (the current text can be improved to remove ambiguities). |
@vladimir-v-diaz Sure thing, yeah, as soon as I get some downtime! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Vlad, these are my comments. Let me know if you'd like to discuss in person!
optionally be compressed with this algorithm. Compressed versions of | ||
metadata are not listed in snapshot.json. | ||
|
||
CONSISTENT_SNAPSHOT is a boolean indicating whether the repository supports |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are the consequences of this boolean value being either True or False? Does the metadata change? Do clients download files differently? How does it change how the repository administrators delete metadata and packages?
@@ -530,6 +532,14 @@ Version 1.0 (Draft) | |||
, ... } | |||
} | |||
|
|||
COMPRESSION_ALGORITHM specifies one of the compression algorithms supported |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are the compression algorithms specified here? What are the benefits?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Listing the compression algorithms used by the repository removes the need to list all compressed versions of metadata in Snapshot, reducing its size and minimizing code complexity. Should the benefits of listing the compression algorithm(s) be stated in the specification, or are you simply curious about the change?
There were a few other reasons, which are captured in an email thread with David Lawrence @endophage (see email thread with title, "role names"). I'll quote some part of his request to remove compressed metadata from Snapshot, etc.
I feel very strongly that data compression should be implemented at layer 6 and TUF should not be validating compressed files. As is, a malicious publisher could intentionally publish decompression bomb. The TUF user would download and validate the hash and size of the compressed file but be vulnerable when they decompress. By implementing compression at layer 6, the client will decompress the stream as it arrives and never read more than Length bytes from the incoming stream.
If we were talking about GBs of data there would be a strong argument for compression on disk, but TUF data is (and necessarily so) in the order of KB and the storage gains from compression are minimal. Compression at layer 6 still provides the benefit of reduced network transfer.
...
Overall, I think consistently not using file extensions when referencing TUF metadata within the metadata files, and specifying any compression is done at layer 6 overall is a security improvement on 2 fronts. Firstly it makes the metadata more consistent, reducing code complexity, which makes auditing easier and chance of bugs lower. Secondly, it better protects the user against decompression bombs. Note that in the Python implementation the code attempts to read the entire gzipped file and as such is vulnerable to a decompression bomb: utils.py:934: fileobject = six.StringIO(gzip.open(filepath).read().decode('utf-8’))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like @JustinCappos, I'm also not sure that this change is straightforward.
This might have implications for snapshot metadata. There are use cases where, besides version numbers, the hashes of targets and delegated targets metadata files might be listed in the snapshot metadata. Do we include the hashes of both the compressed and uncompressed metadata files in there?
docs/tuf-spec.txt
Outdated
"targets/projects" | ||
|
||
ROLENAME is the name of the delegated role. For example, | ||
"projects". |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So does this mean that a ROLENAME must only be alphanumeric? How does this new ROLENAME format change how metadata files are placed on the repository? Is this format change backwards-compatible with old clients?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Section 3.1.2.1 explains how delegated metadata is placed on the repository.
This change is certainly backwards-incompatible, but so are many of the changes in this version 1.0 of the specification (e.g., terminating delegations, version numbers in Snapshot, etc.)
I'm not sure that we can set restrictions on the rolename, which must ultimately match the filename of the role as saved to the underlying file system. Should we state in the specification that we do not place restrictions on rolenames, and supported characters in rolenames are left to adopters?
docs/tuf-spec.txt
Outdated
terminating delegations instruct the client not to consider future trust | ||
statements that match the delegation's pattern. This stops the delegation | ||
processing once this delegation (and its descendants) have been processed. | ||
(Handling this case is conceptually similar to the use of the cut operator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can safely remove the parenthetical note, most people won't care :)
docs/tuf-spec.txt
Outdated
shell-style wildcards. For example, the path pattern "targets/*.tgz" would | ||
match file paths "targets/foo.tgz" and "targets/bar.tgz", but not | ||
"targets/foo.txt". Likewise, path pattern "foo-version-?.tgz" matches | ||
foo-version-2.tgz", but not "foo-version-alpha.tgz". |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more matching example here would help. Right now, it looks as if "?" matches only "numbers".
docs/tuf-spec.txt
Outdated
order to accommodate this scheme, the "roles" key in the DELEGATIONS object | ||
above points to an array, instead of a hash table, of delegated roles. | ||
|
||
Another scheme would have the clients prefer the delegated role with the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can remove this paragraph. It's an interesting, historical footnote, but has nothing to do with how TUF works now.
docs/tuf-spec.txt
Outdated
"targets/foo.txt". Likewise, path pattern "foo-version-?.tgz" matches | ||
foo-version-2.tgz", but not "foo-version-alpha.tgz". | ||
|
||
Several schemes exist to resolve conflicts between delegated roles that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we please remove the mention of "several schemes", and simply note that we use prioritized delegations ala Diplomat to resolve conflicts? Thanks!
docs/tuf-spec.txt
Outdated
@@ -1004,8 +1028,8 @@ Version 1.0 (Draft) | |||
required, the root.json file is versioned and accessible by version number, | |||
e.g. 3.root.json. Clients update the set of trusted root keys by requesting | |||
the current root.json and all previous root.json versions, until one is | |||
found that has been signed by keys the client already trusts. This is to | |||
ensure that outdated clients remain able to update, without requiring all | |||
found that has been signed by keys the client already trusts. This is to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very good change! If I'm not mistaken, CoreOS proposed this change, no? If so, let's credit them here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks is now given (in the PR summary) to Evan and the CoreOS team. In a separate edit to the specification, a "contributions" section on features that have been contributed by others will be added to the specification.
I think thanks is also given to the author(s) in the pull request that implements the feature.
61d04eb
to
40aaf93
Compare
in text for terminating delegations.
One more matching example here would help. Right now, it looks as if "?" matches only "numbers".
@trishankkarthik I have gone through all of your comments. Please give the pull request, or my tacked-on commits, another pass when you have the time. |
I'm slightly confused about why this change is showing up here. Wouldn't
this require a TAP? Was this added prior to 1.0?
…On Mon, Apr 24, 2017 at 12:09 PM, Vladimir Diaz ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In docs/tuf-spec.txt
<#433 (comment)>
:
> @@ -530,6 +532,14 @@ Version 1.0 (Draft)
, ... }
}
+ COMPRESSION_ALGORITHM specifies one of the compression algorithms supported
Listing the compression algorithms used by the repository removes the need
to list all compressed versions of metadata in Snapshot, reducing its size
and minimizing code complexity. Should the benefits of listing the
compression algorithm(s) be stated in the specification, or are you simply
curious for the change?
There were a few other reasons, which are captured in an email thread with
David Lawrence @endophage <https://github.com/endophage> (see email
thread with title, "role names"). I'll quote some part of his request to
remove compressed metadata in Snapshot, etc.
I feel very strongly that data compression should be implemented at layer
6 and TUF should not be validating compressed files. As is, a malicious
publisher could intentionally publish decompression bomb. The TUF user
would download and validate the hash and size of the compressed file but be
vulnerable when they decompress. By implementing compression at layer 6,
the client will decompress the stream as it arrives and never read more
than Length bytes from the incoming stream.
If we were talking about GBs of data there would be a strong argument for
compression on disk, but TUF data is (and necessarily so) in the order of
KB and the storage gains from compression are minimal. Compression at layer
6 still provides the benefit of reduced network transfer.
...
Overall, I think consistently not using file extensions when referencing
TUF metadata within the metadata files, and specifying any compression is
done at layer 6 overall is a security improvement on 2 fronts. Firstly it
makes the metadata more consistent, reducing code complexity, which makes
auditing easier and chance of bugs lower. Secondly, it better protects the
user against decompression bombs. Note that in the Python implementation
the code attempts to read the entire gzipped file and as such is vulnerable
to a decompression bomb: utils.py:934: fileobject =
six.StringIO(gzip.open(filepath).read().decode('utf-8’))
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#433 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AA0XD3rGeaQXYe4rnmTcWsMJzI-HtC3Cks5rzMkkgaJpZM4MNpVN>
.
|
Trishank and I went over my comments. We need to verify whether |
The change to exclude compressed version of metadata from Snapshot was discussed (via email with the Docker folks) and implemented before the TAP process was adopted. |
I have excluded the |
compression_algorithms
,terminating
, andconsistent_snapshot
attributesThanks is given to @ecordell (Evan Cordell) and the CoreOS team for their contribution to chained Root files.