Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update specification #433

Merged
merged 10 commits into from
Jun 2, 2017
Merged

Update specification #433

merged 10 commits into from
Jun 2, 2017

Conversation

vladimir-v-diaz
Copy link
Contributor

@vladimir-v-diaz vladimir-v-diaz commented Feb 27, 2017

  • Add compression_algorithms, terminating, and consistent_snapshot attributes
  • Update text describing the format of PATHPATTERN
  • Update text for handling overlapping targets between delegations
  • Remove notion of a full rolename

Thanks is given to @ecordell (Evan Cordell) and the CoreOS team for their contribution to chained Root files.

@vladimir-v-diaz
Copy link
Contributor Author

@trishankkarthik Can you please review this pull request for correctness? There is more work to be done on the specification, but I'd like to update it gradually. Smaller pull requests is also a goal. I will begin by adding missing features introduced by Diplomat, Mercury, etc., and will later edit the entire thing to assist adopters (the current text can be improved to remove ambiguities).

@coveralls
Copy link

Coverage Status

Coverage remained the same at 97.936% when pulling 6665d25 on update_specification into 40aaf93 on develop.

@trishankkarthik
Copy link
Contributor

@vladimir-v-diaz Sure thing, yeah, as soon as I get some downtime!

Copy link
Contributor

@trishankkarthik trishankkarthik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Vlad, these are my comments. Let me know if you'd like to discuss in person!

optionally be compressed with this algorithm. Compressed versions of
metadata are not listed in snapshot.json.

CONSISTENT_SNAPSHOT is a boolean indicating whether the repository supports
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the consequences of this boolean value being either True or False? Does the metadata change? Do clients download files differently? How does it change how the repository administrators delete metadata and packages?

@@ -530,6 +532,14 @@ Version 1.0 (Draft)
, ... }
}

COMPRESSION_ALGORITHM specifies one of the compression algorithms supported
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are the compression algorithms specified here? What are the benefits?

Copy link
Contributor Author

@vladimir-v-diaz vladimir-v-diaz Apr 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Listing the compression algorithms used by the repository removes the need to list all compressed versions of metadata in Snapshot, reducing its size and minimizing code complexity. Should the benefits of listing the compression algorithm(s) be stated in the specification, or are you simply curious about the change?

There were a few other reasons, which are captured in an email thread with David Lawrence @endophage (see email thread with title, "role names"). I'll quote some part of his request to remove compressed metadata from Snapshot, etc.

I feel very strongly that data compression should be implemented at layer 6 and TUF should not be validating compressed files. As is, a malicious publisher could intentionally publish decompression bomb. The TUF user would download and validate the hash and size of the compressed file but be vulnerable when they decompress. By implementing compression at layer 6, the client will decompress the stream as it arrives and never read more than Length bytes from the incoming stream.

If we were talking about GBs of data there would be a strong argument for compression on disk, but TUF data is (and necessarily so) in the order of KB and the storage gains from compression are minimal. Compression at layer 6 still provides the benefit of reduced network transfer.

...

Overall, I think consistently not using file extensions when referencing TUF metadata within the metadata files, and specifying any compression is done at layer 6 overall is a security improvement on 2 fronts. Firstly it makes the metadata more consistent, reducing code complexity, which makes auditing easier and chance of bugs lower. Secondly, it better protects the user against decompression bombs. Note that in the Python implementation the code attempts to read the entire gzipped file and as such is vulnerable to a decompression bomb: utils.py:934: fileobject = six.StringIO(gzip.open(filepath).read().decode('utf-8’))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like @JustinCappos, I'm also not sure that this change is straightforward.

This might have implications for snapshot metadata. There are use cases where, besides version numbers, the hashes of targets and delegated targets metadata files might be listed in the snapshot metadata. Do we include the hashes of both the compressed and uncompressed metadata files in there?

"targets/projects"

ROLENAME is the name of the delegated role. For example,
"projects".
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So does this mean that a ROLENAME must only be alphanumeric? How does this new ROLENAME format change how metadata files are placed on the repository? Is this format change backwards-compatible with old clients?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Section 3.1.2.1 explains how delegated metadata is placed on the repository.

This change is certainly backwards-incompatible, but so are many of the changes in this version 1.0 of the specification (e.g., terminating delegations, version numbers in Snapshot, etc.)

I'm not sure that we can set restrictions on the rolename, which must ultimately match the filename of the role as saved to the underlying file system. Should we state in the specification that we do not place restrictions on rolenames, and supported characters in rolenames are left to adopters?

terminating delegations instruct the client not to consider future trust
statements that match the delegation's pattern. This stops the delegation
processing once this delegation (and its descendants) have been processed.
(Handling this case is conceptually similar to the use of the cut operator
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can safely remove the parenthetical note, most people won't care :)

shell-style wildcards. For example, the path pattern "targets/*.tgz" would
match file paths "targets/foo.tgz" and "targets/bar.tgz", but not
"targets/foo.txt". Likewise, path pattern "foo-version-?.tgz" matches
foo-version-2.tgz", but not "foo-version-alpha.tgz".
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more matching example here would help. Right now, it looks as if "?" matches only "numbers".

order to accommodate this scheme, the "roles" key in the DELEGATIONS object
above points to an array, instead of a hash table, of delegated roles.

Another scheme would have the clients prefer the delegated role with the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can remove this paragraph. It's an interesting, historical footnote, but has nothing to do with how TUF works now.

"targets/foo.txt". Likewise, path pattern "foo-version-?.tgz" matches
foo-version-2.tgz", but not "foo-version-alpha.tgz".

Several schemes exist to resolve conflicts between delegated roles that
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we please remove the mention of "several schemes", and simply note that we use prioritized delegations ala Diplomat to resolve conflicts? Thanks!

@@ -1004,8 +1028,8 @@ Version 1.0 (Draft)
required, the root.json file is versioned and accessible by version number,
e.g. 3.root.json. Clients update the set of trusted root keys by requesting
the current root.json and all previous root.json versions, until one is
found that has been signed by keys the client already trusts. This is to
ensure that outdated clients remain able to update, without requiring all
found that has been signed by keys the client already trusts. This is to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very good change! If I'm not mistaken, CoreOS proposed this change, no? If so, let's credit them here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks is now given (in the PR summary) to Evan and the CoreOS team. In a separate edit to the specification, a "contributions" section on features that have been contributed by others will be added to the specification.

I think thanks is also given to the author(s) in the pull request that implements the feature.

@awwad awwad force-pushed the develop branch 2 times, most recently from 61d04eb to 40aaf93 Compare March 22, 2017 16:11
in text for terminating delegations.
One more matching example here would help. Right now, it looks as if "?" matches only "numbers".
@coveralls
Copy link

Coverage Status

Coverage remained the same at 97.936% when pulling 3532fb8 on update_specification into 40aaf93 on develop.

@vladimir-v-diaz
Copy link
Contributor Author

@trishankkarthik I have gone through all of your comments. Please give the pull request, or my tacked-on commits, another pass when you have the time.

@coveralls
Copy link

Coverage Status

Coverage remained the same at 97.936% when pulling 7eae7e3 on update_specification into 40aaf93 on develop.

@JustinCappos
Copy link
Member

JustinCappos commented Apr 24, 2017 via email

@vladimir-v-diaz
Copy link
Contributor Author

vladimir-v-diaz commented Apr 25, 2017

Trishank and I went over my comments. We need to verify whether
(1) the compression algorithms can be excluded from the Snapshot file
(2) restrictions should be set for rolenames. For example, exclude unicode

@vladimir-v-diaz
Copy link
Contributor Author

I'm slightly confused about why this change is showing up here. Wouldn't
this require a TAP? Was this added prior to 1.0?

The change to exclude compressed version of metadata from Snapshot was discussed (via email with the Docker folks) and implemented before the TAP process was adopted.

@vladimir-v-diaz
Copy link
Contributor Author

vladimir-v-diaz commented Jun 2, 2017

Trishank and I went over my comments. We need to verify whether
(1) the compression algorithms can be excluded from the Snapshot file
(2) restrictions should be set for rolenames. For example, exclude unicode

I have excluded the compression algorithm field from Root.json, and will open an issue for (2). Merging!

@vladimir-v-diaz vladimir-v-diaz merged commit 8ddc62c into develop Jun 2, 2017
@coveralls
Copy link

Coverage Status

Coverage remained the same at 97.936% when pulling dae2b7a on update_specification into c1e72e0 on develop.

@vladimir-v-diaz vladimir-v-diaz deleted the update_specification branch July 11, 2017 15:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants