Introduce the idea of trusted/untrusted snapshot #1605
Conversation
Pull Request Test Coverage Report for Build 1336473453

Warning: This coverage report may be inaccurate. This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

💛 - Coveralls
I think the commit/PR title is misleading (and so are some of the comments): the snapshot rollback checks require a local snapshot, and do not work without one. What you are doing is allowing the loading of the local snapshot as an intermediate snapshot even if the hashes do not match (because, like the meta version, the meta hashes reference the new remote snapshot). We want to load the local snapshot even though we already know it's not valid as final, because the local snapshot allows us to do rollback checks on the new remote snapshot.
What might be worth mentioning is that the hash checks for metadata files are not essential to TUF security guarantees: they are just an additional layer of security that allows us to avoid even parsing json that could be malicious (we already know the malicious metadata would be stopped at metadata verification after the parsing).
It's a complex situation and because of the complexity we absolutely must be able to clearly express what happens. If we are not able to do that then we should avoid implementing the metafile hashing altogether...
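To make the flow being discussed concrete, here is a rough sketch. It is not the actual updater code: the function, its arguments, and the error handling are simplified for illustration, assuming a `TrustedMetadataSet`-like object with the `trusted` argument this PR proposes.

```python
def load_snapshot(trusted_set, local_bytes, remote_bytes):
    # Sketch only: trusted_set stands in for a TrustedMetadataSet-like object.
    if local_bytes is not None:
        try:
            # The local snapshot was verified when first downloaded, so load
            # it as an *intermediate* version even though its hashes no longer
            # match timestamp.snapshot_meta (those now reference the new
            # remote snapshot). We need it for the rollback checks below.
            trusted_set.update_snapshot(local_bytes, trusted=True)
        except Exception:
            pass  # unusable local snapshot: continue without rollback data

    # The freshly downloaded snapshot must match the meta hashes in
    # timestamp, and is rollback-checked against the intermediate snapshot.
    trusted_set.update_snapshot(remote_bytes, trusted=False)
```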
Force-pushed from eb425f1 to fa6c0c5.
@jku First, I tried to address all of your comments. Second, I experimented with adding hashes and length calculations in the RepositorySimulator. Finally, I updated the [...]
Are you sure this actually runs the hash computing code? Can you double check with --dump or debug prints or something?
The docstrings are indeed hard; I'll maybe try to suggest something when I'm more caffeinated... otherwise I left two code suggestions.
Force-pushed from fa6c0c5 to c7f1581.
Ohh... I didn't run it because of how I was assigning it. I think I addressed your comments. I will wait for your next review.
I would maybe move the compute_metafile_hashes_length = True assignment to the start of the test: we should try to write tests that don't modify the repository in the middle of the test in unrealistic ways, and here it seems we should be testing against a repository that includes hashes for all meta versions, not just a single snapshot update.
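Something like this hypothetical test setup, assuming RepositorySimulator exposes the flag as a plain attribute, as the comment suggests:

```python
def setUp(self):
    # Enable hash/length computation before any metadata versions are
    # published, so every meta version includes hashes, not just the
    # versions created after a mid-test toggle.
    self.sim = RepositorySimulator()
    self.sim.compute_metafile_hashes_length = True
    self.sim.update_snapshot()  # first published snapshot already has hashes
```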
Can you make sure the client testing plan (document or issue) includes this new repository setting as a configuration we should keep in mind when improving tests, so that this doesn't remain the single test using this configuration?
I left some specific comments; the important ones are the new bug in RepositorySimulator and how the hash computation is still more complex than it needs to be.
As a review of my original RepositorySimulator design: this use case shows a (sort of) disadvantage of signing on demand. Since the meta hashes require signatures to be included in the hash, the snapshot update now requires signing every targets metadata (which in practice means that files will typically be signed twice per refresh(): once to create snapshot, once to serve the actual file). I think signing on demand is still probably a reasonable/correct approach: manually managing signing sounds like a lot of work we don't want to do in the tests.
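A sketch of why the meta hashes force signing first. The serialization helper here assumes a Metadata.to_bytes()-style method that serializes the metadata together with its signatures; treat the details as illustrative:

```python
import hashlib

def meta_hash(md) -> str:
    # The hash recorded in snapshot's meta dict covers the serialized
    # metadata *file*, signatures included. So before snapshot can record a
    # hash for targets metadata, that metadata must already be signed:
    # signing on demand means signing once here, and once more when the
    # actual file is served.
    signed_bytes = md.to_bytes()  # assumed sign-and-serialize step
    return hashlib.sha256(signed_bytes).hexdigest()
```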
Force-pushed from c7f1581 to 578aba0.
Agreed. Changed that.
Yes, left a comment here: #1579 (comment)
I think I addressed all of your comments.
I think this looks correct.
The case really is kind of a corner case of a corner case, so it's very hard to explain clearly. @joshuagl, does this look reasonable to you?
Force-pushed from 578aba0 to 4e1d06a.
Liked your version a little more.
This looks reasonable to me, thank you both.
tuf/ngclient/updater.py (outdated):

    @@ -360,7 +360,7 @@ def _load_snapshot(self) -> None:
         version = snapshot_meta.version

         data = self._download_metadata("snapshot", length, version)
    -    self._trusted_set.update_snapshot(data)
    +    self._trusted_set.update_snapshot(data, trusted=False)
Suggested change:

    -    self._trusted_set.update_snapshot(data, trusted=False)
    +    self._trusted_set.update_snapshot(data)
Nit: we defined a default, should we just use it? Is being explicit here a style recommendation?
Even though we have a default value, I prefer that we are explicit about this. That way it's easier to distinguish when we consider the data trusted and when we do not. In this case, I see the default value more as a defense mechanism ensuring that trusted will always be set.
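For context, the two call sites in question would then read roughly like this (an excerpt-style sketch of Updater._load_snapshot, with the local-loading helper name assumed):

```python
def _load_snapshot(self) -> None:  # excerpt, not the full method
    data = self._load_local_metadata("snapshot")  # assumed helper name
    # Locally cached snapshot was verified before caching: mark it trusted
    # so the meta hash check is skipped.
    self._trusted_set.update_snapshot(data, trusted=True)
    ...
    data = self._download_metadata("snapshot", length, version)
    # Freshly downloaded snapshot: mark it untrusted so it must match the
    # hashes and length declared in timestamp.snapshot_meta.
    self._trusted_set.update_snapshot(data, trusted=False)
```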
Not requiring any change here. But if we want to force ourselves to be explicit about how we call this internal API, why have the default?
What do you think @jku? Should we have a default value for trusted?
I don't care very much, but defining a default in an internal API and then explicitly avoiding that default as a defense mechanism seems weird. So I'd say use the default or don't define it in the first place... (but it's really not a required change)
I checked, and if I change the code to make trusted mandatory, it will require too many other unrelated changes just so the tests can pass. It isn't worth it. I removed the argument from this function call as Joshua suggested.
Force-pushed from 4e1d06a to 32b47c9.
Introduce the idea of trusted/untrusted snapshot; the full commit message matches the PR description below. Signed-off-by: Martin Vrachev mvrachev@vmware.com
Add an option to calculate the hashes and length for timestamp/snapshot meta. This will help to cover more use cases with the repository simulator. Signed-off-by: Martin Vrachev <mvrachev@vmware.com>
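The computation itself can be as small as this sketch; the real simulator option may differ in structure, and the function name here is illustrative:

```python
import hashlib
from typing import Dict, Tuple

def compute_hashes_and_length(data: bytes) -> Tuple[Dict[str, str], int]:
    # Values destined for the hashes/length fields of a timestamp or
    # snapshot meta dict: sha256 over the serialized metadata bytes.
    return {"sha256": hashlib.sha256(data).hexdigest()}, len(data)
```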
Modify the RepositorySimulator function delegates() into all_targets(), so that all targets can be traversed and updated in a single pass when calling update_snapshot() (which is the only current use case for delegates()). Signed-off-by: Martin Vrachev mvrachev@vmware.com
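A sketch of what all_targets() could look like; the attribute names md_targets and md_delegates are assumptions about the simulator's internals:

```python
from typing import Iterator, Tuple

def all_targets(self) -> Iterator[Tuple[str, "Targets"]]:
    # Yield the top-level targets role first, then every delegated targets
    # role, so update_snapshot() can record all of them in a single loop.
    yield "targets", self.md_targets.signed
    for role_name, md in self.md_delegates.items():
        yield role_name, md.signed
```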
Force-pushed from 32b47c9 to 717eef9.
The changes I made: [...]
I am puzzled why the Windows checks failed.
Let's try again a little later.
Fixes #1523

Description of the changes being introduced by the pull request:

If you do the following steps with a repository that contains hashes in meta dicts:

1. call Updater.refresh() and load, verify and cache all metadata files
2. modify timestamp snapshot meta information (one or more of the hashes or length for snapshot changes here)
3. call Updater.refresh() again
4. root and timestamp will be updated to their latest versions
5. local snapshot will be loaded, but its hashes/length will be different than the ones in timestamp.snapshot_meta and that will prevent loading
6. remote snapshot is loaded and verification starts

then when executing step 6 the rollback checks will not be done because the old snapshot was not loaded in step 5.
In order to resolve this issue, we are introducing the idea of trusted and untrusted snapshot.

A trusted snapshot is the locally available cached version. This version has been verified at least once, meaning its hashes and length were already checked against the timestamp.snapshot_meta hashes and length. That's why we can allow loading a trusted snapshot version even if there is a mismatch between the current timestamp.snapshot_meta hashes/length and the hashes/length of the trusted snapshot.

An untrusted snapshot is one downloaded from the web. It hasn't been verified before, and that's why we mandate that the timestamp.snapshot_meta hashes and length match the hashes and length calculated on this untrusted version of the snapshot.
The TrustedMetadataSet doesn't have information on which snapshot is trusted or not, so possibly the best solution is to add a new argument "trusted" to update_snapshot. Even though this is ugly, as the rest of the update functions don't have such an argument, it seems the best solution as it works in all cases:
- when loading a local snapshot, we know the data has at some point been trusted (signatures have been checked): it doesn't need to match hashes now
- if there is no local snapshot and we're updating from remote, the remote data must match meta hashes in timestamp
- if there is a local snapshot and we're updating from remote, the remote data must match meta hashes in timestamp
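A condensed sketch of how update_snapshot() could implement this. The names verify_length_and_hashes(), snapshot_meta, and Metadata.from_bytes() reflect the tuf.api metadata API of the time, but the body is illustrative, not a verbatim copy:

```python
from tuf.api.metadata import Metadata

def update_snapshot(self, data: bytes, trusted: bool = False) -> None:
    # Meta information about snapshot comes from the already-verified timestamp.
    snapshot_meta = self.timestamp.signed.snapshot_meta

    if not trusted:
        # Untrusted (downloaded) data must match the hashes/length that the
        # current timestamp declares for snapshot.
        snapshot_meta.verify_length_and_hashes(data)
    # Trusted (previously verified local) data skips the check, so it can
    # still be loaded as an intermediate snapshot for rollback checks.

    new_snapshot = Metadata.from_bytes(data)
    # ... signature verification and rollback checks happen here as before
```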
Lastly, I want to point out that hash checks for metadata files are not
essential to TUF security guarantees: they are just an additional layer of
security that allows us to avoid even parsing json that could be malicious -
we already know the malicious metadata would be stopped at metadata
verification after the parsing.
Signed-off-by: Martin Vrachev <mvrachev@vmware.com>