(WIP/Collaboration) rewrites for delegation graph issues #846

awwad · 2019-04-02T19:45:21Z

The core task here is to rewrite roledb, a central TUF module, and the code that uses it.

The primary motivation comes from issue #660, a confusion regarding roles and delegations that has led to a variety of issues in the code.

Additional motivation comes from these principles:

This is a reference implementation and code must be intuitive and readable.
Define once and use that definition.
Constrain TUF users (including systems like Uptane...) as little as possible. Expose simple APIs.

The result of this work should be a codebase that:

Does not confuse core TUF concepts like role and delegation.
Has deterministic verification of delegated role metadata.
Has only one internal structure for metadata. Any representation of metadata in memory should be in the basic structure that the metadata would take when written to disk. This means that the schemas used for roledb (or any in-memory representation of metadata) should avoid any extra fields and avoid being organized in a different way than the written metadata would. For example, this means that delegation information like paths and keyids go in the delegating role, not in the delegated-to role (as they now do).
Employs and validates data structures defined in tuf.formats and securesystemslib.formats.
Uses flexible, simple code like build_dict_conforming_to_schema (or that style of code) wherever reasonable.

awwad · 2019-04-02T19:48:06Z

@adityasaky

Plan of action for the next steps:

Step 1: Schema updates

So, I'm changing the schemas in tuf.formats that define data structures in the reference implementation. That's going to break lots of things. I'm almost done, and it's hard to assign something out before that, but I can show you what I have so far in a WIP commit I'll push to this branch in a few minutes and you can plan out changes to make following them. Additional changes might be necessary later, of course, but I expect to have the schemas done before I leave today. We can chat, though, too, and maybe some parallel task will emerge from something one of us thinks of. :)

Note that every little thing doesn't need to have its own schema. We want the schema definitions to be succinct and readable, so bouncing around a lot is not ideal.

Step 2: RoleDB Rewrite

After the new definitions, one of the main things that needs to happen is that someone needs to go over roledb to see how things need to change based on those schema changes, keeping in mind the stuff that motivated those schema changes (see top post in this PR)

Note that there are some comments / TODOs I added / am adding suggesting how things should work.

Step 3: updater and repository_lib edits

Updater and repository_lib will need to employ the new roledb code, which in some cases will result in refactoring in that code, and a few API changes. Repository_tool and some other modules may also require similar adjustments.

awwad · 2019-04-02T20:11:46Z

The most breaking thing in the last commit (draft of new schema definitions in formats.py) is the removal of ROLEDB_SCHEMA. That was was another internal representation for metadata (the one reorganized such that delegation information was stored under a delegatee). Instead, roledb should use ANYROLE_SCHEMA or the individual metadata roles' schemas as appropriate (e.g. TIMESTAMP_SCHEMA).

ROLES_SCHEMA is what will live in roledb._roledb_dict, one for each repository

awwad · 2019-04-02T20:14:55Z

Once we're all done, writing metadata will just involve canonicalization, serialization, and signing. It won't require rebuilding roles based on information listed under other roles, for example. That will then mean we can expose that internal representation in roledb (indirectly but pretty transparently) to users of TUF, to allow things like Uptane adding a Timeserver key to root metadata, then telling TUF to write and sign.

These aid in the roledb rewrite to start to address Issue #660. Also add two minor TODOs. Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>

- Rename and alter some schemas that really address delegations, to make that clear. - Do away with the ROLEDB_SCHEMA, an intermediate metadata format that is not necessary and which incorrectly flattens the delegation graph, and similar schemas. - Rewrite getters/setters in roledb to respect the delegation graph rather than assuming that delegated targets roles have only one delegation pointing to them (see Issue #660). - Add a variety of TODOs for later. - Clarify docstrings as a result of the above. reinterpreting metadata Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>

This is mid-development, but I'm pushing it so that Aditya can see where things are and the general shape of things. Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>

awwad · 2019-04-03T14:42:19Z

Rebased after merge of #836.

Warn folks about the larger structures being near the end, make description a bit more readable, highlight matches() and check_match() funcs, emphasize that this module defines the data structures / formats used. Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>

- remove re-definition of rolename_schema - use securesystemslib.formats.PATH_SCHEMA for all paths, rather than using RELPATH_SCHEMA, which implies a distinction that we do not actually make, and checks we do not actually perform. - use INTEGER_NATURAL_SCHEMA for lengths and metadata versions Excludes fileinfo-related adjustments of the above, as those will follow in a separate fileinfo-specific commit. Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>

The misleadingly-named ROLE_SCHEMA was renamed to SIGNING_SCHEMA, but I'm now making it SIGNERS_SCHEMA, which I think is clearer. I also added an example. Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>

Failed to include key and value definitions. Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>

In TUF, we store information about files in a variety of ways. Sometimes, versions are used, and sometimes length and hashes are required. So FILEINFO_SCHEMA will match any of these three new schemas: FILEINFO_IN_TIMESTAMP_SCHEMA, FILEINFO_IN_SNAPSHOT_SCHEMA, and FILEINFO_IN_TARGETS_SCHEMA. This should be more intuitive than the former mess, I think. I also renamed TARGETINFO to LABELED_FILEINFO_SCHEMA, with an explanation. I hope that proves more intuitive as well. Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>

Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>

awwad · 2019-04-09T17:49:38Z

@adityasaky tells me that he has (on his branch to be PR'd here later) the basic roledb tests working after the schema changes now, so the next step will be looking through repository_lib to make the changes necessary there, guided by test failures. After that (or, possibly during, depending on how far repository_tool's issues extend into repository_lib), it'll be time to deal with repository_tool, and that's where this work will start to get pretty involved. The interface that repository_tool provides (e.g. repository.targets('role1').threshold = 2) will need substantial refactoring or API changes to deal with the fact that delegation properties like keyids, threshold, paths, etc. are properties of the delegating role, and specific to a delegation (identified by two roles, not one role).

After that, we can move on to the client side and updater, where verification code will need to be updated and the edge-case security issues related to #660 will be resolved. If the interface issues with repository_tool prove onerous to work around, it may be reasonable to deal with updater before repository_lib and repository_tool.

awwad · 2019-04-12T21:20:48Z

@adityasaky is working with updater first (that is, client side before repo side), so here's where the bulk of the load there lies:

updater has one primary mechanism for obtaining trustworthy target information for a particular target: Updater.get_one_valid_targetinfo(). That function's stack performs a pre-order depth-first walk down the delegation graph until it finds a trustworthy source of that target info, updating (and verifying) any necessary delegated targets role metadata along the way. The stack for obtaining and verifying a particular role -- which needs to be refactored, probably at another time -- is explained well in #841. In short, the stack, in seeking to update and verify the delegated targets role, eventually calls tuf.sig.verify, without passing the optional keyids and threshold arguments that it should pass, just expecting, instead, tuf.sig.verify() to figure out how to verify a delegated targets role, which it cannot reliably correctly do without knowing the delegating role.

That's where the fix for the primary part of #660 lies: providing tuf.sig.verify() with the right keyids and threshold. Updater.get_one_valid_targetinfo()'s stack (_preorder_depth_first_walk() in particular) can do that knows because it knows what delegation it's following to get to the delegated targets role it's updating and verifying.

I suggest that we leave the #841 flattening refactor for later, and just hold our breaths and add new optional keyids and threshold arguments the whole way down the stack from _preorder_depth_first_walk() to _get_metadata_file(). Then _get_metadata_file() can pass them to tuf.sig.verify(). Ultimately, #841 should clean this up later.

So that's your next task, @adityasaky. :)

I'll push to this PR some changes to tuf.sig.verify() that will make it exacting about when keyids and thresholds are passed to it. The only times verify() doesn't need keyids and thresholds are when it's verifying a top-level role, because then the information always comes from root.

Aside from this, there will doubtless be other, more minor changes, to match the updated schemas and renamed roledb functions, but I hope those are much easier.

trishankatdatadog · 2019-04-12T21:29:02Z

Great work, guys, keep it up!

It'll now be a public function used by other modules (tuf.sig), so make it public and improve the name (takes a rolename, not a role): roledb.is_top_level_rolename(). Also bugfix it to handle casing issues. Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>

It previously included information that wasn't really appropriate at this level of the code (about the project as a whole). Add short summary and list the two public functions with short explanations. Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>

- get_signature_status() and verify() will expect EITHER: - keyids AND threshold - a rolename never both. A rolename can be used in place of keyids and threshold only for top-level roles, and keyids and threshold will then be drawn from currently trusted Root metadata. See comments in the code for more. This includes making rolename an optional argument. Now uses tuf.roledb.is_top_level_rolename. - renamed "role" argument to "rolename", which is more correct (since the actual role metadata is another argument...) - Pulled the elaborate argument validation and the retrieval of keyids and threshold from Root metadata into a separate function: _determine_keyids_and_threshold_to_use - perform retrieval of keyids and threshold only from Root metadata, for top-level roles (part of #660) - removed unused generate_rsa_signature - cleaned up the structure of get_signature_status() a bit Tests will break and require fixes. Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>

awwad · 2019-04-15T21:06:43Z

@adityasaky

Pushed changes to tuf.sig that you can use for tuf.client.updater revisions. They'll be maximally annoying about arguments provided to verify() / get_signature_status() to make sure that the revised tuf.client.updater code is correct about how it uses them. (They'll expect keyids and threshold to be provided to them unless they're verifying top-level roles.)

Didn't fix test_sig.py here yet.

The keyids and threshold retrieval are already performed above now, so this lingering threshold retrieval is no longer needed. Move the comment about errors it would raise to where that actually would happen now (and refine comment given new functionality). Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>

tuf.sig.generate_rsa_signature and tuf.sig.may_need_new_keys were not necessary and were deleted. This commit removes their tests. Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>

Explain the test conditions. Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>

Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>

awwad · 2019-04-16T16:23:06Z

@adityasaky
Those test_sig.py edits remove some unneeded stuff and add TODOs for after the roledb changes are merged into here. (The tests will require that roledb be functional.)

trishankatdatadog · 2019-04-16T16:24:24Z

Great job, guys, your hard work has not gone unnoticed 🙇

Would there be a simple API for users to verify all signatures on a loaded repository?

awwad · 2019-04-16T16:49:55Z

Would there be a simple API for users to verify all signatures on a loaded repository?

Such an API function would depend on this work, but not come as part of it. It's possible to write a function afterwards that loads a repo and verifies all top-level roles, then crawls down the full delegation graph and verifies every edge or reports any unverified edges. (Incidentally, that's the sort of thing that I'll want from a comprehensive all_targets function in the future, too.)

trishankatdatadog · 2019-04-16T16:58:27Z

Got it, thanks!

awwad · 2019-04-16T17:27:42Z

@adityasaky
You asked about where to get the keyids and threshold to pass down to tuf.sig.verify().

Note that _preorder_depth_first_walk() reads delegations and adds (relevant-to-the-target) roles that it finds there to its stack (LIFO) of roles to process as a result. Since it's reading the delegations, it simultaneously has access to the keyids and thresholds the delegating role expects, and that's the information you'll have to pass down through the function stack all the way to tuf.sig.verify() when you're trying to verify that delegated-to role.

Personally, I thought about a 3-mer queue (role, keyids, threshold) instead of a role queue, but there are a few ways to handle this....

Note that part of the task when updating something like this is considering the expectations in edge cases. For example:

Is there a traversal dictated by _preorder_depth_first_walk that results in keyids & threshold associated with a role when they're read not being the right ones when they're later processed? ((I don't think so.))
Is there something about the visited-roles system (intended to avoid cycles) that breaks when multiple delegations to the same role are considered? For example, if a role is considered visited when it fails to be verified along one delegation, will it be skipped if encountered along a different delegation (where a verification might succeed)? ((I don't think this is an issue because of what is supposed to happen when role verification fails, but see what you think.))

An error that is raised if someone tries to query a delegation that shouldn't exist (root to a delegated targets role or a delegated targets role to root, say) previously only described one direction, leading to misleading error messages. It now explains both possible causes of the error. Also removes a pdb.set_trace() left over from prior revisions. Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>

Two functions now exist to replace _test_rolename (which was a bit of a misleading name), and these are now used to perform argument testing for roledb functions that query a single role or query information about a delegation from one role to another role. In addition, tests for roledb.get_delegated_paths were also updated, duplicating some of the above for reasons explained in code comments. Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>

Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>

- remove duplicate ROLENAME_SCHEMA and ROLEDICT_SCHEMA - remove outdated and duplicate ROLE_SCHEMA Note that this is a quick fix that may be overridden with refactoring work in theupdateframework#660/theupdateframework#846. Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>

joshuagl · 2020-09-10T11:32:51Z

Due to imminent refactor efforts this code is unlikely to merge.

awwad added 3 commits April 3, 2019 10:41

Add helper funcs for roledb: get_delegation, _is_top_level_role

fc7308a

These aid in the roledb rewrite to start to address Issue #660. Also add two minor TODOs. Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>

WIP, DO NOT MERGE -- quick sketch of new schemas in formats.py

6464f7a

This is mid-development, but I'm pushing it so that Aditya can see where things are and the general shape of things. Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>

awwad force-pushed the 660_respect_delegation_graph branch from 46e33f4 to 6464f7a Compare April 3, 2019 14:42

awwad added 6 commits April 3, 2019 16:29

PR revision: SIGNING_SCHEMA -> SIGNERS_SCHEMA

0aad781

The misleadingly-named ROLE_SCHEMA was renamed to SIGNING_SCHEMA, but I'm now making it SIGNERS_SCHEMA, which I think is clearer. I also added an example. Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>

PR revision: minor bugfix in ROLES_SCHEMA definition

42b14ef

Failed to include key and value definitions. Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>

DO NOT MERGE -- adds a TODO to roledb

a64fe91

Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>

awwad added 3 commits April 15, 2019 16:47

awwad added 4 commits April 16, 2019 10:33

Remove tests for deleted functions in tuf.sig

43750ba

tuf.sig.generate_rsa_signature and tuf.sig.may_need_new_keys were not necessary and were deleted. This commit removes their tests. Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>

minor: DOC: clarify comment in test_sig

cdae6b3

Explain the test conditions. Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>

DO NOT MERGE: add TODOs to test_sig for after roledb changes

e6660cb

Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>

adityasaky added 10 commits April 23, 2019 18:43

Update expected values and exceptions for _import_delegations

bb3bc15

Fix tuf.roledb reference

d893b65

Add expectedFailure to test for now

e9bd455

Add return values for keyids, threshold, update roledb method calls

844a2ce

Update roleinfo keys to reflect changes made in roledb

4e1f89a

Pass only rolename to sig.verify

b2d36c5

Add TODO for assertion fix

24daeaf

Add manual test for get_one_valid_targetinfo for non top-level targets

19ef4b9

Add TODO to remove redundant line

ab6df47

Remove stray pdb statement

cf8f93e

awwad mentioned this pull request Apr 29, 2019

External Signing API, or CCID/PIV Support #864

Closed

adityasaky and others added 7 commits April 30, 2019 12:58

Make style changes - line length

52d4923

Make style changes - line length, use global TEST_DATA_PATH var

f9c351a

Revert change to repo_tool: out of scope of roledb fixes

131afb8

Replace keyid tests with role_exists checks

0c7f8a1

Update check_match's comment for rolename format check

3c5d182

adityasaky mentioned this pull request Apr 30, 2019

Sub-PR to 846: Modification of Updater and tests as part of #660 #867

Closed

Merge branch '660_fix_roledb_tests' into 660_respect_delegation_graph

fc872d7

Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>

adityasaky mentioned this pull request May 1, 2019

Sub-PR of 846: Modification of Updater and tests as part of #660 #868

Closed

awwad mentioned this pull request May 28, 2019

repository_tool: Root pre-verification during metadata writes misses possible verification failures #883

Closed

awwad mentioned this pull request Jul 23, 2019

Clean up updater metadata update stack #841

Closed

adityasaky mentioned this pull request Oct 3, 2019

Duplicate definitions of schemas #927

Closed

lukpueh mentioned this pull request Oct 4, 2019

Remove duplicate role-related schemas #929

Merged

3 tasks

awwad mentioned this pull request Dec 16, 2019

A tool to determine whether metadata for a targets role has diverged from its data #55

Closed

lukpueh mentioned this pull request Mar 19, 2020

Fix performance and metadata representation issues in roledb #1005

Closed

joshuagl closed this Sep 10, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

(WIP/Collaboration) rewrites for delegation graph issues #846

(WIP/Collaboration) rewrites for delegation graph issues #846

awwad commented Apr 2, 2019 •

edited

Loading

awwad commented Apr 2, 2019

awwad commented Apr 2, 2019

awwad commented Apr 2, 2019

awwad commented Apr 3, 2019

awwad commented Apr 9, 2019 •

edited

Loading

awwad commented Apr 12, 2019 •

edited

Loading

trishankatdatadog commented Apr 12, 2019

awwad commented Apr 15, 2019

awwad commented Apr 16, 2019

trishankatdatadog commented Apr 16, 2019

awwad commented Apr 16, 2019 •

edited

Loading

trishankatdatadog commented Apr 16, 2019

awwad commented Apr 16, 2019

joshuagl commented Sep 10, 2020

(WIP/Collaboration) rewrites for delegation graph issues #846

(WIP/Collaboration) rewrites for delegation graph issues #846

Conversation

awwad commented Apr 2, 2019 • edited Loading

awwad commented Apr 2, 2019

Step 1: Schema updates

Step 2: RoleDB Rewrite

Step 3: updater and repository_lib edits

awwad commented Apr 2, 2019

awwad commented Apr 2, 2019

awwad commented Apr 3, 2019

awwad commented Apr 9, 2019 • edited Loading

awwad commented Apr 12, 2019 • edited Loading

trishankatdatadog commented Apr 12, 2019

awwad commented Apr 15, 2019

awwad commented Apr 16, 2019

trishankatdatadog commented Apr 16, 2019

awwad commented Apr 16, 2019 • edited Loading

trishankatdatadog commented Apr 16, 2019

awwad commented Apr 16, 2019

joshuagl commented Sep 10, 2020

awwad commented Apr 2, 2019 •

edited

Loading

awwad commented Apr 9, 2019 •

edited

Loading

awwad commented Apr 12, 2019 •

edited

Loading

awwad commented Apr 16, 2019 •

edited

Loading