
Docs: Add repository tutorial based on metadata API #1685

Merged: 2 commits, Nov 29, 2021

Conversation

lukpueh
Member

@lukpueh lukpueh commented Nov 18, 2021

[EDIT 11/23/2021: remove draft status and update PR description]
[EDIT 11/25/2021: remove reviewer question about test, the PR includes one]

Fixes #1673

Description

Add an excessively commented Python script to demonstrate repository creation and operation using only the low-level Metadata API.

Context
As 'repository_tool' and 'repository_lib' are being deprecated, repository metadata must be created and maintained manually using the low-level Metadata API. The example code in these files shall serve as a temporary replacement until a new repository tool is available.

Contents

  • creation of top-level metadata
  • target file handling
  • consistent snapshots
  • key management
  • top-level delegation and signing thresholds
  • target delegation
  • in-band and out-of-band metadata signing
  • writing and reading metadata files
  • root key rotation
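
For the "consistent snapshots" item, the TUF specification prefixes metadata filenames with the version number and target filenames with a hash digest, while timestamp.json keeps its plain name. A minimal stdlib sketch of that naming follows; the helper names are hypothetical and not part of the Metadata API, and sha256 is just one possible digest:

```python
import hashlib


def metadata_filename(role: str, version: int) -> str:
    """Return the on-disk name for a role's metadata with consistent snapshots."""
    # Timestamp is the client's entry point, so it is never version-prefixed.
    if role == "timestamp":
        return "timestamp.json"
    return f"{version}.{role}.json"


def target_filename(target_path: str, content: bytes) -> str:
    """Prefix the final path component with the sha256 digest of the content."""
    digest = hashlib.sha256(content).hexdigest()
    dirname, sep, basename = target_path.rpartition("/")
    return f"{dirname}{sep}{digest}.{basename}"
```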

Questions for reviewers

  • Is this readable enough? Should this rather be a markdown document?
  • Is it too broad? Should we move some usage scenarios to different files?
    E.g. hashed bin delegation will be in a separate file (see follow-up PR)
  • How can we demonstrate validity? With a TrustedMetadata example?
  • How can we demonstrate usage? With a client example follow-up?

Please verify and check that the pull request fulfills the following requirements:

  • The code follows the Code Style Guidelines
  • Tests have been added for the bug fix or new feature
  • Docs have been added for the bug fix or new feature

Contributor

@sechkova sechkova left a comment

Thanks for the refreshment exercise :) I left some comments on the text itself.
I don't think the example is too broad; having everything in one place seems like a good idea. Maybe in time, more detailed examples can be added for some trickier cases (complex delegations, key rotations ...).


# Common fields
# -------------
# All roles have the same metadata container format. On the top-most level,
Contributor

Maybe we can somehow point to the Metadata class here?

# from which a client would download the target file.
target_local_path = os.path.abspath(__file__)
target_file_path = os.path.basename(__file__)
target_file_info = TargetFile.from_file(target_file_path, target_local_path)
Contributor

@sechkova sechkova Nov 19, 2021

I have no good suggestion for improvement but these paths seem confusing. I admit I had to read it twice even though I was supposed to know what is going on 😁 Maybe an example path and/vs URL could help.

Member Author

I can see how that is confusing. I'll try to better explain/exemplify those two paths.
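
To make the two paths concrete: the target path is the repository-relative name that clients request (it ends up in the download URL), while the local path is only where the file lives on the machine generating the metadata. A stdlib-only sketch, assuming a hypothetical base URL and a hypothetical `describe_target` helper that records length and a sha256 hash; it is not the `TargetFile` implementation:

```python
import hashlib
import os
import tempfile


def describe_target(target_path: str, local_path: str) -> dict:
    """Read the file at local_path and describe it under the name target_path."""
    with open(local_path, "rb") as f:
        data = f.read()
    return {
        "path": target_path,
        "length": len(data),
        "hashes": {"sha256": hashlib.sha256(data).hexdigest()},
    }


# Hypothetical file on the repository maintainer's machine.
with tempfile.TemporaryDirectory() as tmp:
    local_path = os.path.join(tmp, "basic_repo.py")
    with open(local_path, "wb") as f:
        f.write(b"print('hello')\n")

    info = describe_target("basic_repo.py", local_path)

# Hypothetical URL a client would build from the target path alone;
# the local path never appears in it.
download_url = "https://example.com/targets/" + info["path"]
```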

# The timestamp role guarantees freshness of the repository metadata. It does
# so by listing the latest snapshot (which in turn lists all the latest
# targets) metadata. Choosing a short expiration interval, requires clients to
# often re-download timestamp and thus immediately detect if new target files
Contributor

Doesn't the client always try to download timestamp?

Member Author

Ha! So much for "fully internalized spec". Thanks for catching this. I'll reword.
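
One way to read the point under discussion: since the client fetches timestamp on every update anyway, a short expiration interval mainly bounds how long a stale copy can be replayed. A minimal sketch of that check, assuming a hypothetical one-day interval; `is_expired` here is a standalone helper, not a library call:

```python
from datetime import datetime, timedelta, timezone


def is_expired(expires: datetime, now: datetime) -> bool:
    """Metadata is unusable once the reference time reaches 'expires'."""
    return now >= expires


signed_at = datetime(2021, 11, 18, tzinfo=timezone.utc)
# Hypothetical one-day expiration interval for timestamp metadata.
expires = signed_at + timedelta(days=1)

# A replayed copy is still accepted a few hours after signing ...
fresh = not is_expired(expires, signed_at + timedelta(hours=6))
# ... but rejected once the interval has passed.
stale = is_expired(expires, signed_at + timedelta(days=2))
```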

# Just assume we do out-of-band signing for keys we don't have
for key in [keys["root"], another_root_key] + new_root_keys:
    roles["root"].sign(SSlibSigner(key), append=True)

Contributor

I think the part "signing with a threshold of old keys AND a threshold of the new keys" needs an emphasis.

Member Author

➕ I absolutely agree.
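
The rule being emphasized can be sketched without any crypto. The following assumes signatures have already been verified and reduced to a set of key IDs; `meets_threshold` and `root_rotation_ok` are hypothetical helper names, not Metadata API calls:

```python
def meets_threshold(signing_keyids: set, trusted_keyids: set, threshold: int) -> bool:
    """True if at least 'threshold' distinct trusted keys have signed."""
    return len(signing_keyids & trusted_keyids) >= threshold


def root_rotation_ok(
    signing_keyids: set,
    old_keyids: set,
    old_threshold: int,
    new_keyids: set,
    new_threshold: int,
) -> bool:
    """The new root must satisfy the old root's threshold AND its own."""
    return meets_threshold(
        signing_keyids, old_keyids, old_threshold
    ) and meets_threshold(signing_keyids, new_keyids, new_threshold)


# Hypothetical key IDs for illustration only.
old_keys = {"old1", "old2"}
new_keys = {"new1", "new2"}

signed_by_both = root_rotation_ok(old_keys | new_keys, old_keys, 2, new_keys, 2)
signed_by_new_only = root_rotation_ok(new_keys, old_keys, 2, new_keys, 2)
```

Signing with only the new keys fails here, which is why the example signs with the old and the new root keys together.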

@coveralls

coveralls commented Nov 23, 2021

Pull Request Test Coverage Report for Build 1515912733

  • 69 of 69 (100.0%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.1%) to 97.578%

Totals Coverage Status
Change from base Build 1503123861: 0.1%
Covered Lines: 4029
Relevant Lines: 4113

💛 - Coveralls

@lukpueh lukpueh changed the title from "Docs: Add repository tutorial based on metadata API (WIP)" to "Docs: Add repository tutorial based on metadata API" Nov 23, 2021
@lukpueh lukpueh marked this pull request as ready for review November 23, 2021 14:05
@lukpueh
Member Author

lukpueh commented Nov 23, 2021

Thanks for your review, @sechkova! I just finalised the example texts and squashed all commits into one. Your comments should be addressed in another commit on top of that for re-review convenience, i.e. 2fc5945.

I'd appreciate another look. Also see the updated PR description, especially the "Questions for reviewers".

@lukpueh
Member Author

lukpueh commented Nov 25, 2021

I just added very basic testing that runs the example script using exec and checks if the metadata files were created as expected. It currently only checks for file existence, but it wouldn't be hard to add more thorough inspection of the created files.

Note that the test class is intended to be re-used for a similar example script that I have in the pipeline.
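
The testing approach described above might look roughly like the sketch below. It is not the test module from this PR: `EXAMPLE_SOURCE` is a hypothetical stand-in for an example script, and the file names are assumed for illustration.

```python
import os
import tempfile
import unittest

# Hypothetical stand-in for an example script that writes metadata files
# into the current working directory.
EXAMPLE_SOURCE = """
for name in ("1.root.json", "timestamp.json"):
    with open(name, "w") as f:
        f.write("{}")
"""


class TestExample(unittest.TestCase):
    def setUp(self):
        # Run the script in a throwaway directory so nothing leaks out.
        self.cwd = os.getcwd()
        self.tmp = tempfile.TemporaryDirectory()
        os.chdir(self.tmp.name)

    def tearDown(self):
        os.chdir(self.cwd)
        self.tmp.cleanup()

    def test_files_created(self):
        # Execute the script source, then only check for file existence.
        exec(EXAMPLE_SOURCE, {"__name__": "__main__"})
        for name in ("1.root.json", "timestamp.json"):
            self.assertTrue(os.path.isfile(name), name)
```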

Contributor

@sechkova sechkova left a comment

Would be nice if more people provide feedback but you have my approval 😁
I also filled in the questionnaire.

Is this readable enough? Should this rather be a markdown document?

No strong opinion here, I'm ok with this one

Is it too broad? Should we move some usage scenarios to different files?
E.g. hashed bin delegation will be in a separate file (see follow-up PR)

Nope, I like it.

How can we demonstrate validity? With a TrustedMetadata example?

Seems like the options are either a full Updater with a server or using TrustedMetadata by providing it with the metadata files as bytes. The second option sounds more interesting to me :)

How can we demonstrate usage? With a client example follow-up?

I have nothing better to propose.

Member

@joshuagl joshuagl left a comment

Thanks for this @lukpueh, beautiful work – very clear example and a testament to the Metadata API. 🎉

Is this readable enough? Should this rather be a markdown document?

To me this is very readable, appreciate the detail and care here.

Is it too broad? Should we move some usage scenarios to different files?

LGTM as is. Covers the common cases, I think.

How can we demonstrate validity? With a TrustedMetadata example?

Implicit in the below? And future work once the proposed repository API validation piece is implemented? i.e. I think this is a solid submission as-is.

How can we demonstrate usage? With a client example follow-up?

Could be a simple follow-on that runs an HTTP server over the data generated here and uses ngclient to fetch the Python file?

# from within other metadata, and thus allows for repository consistency in
# addition to protecting against rollback attacks.
#
# The expiry date, protects against freeze attacks and allows for implicit key
Member

consistency:

Suggested change
# The expiry date, protects against freeze attacks and allows for implicit key
# The 'expiry' date protects against freeze attacks and allows for implicit key

Member Author

The field name is 'expires'. Should we write "The 'expires' date... "? Does consistency beat grammar? :D
Regardless, I'll remove the stray comma.
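
For readers following the quoted comment, the rollback protection tied to the version field can be sketched as a simple comparison. This is an illustration under the assumption that a client remembers the last version it trusted; `check_rollback` is a hypothetical helper, not a Metadata API function:

```python
def check_rollback(trusted_version: int, new_version: int) -> None:
    """Reject metadata whose version is lower than the one already trusted."""
    if new_version < trusted_version:
        raise ValueError(
            f"rollback: new version {new_version} < trusted version {trusted_version}"
        )


# Accepting a newer version passes silently.
check_rollback(trusted_version=3, new_version=4)

# Being served an older version is refused.
try:
    check_rollback(trusted_version=3, new_version=2)
    rejected = False
except ValueError:
    rejected = True
```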

Member

@joshuagl joshuagl Nov 29, 2021

🤦 of course.

I think it is better to bind the comments to the variables, maybe:
"The date the metadata 'expires'..." ?

Member

@jku jku left a comment

LGTM, I think this is fine as a stopgap: it demonstrates the basic elements that a repository needs to handle, without pretending to actually be a repository implementation.

My reaction to the "is it too broad" question is that we shouldn't put a lot of effort into expanding or splitting it (although the hashed bin example should be useful), if we can push the repository library and its example code (or a repository tool) forward instead...

The repository validation question is more interesting... but even there maybe the answer isn't to expand this tutorial code.

On the usage question: this could be a follow-up issue that is solved when initial versions of both this and the client example are present. Possible solution:

  • document how to serve the generated directory over http
  • add a "bootstrap from this root.json" option to the client so it can be used with the generated repository
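
Serving the generated directory could be as small as the stdlib sketch below. It assumes a hypothetical directory containing a single `1.root.json` and fetches it with urllib rather than a TUF client, so it only shows the serving half of the idea:

```python
import functools
import http.server
import os
import tempfile
import threading
import urllib.request

with tempfile.TemporaryDirectory() as repo_dir:
    # Hypothetical stand-in for the directory the example script generates.
    with open(os.path.join(repo_dir, "1.root.json"), "w") as f:
        f.write('{"signed": {}, "signatures": []}')

    # Serve repo_dir on a free localhost port (port 0 lets the OS choose).
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=repo_dir
    )
    server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]

    # Bypass any configured proxy for this localhost request.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(f"http://127.0.0.1:{port}/1.root.json") as response:
        body = response.read().decode()

    server.shutdown()
    server.server_close()
```

For manual experiments, running `python -m http.server --directory <dir>` from a shell should achieve the same thing.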

@lukpueh
Member Author

lukpueh commented Nov 29, 2021

Thanks for the thorough reviews, @sechkova, @joshuagl and @jku! I addressed your comments in 4327214 and 83764c6 (the latter in response to #1685 (comment)).

May I ask for another quick round of LGTMs? (NOTE: the commits are labeled WIP, because I intend to squash them before merging.)

Member

@jku jku left a comment

👍

Member

@joshuagl joshuagl left a comment

LGTM, thanks!

Collaborator

@MVrachev MVrachev left a comment

I just read it and liked the comments. Compared to tuf/repository_tool.py, this is such an improvement and simplification.
Great work!

As 'repository_tool' and 'repository_lib' are being deprecated,
repository metadata must be created and maintained manually
using the low-level Metadata API. The added example code shall
serve as a temporary replacement until a new repository tool is
available.

The sample code contains the following repo workflows:
 - creation of top-level metadata
 - target file handling
 - consistent snapshots
 - key management
 - top-level delegation and signing thresholds
 - target delegation
 - in-band and out-of-band metadata signing
 - writing and reading metadata files
 - root key rotation

Co-authored-by: Teodora Sechkova <tsechkova@vmware.com>
Co-authored-by: Joshua Lock <jlock@vmware.com>
Co-authored-by: Jussi Kukkonen <jku@goto.fi>

Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>

Adds new test module that executes the basic repo example
Python script and checks that it created certain (metadata)
files.

The test module is tailored for testing similar example scripts.

Co-authored-by: Joshua Lock <jlock@vmware.com>
Co-authored-by: Jussi Kukkonen <jku@goto.fi>

Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
@lukpueh
Member Author

lukpueh commented Nov 29, 2021

Thanks, everyone!
