Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Metadata API: change meta type in Timestamp #1446

Merged
merged 1 commit into from
Sep 20, 2021

Conversation

MVrachev
Copy link
Collaborator

@MVrachev MVrachev commented Jun 14, 2021

Fixes #1443

Description of the changes being introduced by the pull request:

In Timestamp, the only valid "meta" value is the dictionary representing
meta information for the snapshot file. This makes the API unnecessarily
complicated and requires validation that only information about snapshot
is available inside "meta".
Together with the python-tuf maintainers we decided that snapshot meta
information will not be represented by a "meta" dictionary but instead
by a MetaFile instance and with this it will diverge from the
specification.
This decision is coherent with ADR9 and the rationale
behind it is to provide easier, safer, and direct access to the
snapshot meta information.

Signed-off-by: Martin Vrachev mvrachev@vmware.com

History of modifications made on the pr:

  • Proposing snapshot_meta as a property with a setter and getter.
  • Adding annotations
  • Suggestion to remove the update function for Timestamp, will be handled in New API: Revise "update" methods in tuf/api/metadata.py #1230
  • Documenting that snapshot_meta as the "API for interacting with snapshot meta information"
    and that meta is an implementation detail that mimics the file format.
  • Decided with the python-tuf maintainers that we will change the approach to this problem
    and we will change the timestamp.meta type to MetaFile.

Please verify and check that the pull request fulfills the following
requirements
:

  • The code follows the Code Style Guidelines
  • Tests have been added for the bug fix or new feature
  • Docs have been added for the bug fix or new feature

Copy link
Member

@jku jku left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There may still be a discussion on "is this the way we want to solve the issue", but the patch is small so why not finish anyway...

Left a couple of small comments, also note that there is a third reference to meta['snapshot.json'] in the tests you could change (it probably wasn't there when you made this patch)

tuf/api/metadata.py Show resolved Hide resolved
tuf/api/metadata.py Outdated Show resolved Hide resolved
@jku
Copy link
Member

jku commented Jun 30, 2021

There may still be a discussion on "is this the way we want to solve the issue"

Expanding on this a bit: The context is that validating something like meta is a bit tricky and is affected by the API we provide:

  • providing a helper is already useful and makes safe API use easier but...
  • If we leave meta (the plain dictionary) easily editable (and documented as being part of the API), then we have to assume it can contain just about anything. Maybe it's fine and we just assume deserialization fails gracefully when that happens (it does seem to)?
  • If we don't want that to advertize meta as being editable, then we'd want to either
    • hide meta attribute (diverging from file format) and make snapshot_meta an actual instance variable or
    • make meta a property that is generated from the actual snapshot_meta variable or
    • annotate/document it as read-only

The last two options of course mean more code in the API itself -- and don't provide any real value. The real options are

  1. this pr
  2. remove meta, and only provide snapshot_meta attribute

Externally snapshot_meta will look the same in both cases so I guess this is good to go whatever we decide in the future?

@MVrachev MVrachev force-pushed the snapshot-property branch 2 times, most recently from 135fc69 to 06a1914 Compare July 12, 2021 14:15
@MVrachev
Copy link
Collaborator Author

MVrachev commented Jul 12, 2021

Thanks, @jku for all of your comments.

I think this pretty clearly shows that the update() function has no utility: your property setter now does the actual work with better ergonomics.
I think we should remove update()

I agree with this, but I want to remove this one and change the other update() functions in a separate pr where we can
discuss those things. The issue related to that: #1230

I agree that meta shouldn't contain whatever the user wants and we should add validation.
I proposed a possible solution that won't change the metadata structure and will perform some validation when calling
__init__ or directly changing timestamp.meta.

If we decide to remove meta and not follow strictly the specification that's another situation and this validation won't be required.

What do you think of my proposal?

@jku
Copy link
Member

jku commented Jul 12, 2021

If we decide to remove meta and not follow strictly the specification

Not following the spec is not an option on the table, we do want to implement it. The decisions on

  • whether we follow the spec and
  • whether we implement our internal data structures in a way that mimics the file format

are not related to each other and should not be mixed.

I'm not a big fan of adding multiple seemingly supported APIs for one purpose (as I think you are doing with meta property and snapshot_meta property). Also, validating dictionaries like that feels a bit futile as there are so many other ways to modify them:

timestamp.meta = {"snapshot.json": MetaFile(1)}
timestamp.meta.clear()
# oops, we now have a invalid timestamp even with the added validation

Validating standard containers is probably only possible at serialization time or by hiding the container from the caller behind a more restrictive API (like add_key/remove_key): this is what I meant when I said the other day that the more interesting validating discussions are related to the containers, not the individual attributes or classes.

But in this specific case I think we don't need to go that far: documenting timestamp.snapshot_meta as the "API for interacting with snapshot meta information" and documenting timestamp.meta as an implementation detail that mimics the file format is fine: Let's make it easy to do the right thing, instead of trying to prevent the wrong thing from happening (former is easier and leads to a better API as a side effect).

I think the only reasonable option would be to not keep meta dict in the API at all (in other words stop mimicking the file format) but there does not seem to be consensus on doing that.

@MVrachev
Copy link
Collaborator Author

Okay, I documented meta and snapshot_meta the way you suggested.
Do you think that's a suitable version?

@MVrachev
Copy link
Collaborator Author

I think I have clarified that the user should use snapshot_meta preferably even more.

@MVrachev
Copy link
Collaborator Author

I have updated the pr description describing the pr modifications that were made in the pr.
The pr is ready for another review.

@sechkova
Copy link
Contributor

Can you also fix the mypy errors in the CI?

@MVrachev
Copy link
Collaborator Author

Yes, I forgot to do it.
It's done.

Copy link
Contributor

@sechkova sechkova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keeping both a property and an internal dictionary for snapshot seems like an overkill to me (in this particular case).
I share the opinion that we should drop the dictionary.

@MVrachev MVrachev added blocked Blocked by external factors discussion Discussions related to the design, implementation and operation of the project labels Aug 19, 2021
@MVrachev
Copy link
Collaborator Author

This issue is blocked until we come up with an agreement what to do with meta in Timestamp
after adding snapshot_meta property.

@MVrachev
Copy link
Collaborator Author

After #1547 got merged this pr will be unblocked.

@MVrachev MVrachev force-pushed the snapshot-property branch 2 times, most recently from 7bb05bd to dc2fc1f Compare August 31, 2021 11:22
@MVrachev
Copy link
Collaborator Author

This pr is no longer blocked.
After a discussion with the python-tuf maintainers, we decide that:

Snapshot meta-information will not be represented by a "meta" dictionary,
but by a MetaFile instance, and with this, it diverges from the specification.
This decision is coherent with ADR9 and the rationale behind it is to provide easier, safer,
and direct access to the snapshot meta information.

That's why I will update the pr naming and description.

@MVrachev MVrachev removed blocked Blocked by external factors discussion Discussions related to the design, implementation and operation of the project labels Aug 31, 2021
@MVrachev MVrachev requested review from jku and sechkova August 31, 2021 11:24
@MVrachev MVrachev changed the title Metadata API: snapshot_meta property in Timestamp Metadata API: change meta type in Timestamp Aug 31, 2021
tuf/api/metadata.py Outdated Show resolved Hide resolved
tuf/api/metadata.py Outdated Show resolved Hide resolved
@MVrachev
Copy link
Collaborator Author

I reword the doc strings for the Timestamp class and renamed meta to snapshot.

@coveralls
Copy link

coveralls commented Sep 13, 2021

Pull Request Test Coverage Report for Build 1244983582

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 10 of 10 (100.0%) changed or added relevant lines in 3 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 98.162%

Totals Coverage Status
Change from base Build 1228471781: 0.0%
Covered Lines: 3793
Relevant Lines: 3830

💛 - Coveralls

Copy link
Contributor

@sechkova sechkova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know we agreed on snapshot as the name of the Timestamp class member but when using it as a single variable I've suggested snapshot_meta, to avoid confusion.

The rest LGTM.

tuf/api/metadata.py Outdated Show resolved Hide resolved
tuf/ngclient/_internal/trusted_metadata_set.py Outdated Show resolved Hide resolved
tuf/ngclient/_internal/trusted_metadata_set.py Outdated Show resolved Hide resolved
tuf/ngclient/updater.py Outdated Show resolved Hide resolved
@MVrachev MVrachev force-pushed the snapshot-property branch 2 times, most recently from c332a84 to 784a445 Compare September 15, 2021 16:55
Copy link
Member

@jku jku left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good I think. Timestamp.update() is now even weirder than before but you've said you'd rather remove it in a separate change: this sounds good to me.

The names in the attribute chains do end up looking a bit goofy when the name is just "snapshot" (and data type is not actually Snapshot).... @joshuagl do you have an opinion on the naming (looking at test code with e.g. self.timestamp.snapshot and timestamp.signed.snapshot). I think I can live with this but wouldn't object to calling this attribute snapshot_meta either 🤷

@@ -350,11 +350,11 @@ def test_metadata_timestamp(self):
fileinfo = MetaFile(2, 520, hashes)

self.assertNotEqual(
timestamp.signed.meta['snapshot.json'].to_dict(), fileinfo.to_dict()
timestamp.signed.snapshot.to_dict(), fileinfo.to_dict()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is an example of code that looks a bit weird... we have a timestamp that is not a Timestamp, and a snapshot that is not a Snapshot -- the first one is just local variable naming of course, but naming that our API design easily leads to...

@joshuagl
Copy link
Member

The names in the attribute chains do end up looking a bit goofy when the name is just "snapshot" (and data type is not actually Snapshot).... @joshuagl do you have an opinion on the naming (looking at test code with e.g. self.timestamp.snapshot and timestamp.signed.snapshot). I think I can live with this but wouldn't object to calling this attribute snapshot_meta either 🤷

While I appreciate that it's a long variable name, I am fond of snapshot_meta. It retains some symmetry with the file formats in the spec without adding the extra layer of indirection.

@MVrachev
Copy link
Collaborator Author

MVrachev commented Sep 17, 2021

The names in the attribute chains do end up looking a bit goofy when the name is just "snapshot" (and data type is not actually Snapshot).... @joshuagl do you have an opinion on the naming (looking at test code with e.g. self.timestamp.snapshot and timestamp.signed.snapshot). I think I can live with this but wouldn't object to calling this attribute snapshot_meta either 🤷

While I appreciate that it's a long variable name, I am fond of snapshot_meta. It retains some symmetry with the file formats in the spec without adding the extra layer of indirection.

For sure snapshot_meta is clearer than just snapshot. We should shorten our names only if we are sure it won't cause confusion and as it seems that both @jku and @joshuagl have some concerns and that's why it seems better to stick to the slightly longer and descriptive snapshot_meta.
Will rename it now.

@MVrachev
Copy link
Collaborator Author

Done, renamed snapshot to snapshot_meta.
It seems there are newer versions of pylint which require fixing global-variable-not-assigned error.
Do you want me to fix that here or in another pr?

@MVrachev MVrachev requested a review from jku September 17, 2021 09:57
@jku
Copy link
Member

jku commented Sep 17, 2021

Do you want me to fix that here or in another pr?

no conflicts expected so separate PR seems like it would not be much more work?

@MVrachev
Copy link
Collaborator Author

Do you want me to fix that here or in another pr?

no conflicts expected so separate PR seems like it would not be much more work?

It won't be. Will post a pr today.

@MVrachev
Copy link
Collaborator Author

Do you want me to fix that here or in another pr?

no conflicts expected so separate PR seems like it would not be much more work?

Fixed in #1584.

@jku jku dismissed sechkova’s stale review September 20, 2021 08:40

no longer relevant

@jku
Copy link
Member

jku commented Sep 20, 2021

Can you rebase please (branch protection does not allow merging with failing checks)

In Timestamp, the only valid "meta" value is the dictionary representing
meta information for the snapshot file. This makes the API unnecessarily
complicated and requires validation that only information about snapshot
is available inside "meta".
Together with the python-tuf maintainers, we decided that snapshot meta
information will not be represented by a "meta" dictionary but instead
by a MetaFile instance and with this it will diverge from the
specification.
Additionally, to prevent confusion, I will rename the "meta" attribute
to "snapshot_meta" as this attribute will be related only to meta
information about snapshot.

This decision is coherent with ADR9 and the rationale
behind it is to provide easier, safer, and direct access to the
snapshot meta information.

Signed-off-by: Martin Vrachev <mvrachev@vmware.com>
@MVrachev MVrachev requested a review from jku September 20, 2021 11:10
@jku jku merged commit 67c5298 into theupdateframework:develop Sep 20, 2021
@MVrachev MVrachev deleted the snapshot-property branch September 20, 2021 12:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Metadata API: Timestamp should provide easy access to snapshot MetaFile
5 participants