Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add validation guidelines #1130

Open
lukpueh opened this issue Sep 10, 2020 · 6 comments
Open

Add validation guidelines #1130

lukpueh opened this issue Sep 10, 2020 · 6 comments
Assignees
Labels
backlog Issues to address with priority for current development goals discussion Discussions related to the design, implementation and operation of the project documentation Documentation of the project as well as procedural documentation

Comments

@lukpueh
Copy link
Member

lukpueh commented Sep 10, 2020

There seems to be agreement to discontinue the securesystemslib schema facility (see secure-systems-lab/securesystemslib#183). We still need to be able to validate all inputs at the user boundary (type annotations should make this a lot easier), and provide tools to check if metadata is spec compliant (maybe we can use something like JSON schema?).
At any rate, it would be helpful for contributors to provide guidelines for validation.

@lukpueh
Copy link
Member Author

lukpueh commented Sep 10, 2020

Using in-toto style ValidationMixin might be a commendable way to validate metadata in memory (see the mixin and it's usage for details).

@lukpueh lukpueh changed the title **Add validation guidelines** Add validation guidelines Sep 10, 2020
@lukpueh lukpueh added discussion Discussions related to the design, implementation and operation of the project documentation Documentation of the project as well as procedural documentation labels Sep 10, 2020
@lukpueh
Copy link
Member Author

lukpueh commented Dec 1, 2020

secure-systems-lab/code-style-guidelines#18 has an interesting discussion about input validation, control flow and program consistency.

@joshuagl
Copy link
Member

The approach I would take to this research project, is:

Understand the current validation mechanism in use:

  • Review securesystemslib.schema, its purpose and its flaws.
  • Review in-toto's ValidationMixin, which validates metadata in memory utilising securesystemslib.schema (see example usage).

Review existing external/3rd-party solutions:

  • pydantic -- uses type annotations to provide data validation (runtime type hint validation) and settings management.
  • marshmallow -- uses schemas to provide simplified object serialisation, deserialisation, and input data validation.

Understand options for custom validation logic

For each of the three possible new approaches suggested above, I would expect some prototype code to be written to get a feel for how the approach fits with our new code. I'd be inclined to base on #1279, if it has not already been merged by the time we get to experimenting with new approaches.

Goals:
We want to be able to:

  • validate all inputs at the user boundary.
  • provide tools to check if metadata is specification compliant.

Outcomes:

  • Submit an ADR on input validation, summarising different approaches and making a recommendation for the project.

Considerations:

  • Are the stated goals appropriate/sufficient?
  • Which Python versions are supported by the various mechanisms explored?
  • pip is about to become a major user and has to vendor any dependencies we add. This may be a good argument for a custom implementation, or perhaps not if the transitive closure of dependencies to vendor is small.
  • Be aware of performance impact (see Revise schema and formats facility secure-systems-lab/securesystemslib#183 (comment)).
  • Other third-party solutions exist i.e. desert and typical.

Next steps:


See also, the related issue on input validation for metadata API: #1140

Other possibly useful references:

@MVrachev
Copy link
Collaborator

The initial version of the ADR addressing this issue is out: #1301.
It contains only two options for now ValidationMixin and pydantic.

@MVrachev
Copy link
Collaborator

Update on what has happened so far:

  1. I documented multiple validations options in ADR0007: How we are going to validate the new API codebase #1301.
  2. It was decided that I would create validation functions for each of the attributes before deciding which validation option is good for us from ADR 7.
  3. Created Prototype descriptors and validation functions #1366 to showcase how validation functions are supposed to work with python descriptors.
    After Jussi's comment here, we decided it's best to do research on how we use the metadata attributes and think about what might go wrong with them.
  4. I started creating issues for each of the metadata attributes in the format Metadata Attribute research - * where * is the name of the attribute. For example Metadata Attribute research: spec_version #1419, Metadata Attribute research: expires #1420.

I will unassign myself from this issue for now, because I am not actively working on validation guidelines ADR.
Before that, it's important to understand how do we want to operate and store all of the metadata attributes, provide validation functions for them and decide which validation option do we want to use from ADR7 or something totally new.

@MVrachev
Copy link
Collaborator

MVrachev commented Mar 17, 2022

Together with @lukpueh we have discussed that a formal ADR about validation guidelines seems too much of work and we are not sure we needed it as we have already implemented validation for all Metadata classes (see #1140 (comment)).

Even if there is no ADR there is a sense in providing some guidance about how the maintainers feel about validation, what validations options were discussed and what requirements should be taken into account when adding validation to python-tuf. It seems that the best place to answer those questions is in a blogpost published on https://theupdateframework.github.io/python-tuf/ and together with @lukpueh agree that this is the logical step that will close this issue.

@MVrachev MVrachev self-assigned this Mar 17, 2022
@MVrachev MVrachev added the backlog Issues to address with priority for current development goals label Mar 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backlog Issues to address with priority for current development goals discussion Discussions related to the design, implementation and operation of the project documentation Documentation of the project as well as procedural documentation
Projects
None yet
Development

No branches or pull requests

3 participants