Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate the JSON literal report to proper types #350

Merged
merged 2 commits into from
Nov 3, 2022

Conversation

Gankra
Copy link
Contributor

@Gankra Gankra commented Oct 31, 2022

This makes it more plausible for this format to be made available in a subcrate to allow other libraries/tools to easily deserialize it. Also makes it more obvious the places where we're being weirdly inconsistent or exposing random internals because I was lazy/hasty.

A weird side-effect of this is that the order of the fields got shuffled around. It seems the json macro will sort fields by name, while proper derives use the decl order. The latter seems preferred anyway, but unfortunately we take a huge churn to the snapshots, obfuscating that this is a semantic noop.

@Gankra
Copy link
Contributor Author

Gankra commented Oct 31, 2022

This is part 1 of the required support of #330 which I wanted to split out because it churns all the test files :(

This makes it more plausible for this format to be made available in a subcrate to allow other
libraries/tools to easily deserialize it. Also makes it more obvious the places where we're
being weirdly inconsistent or exposing random internals because I was lazy/hasty.

A weird side-effect of this is that the order of the fields got shuffled around. It seems
the json macro will sort fields by name, while proper derives use the decl order. The latter
seems preferred anyway, but unfortunately we take a huge churn to the snapshots, obfuscating that this is a semantic noop.
@Gankra
Copy link
Contributor Author

Gankra commented Oct 31, 2022

Added doc-comments to all the new types

Copy link
Collaborator

@mystor mystor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a clear improvement over what we have, I think it does a bulk of what I wanted from #312 (other than having documentation in the book), but also does bring some highlights onto the weird parts of the output format.

It the bot workflow is using any of the specific parts of the json output which are a bit weird and I commented on, it might be nice to refine those outputs into something less weird sooner rather than later so that changing it isn't quite as painful in the future, but that doesn't need to block merging this.

/// The criteria we recommend auditing the package for
pub suggested_criteria: Vec<CriteriaName>,
/// The diff (or full version) we recommend auditing
pub suggested_diff: DiffRecommendation,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize this is existing, but I don't love how we expose the diff internal representation to the json output. It would be nice if we could have this use a delta-or-version like format like we do for the actual audits.toml format, so that the fact that full diffs are "null -> 1.0.0" isn't visible in public APIs.

The raw diffstat being encoded directly in there is also a bit weird.

/// The name of the package
pub name: PackageName,
/// Any notable parents the package has (can be helpful in giving context to the user)
pub notable_parents: String,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize this is existing, but it's a bit odd that we encode this as a string.

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct JsonReportFailForViolationConflict {
/// These packages have the following conflicts
pub violations: SortedMap<PackageAndVersion, Vec<ViolationConflict>>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is existing, but the fact we expose the internal ViolationConflict struct isn't the best API here.

Could we perhaps put a comment on some of these suboptimal API parts noting that we probably want to change this part of the json output? Now that we're actually specifying it, and building more complex tools on it, I'm more afraid about people depending on the current janky format and making it tricky for us to improve it.

@Gankra
Copy link
Contributor Author

Gankra commented Nov 1, 2022

I added FIXMEs for the places you pointed out.

I agree we should do a second pass on the schema before shipping an MVP of the bot/webapp. I don't expect the MVP to ship soon (lots of back and forth discussion and iteration on the design is inevitable), so I think landing this as-is to streamline future work on the schema is a good idea.

(I have some vague ideas around auto-generating json schemas and stuff so maybe a new library/tool is in the cards here.)

@mystor mystor merged commit 2a7efa8 into mozilla:main Nov 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants