-
Notifications
You must be signed in to change notification settings - Fork 381
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
License checker feature #501
Conversation
cmd/osv-scanner/main.go
Outdated
Name: "experimental-all-packages", | ||
Usage: "when json output is selected, prints all packages", | ||
}, | ||
&cli.BoolFlag{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in the json output when the licenses is specified without an allowlist, we want it to print all packages (and their licenses) by default. If that's the case, then I'm not sure how the all-packages and licenses flags should interact.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that we do have a all-packages flag, maybe licenses (without allowlist) shouldn't print all packages by default, and just fill in the field for packages that already exists? If someone wants full license info for everything, pass the all-packages flag.
@oliverchang thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to be sure I understand the alternatives here, is it:
--licenses
(without--all-packages
) will just ignore all Unknown licenses?--licenses
implies--all-packages
?
This is complicated by the behaviour for when an allow-list is passed, because to be consistent with the vulnerability scanning behaviour, it makes more sense for --licenses <ALLOWLIST>
to not imply --all-packages
.
Thinking about this a bit more, all packages should in fact have a license right? (as opposed to vulnerabilities). We just consider "Unknown" to be a license that's worthwhile reporting too, because from a license compliance perspective it's just as important.
From that perspective, 2. above seems to make more sense to me. Not necessarily because it implies --all-packages
for the sake of it, but rather that we do have something worthwhile to report for all packages when it comes to licenses.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay i see.
This is complicated by the behaviour for when an allow-list is passed, because to be consistent with the vulnerability scanning behaviour it makes more sense for --licenses to not imply --all-packages.
Thinking about this, I think rather than thinking that --licenses implies --all-packages, we can think of how --licenses with an empty allowlist means every license a violation, which means all packages should be printed in the json output.
This is the solution I'm tending towards:
- --all-packages has no effect on the table output. It will always be only the vulnerabilities in the vulns table, and if licenses is specified (regardless of allowlist), all the licenses and their counts in addition to the vulns table.
- in json output
--licenses=<allowlist>
prints only packages with license violations or packages with vulns. If allowlist is empty then all packages are violating licenses (the licenses field and license_violations field will be populated for every package) - in json output
--licenses=<allowlist>
with --all-packages prints all packages whether they have a vuln or license violation or not. The license_violations field is populated for every package that has a violation, license field populated for every package.
thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The JSON output makes sense to me.
Can you clarify what happens in the table output when --licenses
is passed in with an allowlist?
Would the packages that are violating the allowlist be shown individually? or only a list of violating licenses in a summary format?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was imagining that it would be the same as if there was no allowlist specified, it would have a a row per license and a column with count. If we did want to do something, we could add a column in the table when an allowlist is specified like 'violation' column with value either true or false.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and to clarify, all licenses, including the UNKNOWN license would be in that table whether allowlist is specified or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would a user know what package violated the allowlist without having to parse the JSON?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
they wouldn't.
my thinking was listing all individual licenses, or all license violations here may clutter the view. But you do make a good point as the vulns table shows the individual vulns, but this table would only shows a summary. it doesn't mirror the behaviour of the vulns perfectly.
we did discuss in the meeting adding a summary/all-packages flag to toggle this, but iirc we concluded that we will wait for feedback first and add flags or change the table format in response to user feedback.
did you have a solution in mind?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry! my bad my memory is wrong. For the table view we agreed to list a summary when allowlist is empty, and violations when allowlist isn't empty.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some more comments!
I'm getting this error in my tests, the difference seems to be the CVSS severity value of the first vuln. Looking at https://api.osv.dev/v1/vulns/CVE-2022-37434 it looks like the CVSS is reported by the API.
|
6b582a8
to
e741c98
Compare
osv.BatchedQuery We are planning to create deps.dev queries from these packages. Rather than use the osv.BatchedQuery type to construct deps.dev queries from, create a new type Package that includes package information and source information.
This makes the licenses available in the json output only. Making it available in table form and filtering by an allowlist is left as a TODO.
"no vulnerabilities found" to "no issues found"
between summary vs violations table
I don't know a lot about SARIF so maybe these are required but that would be surprising to me whereas trimming these off make it a bit consistent for editors and such. Obviously if these are required then lets close this PR
this is in preparation for the license scanning feature. the queries are structured around making requests to the osv API, we also will want to make requests to the deps.dev api. #501
maybe this will help with the failing test
Codecov Report
@@ Coverage Diff @@
## main #501 +/- ##
==========================================
- Coverage 78.82% 78.78% -0.04%
==========================================
Files 78 79 +1
Lines 5511 5742 +231
==========================================
+ Hits 4344 4524 +180
- Misses 990 1031 +41
- Partials 177 187 +10
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty good! Added some comments.
Ah I see, looks like we might need to update the linter and the go version in the checks. |
I want to use errors.Join in the following PR: - License checker feature google#501 It is a method added in go 1.20.
I want to use errors.Join in the following PR: - License checker feature #501 It is a method added in go 1.20.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
case errors.Is(err, osvscanner.OnlyUncalledVulnerabilitiesFoundErr): | ||
// TODO: Discuss whether to have a different exit code | ||
// now that running call analysis is not default. | ||
return 0b0010 // 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we're complicating the return codes here by having so many cases and different error types. If users do need to distinguish these, this should be done through another mechanism (e.g. the JSON output or models.VulnerabilityResults
output).
I suggest we just return 1 if there is any error at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yyeaahhh I'd say for now best to keep it simple and find out if people have a strong need for them - as you've eluded to @oliverchang, I think as well the scanner will grow to a point that fine-grain errors should be handled via e.g. JSON just because of all the different facets
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy to return 0 or 1 here and sync with @another-rex so that these changes can be done alongside his changes to make call analysis return 0 or 1 as well.
I may not get to implementing error details in the models.VulnerabilityResults
output soon but I'll file an issue for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Note that I don't think #675 is necessary -- all the data is there in models.VulnerabilityResults already.
|
||
return vulnerabilityResults, nil | ||
switch { | ||
case !vuln && !onlyUncalledVuln && !licenseViolation: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Returning to this :) Sorry I didn't get a chance to review this earlier.
Similar to my other comment, this is getting very complicated. I suggest we just fold this into a single VulnerabilitiesFoundErr
if there are any errors (including license errors).
If a user wants to distinguish between these errors more, they can just look at the models.VulnerabilityResults
result.
TODO: