-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
evaluator.rules: Add handling for further license categories #77
Conversation
|
||
issue( | ||
Severity.ERROR, | ||
"The dependency ${pkg.metadata.id.toCoordinates()} is licensed under the ScanCode 'commercial' " + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realized too late that we're already doing this in other rules, but isn't it a bit risky to refer to ScanCode as a scanner here? This basically means users should not add commercial licenses from other scanners to the license-classifications.yml
's commercial
category, which creates an unnecessary tight coupling to the contents of evaluator.rules.kts
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rules refer to the category as defined in ScanCode. So, I do not see a problem with it, in particular since license-classifications.yml
is generate from ScanCode license DB.
The error messages are consistent and follow the current approach. Changing that IMO is out of scope of this PR. We can discuss outside of this PR whether we want to ditch the mention of ScanCode from the error message, if this is what you're after.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO as long as classifications are generated from ScanCode it's good to mention it here as well to make clear where the classification is coming from.
@@ -43,11 +43,22 @@ val commercialLicenses = getLicensesForCategory("commercial") | |||
val copyleftLicenses = getLicensesForCategory("copyleft") | |||
val copyleftLimitedLicenses = getLicensesForCategory("copyleft-limited") | |||
val freeRestrictedLicenses = getLicensesForCategory("free-restricted") | |||
val genericLicenses = getLicensesForCategory("generic") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On a related note, ScanCode 31.0.1 introduced an is_generic
flag in its output (see e.g. here for LicenseRef-scancode-generic-cla
). Maybe we should add support for such a flag in LicenseFinding
, upgrade ScanCode (and the parser to populate that new field), and rely on that property instead of hard-coding license names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This category is not a scancode category but is created from the is_generic
flag you mentioned. So, that is already taken into account.
I do not see how this can related to eliminating the below hard-coded license ID list.
Can you explain why you think this could be eliminated?
@@ -47,16 +47,22 @@ val genericLicenses = getLicensesForCategory("generic") | |||
val permissiveLicenses = getLicensesForCategory("permissive") | |||
val proprietaryFreeLicenses = getLicensesForCategory("proprietary-free") | |||
val publicDomainLicenses = getLicensesForCategory("public-domain") | |||
val unknownLicenses = getLicensesForCategory("unknown") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly here, we could add support for ScanCode's is_unknown
to our data model and handle unknown findings generically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my comment on the "generic" category discussion. It is analog.
4d8489a
to
a300116
Compare
a300116
to
4ce040c
Compare
Signed-off-by: Frank Viernau <frank_viernau@epam.com>
See also [1]. [1] https://scancode-licensedb.aboutcode.org/help.html#license-categories. Signed-off-by: Frank Viernau <frank_viernau@epam.com>
Make the violation as severe as a copyleft violation. Signed-off-by: Frank Viernau <frank_viernau@epam.com>
This makes the code more compact. Signed-off-by: Frank Viernau <frank_viernau@epam.com>
See also [1]. [1] https://scancode-licensedb.aboutcode.org/help.html#license-categories. Signed-off-by: Frank Viernau <frank_viernau@epam.com>
The handling for the 'unstated-licenses' category has been implemented, so this comment has become obsolete. Signed-off-by: Frank Viernau <frank_viernau@epam.com>
4ce040c
to
936c2ee
Compare
|
||
issue( | ||
Severity.ERROR, | ||
"The dependency ${pkg.metadata.id.toCoordinates()} is licensed under the ScanCode 'commercial' " + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO as long as classifications are generated from ScanCode it's good to mention it here as well to make clear where the classification is coming from.
val permissiveLicenses = getLicensesForCategory("permissive") | ||
val proprietaryFreeLicenses = getLicensesForCategory("proprietary-free") | ||
val publicDomainLicenses = getLicensesForCategory("public-domain") | ||
val unstatedLicenses = getLicensesForCategory("unstated-license") | ||
|
||
// Set of licenses, which are not acted upon by the below policy rules. | ||
val ignoredLicenses = listOf( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this not the same as genericLicenses
? After reading the commit message I would have expected that all generic licenses are ignored?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The set of ignored licenses is currently a real subset of the union of 'generic' and 'unknown'.
It contains only some of the generic licenses which I believe should not be investigated.
Which part of the commit message makes you believe that all generic licenses are ignored?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't obvious why a subset was chosen because the explanation sounds like it applies to generic licenses in general. With the added comments it's clearer.
// 'unkown' category: | ||
"LicenseRef-scancode-free-unknown", | ||
"LicenseRef-scancode-license-file-reference", | ||
"LicenseRef-scancode-see-license" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to the question in the previous commit, what makes these unknown licenses special so that they shall be ignored but not others from the unknownLicenses
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my view it's waste of time to investigate these.
Consider e.g. LicenseRef-scancode-see-license
flags "See license in LICENSE".
How is that actionable? Create a curation which turns the detected license into NONE because the LICENSE file is scanned anyway? ...No, that's waste of time IMO and does add value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've removed 'free-unkown' from the list after looking again at it's triggers.
Even though it flags on things like: license: "MIT"
That gives quite a bit of things to investigate , as the effort is always limited, investigating 'free-unknow' may not be the best place to spend it. Anyhow, let's not add it to the ignore list for now to move this PR forward.
The generic license category contains license IDs which do not represent an exact license, but triggers which only indicate that the tooling has found something which may be a license. Generic licenses do require investigation, as (if the finding is valid) a dedicated license ID needs to be created. As some of these triggers match quite broadly huge amount of manual effort can be created. Requiring the investigation of all generic triggers in fact only leads to moving the focus from working on the most important things towards working mostly on irrelevant details with the result of not getting anything done anymore. So, introduce a hard-coded list of `ignoredLicenses` containing license IDs which should not be flagged, to avoid that problem. Signed-off-by: Frank Viernau <frank_viernau@epam.com>
Handle the 'unkown' licenses in the same way as the 'generic' licenses, because the associated license IDs also do not represent real licenses but just triggers which indicate that the tooling found something it does not know but may be relevant. Signed-off-by: Frank Viernau <frank_viernau@epam.com>
Signed-off-by: Frank Viernau <frank_viernau@epam.com>
d726f69
to
c2496af
Compare
See individual commits.
Context: #66.