-
Notifications
You must be signed in to change notification settings - Fork 310
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Upgrade ScanCode to version 32.1.0 #8470
Conversation
8026459
to
4e226aa
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commit message: Could you explain the change a bit more detailed? IIUC license findings that are references to license findings in other files are now ignored, because they already appear as findings for those other files?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly. I've used your wording for the commit message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added another comment to the ScanCode issue, it could be considered here (and in the commit message) as well: aboutcode-org/scancode-toolkit#3648 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that keeping findings that are in fact references is not currently supported by the ORT data model. That's because line ranges always have to refer to the file from the finding, not a references file. As such I consider (having an option for) keeping references to be out of scope of this PR as it would require larger refactorings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that keeping findings that are in fact references is not currently supported by the ORT data model. That's because line ranges always have to refer to the file from the finding, not a references file.
This indicates to me that you likely didn't get my point. Consider this:
- In older ScanCode versions we had
License-scancode-unkown-license-reference
findings
with start / endLine pointing to the scanned file. - In newer ScanCode version findings such as 1. do not appear, but they are replaced with the licenses
detected in the referenced file.
This commit now filters out the findings. So, compared to older ScanCode version scan results the License-scancode-unkown-license-reference
findings are missing. These missing findings can be represented in ORT's model, as the start / endLine corresponds to the actual file path. I think this is undesired and should be avoided.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This commit now filters out the findings.
That's not correct, and I believe it's you who might be misunderstanding something.
When taking about "references" in this PR, this is not about findings that are named like License-scancode-unkown-license-reference
. It is about literal entries in the raw ScanCode JSON, e.g. a LGPL-2.1-only
finding in a file named COPYING
, whereas the finding actually comes from file COPYING.LGPLv2.1
and the text in COPYING
just refers to file COPYING.LGPLv2.1
. Please read through #8190 again to get the full picture.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In any case, I've asked for clarification over here.
See [1]. Resolves #8457. [1]: https://github.com/nexB/scancode-toolkit/releases/tag/v32.1.0 Signed-off-by: Sebastian Schuberth <sebastian@doubleopen.org>
See [1] for context and the test data used. [1]: aboutcode-org/scancode-toolkit#3648 Signed-off-by: Sebastian Schuberth <sebastian@doubleopen.org>
cbffdf2
to
38cc437
Compare
…nces License findings that are references to license findings in other files are now ignored, because they already appear as findings for those other files. Fixes #8190. Signed-off-by: Sebastian Schuberth <sebastian@doubleopen.org>
Make use of a new field added with ScanCode 32.1.0 in output file format 3.1.0. Note that in the per-file `detectedLicenseExpression(Spdx)` fields ScanCode seems to consolidate e.g. separate `gpl-2.0` and `gpl-2.0-plus` matches to just `gpl-2.0-plus`, which seems to be a bug in ScanCode. This is why this change also needs to update a test where previous the ScanCode key to SPDX ID mapping did not succeed if ScanCode was run without `--license-references`. Signed-off-by: Sebastian Schuberth <sebastian@doubleopen.org>
38cc437
to
c16d1ce
Compare
Please have a look at the individual commit messages for the details.