-
-
Notifications
You must be signed in to change notification settings - Fork 561
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for external licenses in scans #2979
Add support for external licenses in scans #2979
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@KevinJi22 I think this is looking good so far! I left some notes about docstrings and style.
fda3b2d
to
93cc339
Compare
There is something in your changes that make the tests fail... |
93cc339
to
b1cf4aa
Compare
@pombredanne I'm actually combining all the licenses and rules in the input directories into a single index here. In load_licenses_from_multiple_dirs, all the licenses (including the ones in the ScanCode database) are combined into a single license database, and the same is done with all the rules. The _CACHED_DIRECTORIES variable is used to determine whether a new index needs to be built. For example, if we've created an index with additional directories A and B, but now we run a scan with additional directories B and C, we'd need to recreate the combined index. I'll update this code with comments so that this is more clear. |
93e90a4
to
ec988c5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a few nitpicks about formatting. I have an issue trying out the new -dir
option. I am running scancode -l -dir tests/licensedcode/data/example_external_licenses/example1 --json-pp ~/Desktop/out.json /home/jono/src/scancode-toolkit-kevinji22/tests/licensedcode/data/plugin_license/external_licenses/scan/
, however I am not seeing the custom license being detected in the scan results.
7696b97
to
c5afa46
Compare
Thanks for the feedback! This bug was actually due to how This means that running the scan with the same parameters should work this time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! After talking to @pombredanne, the next thing we want on top is to be able to install custom licenses using wheels. Something similar to the way we handle providing binaries to scancode via plugins: https://github.com/nexB/scancode-plugins/tree/main/builtins
We also merged the fix for the PyPI test in develop, so be sure to rebase your branch.
65dbef5
to
2243f51
Compare
@KevinJi22 Don't feel compelled to amend your commit and force push it every time you make a change. You can create as many commits as you want on this PR. 🙂 |
60d0c6a
to
0db7bab
Compare
bcede69
to
f3839c3
Compare
f9e82e9
to
392bc01
Compare
Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
@KevinJi22 we (@pombredanne and me) added the following changes: CLI Options:
Tests:
Misc:
Remaining work:
Otherwise tests are all green and this is ready AFAIK. |
392bc01
to
6412039
Compare
Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
1506621
to
044f60d
Compare
Modified the code to add Ready to review @pombredanne! Was also wondering if there could be a |
Sure, but remember that each new added option may be a sign of our failure to automatically do the right thing without options :) |
Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
65bb925
to
1ee3b9d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here some extra feedback. I think we should refactor this a bit more such that indexing licenses and rules just takes a list of directories. Everything else should IMHO just use the things already loaded. And we can simplify the detection of builtins based on a list of builtin licenses... This would help keep the code tidier IMHO. @AyanSinhaMahapatra @KevinJi22 let's chat at your convenience
Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
1ee3b9d
to
cf417c4
Compare
Refactor according to the feedback. Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
cf417c4
to
54fb102
Compare
I created #3137 as a follow up! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All good to go. Merging
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
2c2ba92
to
f2b1e13
Compare
This is for the 2022 Google Summer of Code. This PR contains all the work that needs to be done for the project.
This adds
--additional-license-directory
as a command line option in license reindexing. This allows users to specify paths to directories of licenses and rules they'd like to use during license detection, but would not like to add to the ScanCode database of licenses. First, the user must run Scancode with the--reindex-licenses
option and they can use--additional-license-directory
to add directories of licenses or rules to the index.This involves adding a new option in
licensedcode/plugin_license.py
, and this option is used as a parameter inscancode/api.py
. In this approach, the licenses and rules contained in these additional directories are combined with the existing licenses and rules in the ScanCode database to produce a single index. The code for this is found inlicensedcode/cache.py
and the helper methods for loading these licenses and rules are found inlicensedcode/models.py
.This commit also includes unit tests to verify that license detection succeeds with additional directories and installed licenses/rules. Part of the setup for the unit test and future tests involves creating a new directory in
tests/licensedcode/data
that contains sample external licenses used in the unit tests.Fixes #480
Tasks
Run tests locally to check for errors.