Add support for external licenses in scans #2979

Merged
merged 45 commits into from
Oct 28, 2022
Commits
05d9197
Add support for external licenses in scans #480
KevinJi22 Jun 14, 2022
4989205
Add documentation for new ``--dir`` CLI option
KevinJi22 Jun 23, 2022
0cbfff1
Enable using installed licenses in scans #2994
KevinJi22 Jun 25, 2022
3376b9c
Add CI job to test detecting installed license
KevinJi22 Jul 10, 2022
681436d
Add documentation for installed license plugins
KevinJi22 Jul 15, 2022
e741ffd
Enable installed rules to be used in detection
KevinJi22 Jul 16, 2022
afae692
Move `licensedcode_test_utils` into main wheel
KevinJi22 Jul 18, 2022
f576dfa
Add Windows and MacOS images to Azure pipelines
KevinJi22 Jul 21, 2022
a68c4cc
Add rule and license validation when index is made
KevinJi22 Jul 22, 2022
789cf76
Add SPDX license key to example licenses
KevinJi22 Jul 25, 2022
e7809ee
Refactor CLI option for external licenses
KevinJi22 Aug 10, 2022
cdf627f
revise documentation for --additional-license-directory
KevinJi22 Aug 27, 2022
7e66c9a
fix docstrings
KevinJi22 Aug 27, 2022
9679feb
refactor API to not use additional_directories except when reindexing
KevinJi22 Aug 27, 2022
6aae7e2
Always consider multiple directories when generating index
KevinJi22 Aug 27, 2022
f13ed1d
Ensure licenses are unique when loading licenses from multiple direct…
KevinJi22 Aug 27, 2022
9b03eea
add callback for --additional-license-directory and include additiona…
KevinJi22 Aug 29, 2022
40f3be9
fix help.txt to include --additional-license-directory
KevinJi22 Aug 29, 2022
983024e
fix docs
KevinJi22 Aug 29, 2022
c8391d0
fix basic-options.rst
KevinJi22 Aug 29, 2022
1bc43af
add check in cli.py to see if cached directories file actually exists
KevinJi22 Aug 29, 2022
fc7b967
fix expected test results directory path
KevinJi22 Aug 29, 2022
597c616
fix underline in docs
KevinJi22 Aug 29, 2022
85001c1
fix expected results for external and installed license tests
KevinJi22 Aug 29, 2022
7497009
put license installation into posix azure pipeline
KevinJi22 Aug 29, 2022
d6068c6
remove setuptools and wheel from setup.py
KevinJi22 Aug 29, 2022
16513ff
change from scan to reindex licenses in license library validation test
KevinJi22 Aug 29, 2022
17df9d0
Add is_builtin field to Licenses and Rules and modify url output
KevinJi22 Sep 4, 2022
3762ca5
fix methods based on previous changes
KevinJi22 Sep 5, 2022
ba9740b
add new license provider plugin for additional licenses
KevinJi22 Sep 5, 2022
ba11f05
Test that additional license plugin works
pombredanne Sep 30, 2022
61c3283
Merge latest develop
pombredanne Sep 30, 2022
a4ebbe0
Use new "scanplugins" pytest marker
pombredanne Sep 30, 2022
1db9437
Add CHANGELOG entry
pombredanne Sep 30, 2022
8df0e27
fix expected scan results after installed license CI change
KevinJi22 Oct 2, 2022
f53886f
Reorganize additional license tests
AyanSinhaMahapatra Oct 12, 2022
5361052
Move reindex licenses to a seperate script
AyanSinhaMahapatra Oct 12, 2022
a477e54
Merge branch 'develop' into external-licenses-480
AyanSinhaMahapatra Oct 12, 2022
6412039
Add external licenses info in header
AyanSinhaMahapatra Oct 12, 2022
6e14d8a
Add is_builtin flag to matched_rule data
AyanSinhaMahapatra Oct 20, 2022
044f60d
Do not return empty strings in license data
AyanSinhaMahapatra Oct 20, 2022
f201faa
Add --only-builtin falg for scancode-reindex-licenses
AyanSinhaMahapatra Oct 21, 2022
095c8ed
Update docs for external licenses
AyanSinhaMahapatra Oct 21, 2022
54fb102
Refactor external licenses code
AyanSinhaMahapatra Oct 28, 2022
f2b1e13
Improve CHANGELOG.rst
pombredanne Oct 28, 2022
10 changes: 9 additions & 1 deletion CHANGELOG.rst
@@ -46,14 +46,22 @@ License detection:
matches in a larger license detection. This removes a large number of false
positive or ambiguous license detections.


- The data structure of the JSON output has changed for licenses. We now
return match details once for each matched license expression rather than
once for each license in a matched expression. There is a new top-level
"license_references" attribute that contains the data details for each
detected license only once. This data can contain the reference license text
as an option.

- We can now detect licenses using custom license texts and license rules.
These can be provided as a one off in a directory or packaged as a plugin
for consistent reuse and deployment.

- There is a new "scancode-reindex-licenses" command that replaces the
"scancode --reindex-licenses" command line option, which has been
removed. This new command supports simpler reindexing using custom
license texts and license rules contributed by plugins or stored in an
additional directory.
v31.2.1 - 2022-10-05
----------------------------------

13 changes: 13 additions & 0 deletions azure-pipelines.yml
@@ -33,6 +33,7 @@ jobs:
--ignore=tests/licensedcode/test_detection_datadriven2.py \
--ignore=tests/licensedcode/test_detection_datadriven3.py \
--ignore=tests/licensedcode/test_detection_datadriven4.py \
--ignore=tests/licensedcode/test_additional_license.py \
tests/licensedcode

license_datadriven1_2: |
@@ -78,6 +79,18 @@ jobs:
venv/bin/pytest -n 3 -vvs --test-suite=all \
tests/licensedcode/test_zzzz_cache.py

# this test runs in isolation because it modifies the actual
# license index with additional licenses provided by a plugin
# and we use the special --test-suite=plugins marker for these
# tests
additional_license_combined: |
venv/bin/pip install tests/licensedcode/data/additional_licenses/additional_plugin_1/
venv/bin/pip install tests/licensedcode/data/additional_licenses/additional_plugin_2/
venv/bin/scancode-reindex-licenses \
--additional-directory tests/licensedcode/data/additional_licenses/additional_dir/
venv/bin/pytest -vvs --test-suite=plugins \
tests/licensedcode/test_additional_license.py

- template: etc/ci/azure-posix.yml
parameters:
job_name: ubuntu18_cpython
19 changes: 17 additions & 2 deletions conftest.py
@@ -38,6 +38,7 @@
################################################################################
SLOW_TEST = 'scanslow'
VALIDATION_TEST = 'scanvalidate'
PLUGINS_TEST = 'scanplugins'


def pytest_configure(config):
@@ -53,8 +54,14 @@ def pytest_configure(config):
': Mark a ScanCode test as a validation test, super slow, long running test.',
)

config.addinivalue_line(
'markers',
PLUGINS_TEST +
': Mark a ScanCode test as a special CI test to test installing additional plugins.',
)


TEST_SUITES = 'standard', 'all', 'validate'
TEST_SUITES = ('standard', 'all', 'validate', 'plugins',)


def pytest_addoption(parser):
@@ -72,9 +79,11 @@ def pytest_addoption(parser):
help='Select which test suite to run: '
'"standard" runs the standard test suite designed to run reasonably fast. '
'"all" runs "standard" and "slow" (long running) tests. '
'"validate" runs all the tests. '
'"validate" runs all the tests, except the "plugins" tests. '
'"plugins" runs special plugins tests. Needs extra setup, and is used only in the CI. '
'Use the @pytest.mark.scanslow marker to mark a test as "slow" test. '
'Use the @pytest.mark.scanvalidate marker to mark a test as a "validate" test. '
'Use the @pytest.mark.scanplugins marker to mark a test as a "plugins" test.'
)

################################################################################
@@ -87,13 +96,19 @@ def pytest_collection_modifyitems(config, items):
test_suite = config.getvalue('test_suite')
run_everything = test_suite == 'validate'
run_slow_test = test_suite in ('all', 'validate')
run_only_plugins = test_suite == 'plugins'

tests_to_run = []
tests_to_skip = []

for item in items:
is_validate = bool(item.get_closest_marker(VALIDATION_TEST))
is_slow = bool(item.get_closest_marker(SLOW_TEST))
is_plugins = bool(item.get_closest_marker(PLUGINS_TEST))

if is_plugins and not run_only_plugins:
tests_to_skip.append(item)
continue

if is_validate and not run_everything:
tests_to_skip.append(item)
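
The selection rules in this hunk can be summarized as a small pure function. This is only a sketch: `should_skip` is a hypothetical helper that is not part of `conftest.py`, and the handling of slow tests is an assumption about the part of the hook that the diff does not show.

```python
def should_skip(test_suite, is_plugins=False, is_validate=False, is_slow=False):
    """Return True if a test carrying these markers is skipped for test_suite."""
    run_everything = test_suite == 'validate'
    run_slow_test = test_suite in ('all', 'validate')
    run_only_plugins = test_suite == 'plugins'

    # Tests marked "scanplugins" run only with --test-suite=plugins.
    if is_plugins and not run_only_plugins:
        return True

    # Tests marked "scanvalidate" run only with --test-suite=validate.
    if is_validate and not run_everything:
        return True

    # ASSUMPTION: the elided remainder of the hook treats slow tests the same way.
    if is_slow and not run_slow_test:
        return True

    return False


for suite in ('standard', 'all', 'validate', 'plugins'):
    print(suite, should_skip(suite, is_plugins=True))
```

Under these rules a plugins test is skipped by every suite except `plugins`, including `validate`, which matches the updated help text.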
16 changes: 0 additions & 16 deletions docs/source/cli-reference/core-options.rst
@@ -69,22 +69,6 @@ Comparing Progress Message Options

----

``--reindex-licenses`` Option
-----------------------------

ScanCode maintains a license index to search for and detect licenses. When Scancode is
configured for the first time, a license index is built and used in every scan thereafter.

This ``--reindex-licenses`` option rebuilds the license index. Running a scan with this option
displays the following message to the terminal in addition to what it normally shows::

Checking and rebuilding the license index...

..
[ToDo] Research and Write Better

----

``--from-json`` Option
----------------------

1 change: 1 addition & 0 deletions docs/source/cli-reference/index.rst
@@ -8,6 +8,7 @@
help-text-options
list-options
simple-examples
other-commands
basic-options
core-options
output-format
4 changes: 4 additions & 0 deletions docs/source/cli-reference/list-options.rst
@@ -30,6 +30,10 @@ available in the command line.

----

.. include:: /rst_snippets/scancode-reindex-licenses.rst

----

.. include:: /rst_snippets/core_options.rst

----
102 changes: 102 additions & 0 deletions docs/source/cli-reference/other-commands.rst
@@ -0,0 +1,102 @@
Other available CLIs
====================

.. _other_cli:

----

.. include:: /rst_snippets/scancode-reindex-licenses.rst

----

.. include:: /rst_snippets/extract.rst

----

``scancode-reindex-licenses`` command
-------------------------------------

ScanCode maintains a license index to search for and detect licenses. When ScanCode is
configured for the first time, a license index is built and used in every scan thereafter.

The ``scancode-reindex-licenses`` command rebuilds the license index. Running this command
displays the following message in the terminal::

Checking and rebuilding the license index...

The command accepts the following CLI options:

``--additional-directory`` Option:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The ``--additional-directory`` option allows the user to include additional directories
of licenses to use in license detection.

You only need to run this command once for each set of additional directories. The licenses
and rules in those directories are cached in the license index and used in license detection
in all subsequent ScanCode runs. A later reindex drops them again unless the same directories
are passed once more with ``--additional-directory``.

The directory structure should look something like this::

additional_license_directory/
├── licenses/
│   ├── example-installed-1.LICENSE
│   └── example-installed-1.yaml
└── rules/
    ├── example-installed-1.RULE
    └── example-installed-1.yaml
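
Before reindexing, it can help to check that a directory follows this layout. The following is a minimal sketch, assuming that every ``.LICENSE`` or ``.RULE`` file needs a sibling ``.yaml`` file as in the tree above; ``find_unpaired_files`` is a hypothetical helper and not part of ScanCode:

```python
from pathlib import Path


def find_unpaired_files(root):
    """Return text files under licenses/ and rules/ that lack a sibling .yaml file."""
    root = Path(root)
    unpaired = []
    for subdir, suffix in (("licenses", ".LICENSE"), ("rules", ".RULE")):
        directory = root / subdir
        if not directory.is_dir():
            # A missing subdirectory simply contributes no files.
            continue
        for text_file in sorted(directory.glob("*" + suffix)):
            # ASSUMPTION: the data file has the same base name with a .yaml suffix.
            if not text_file.with_suffix(".yaml").exists():
                unpaired.append(text_file.relative_to(root).as_posix())
    return unpaired


if __name__ == "__main__":
    import sys

    for path in find_unpaired_files(sys.argv[1] if len(sys.argv) > 1 else "."):
        print("missing data file for", path)
```

An empty result means every text file has its data file. It says nothing about whether the files pass the validation that ScanCode itself runs when the index is built.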

Here is an example of reindexing the license cache using the ``--additional-directory PATH`` option
with a single directory::

scancode-reindex-licenses --additional-directory tests/licensedcode/data/additional_licenses/additional_dir/

You can also include multiple directories like so::

scancode-reindex-licenses --additional-directory /home/user/external_licenses/external1 --additional-directory /home/user/external_licenses/external2
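
When the list of directories comes from a script, the repeated option can be assembled programmatically. This is a minimal sketch that only builds the argument list; ``build_reindex_command`` is a hypothetical helper, and the only command and option names used are the ones documented above:

```python
import shlex


def build_reindex_command(directories):
    """Build the argument list, repeating --additional-directory per directory."""
    command = ["scancode-reindex-licenses"]
    for directory in directories:
        command += ["--additional-directory", str(directory)]
    return command


command = build_reindex_command([
    "/home/user/external_licenses/external1",
    "/home/user/external_licenses/external2",
])
# Print a shell-quoted form that can be pasted into a terminal.
print(shlex.join(command))
```

The resulting list can be passed to ``subprocess.run(command, check=True)`` in an environment where ScanCode is installed.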

To keep scanning with ``/home/user/external_licenses/external1`` and
``/home/user/external_licenses/external2``, run your scans as usual after the reindex
command above. The licenses from both directories are included::

scancode -l --license-text --json-pp output.json samples

However, to run a scan with a different set of directories, such as
``/home/user/external_licenses/external1`` and ``/home/user/external_licenses/external3``, you
need to rebuild the license index with those directories as parameters::

scancode-reindex-licenses --additional-directory /home/user/external_licenses/external1 --additional-directory /home/user/external_licenses/external3

.. include:: /rst_snippets/note_snippets/additional_directory_is_temp.rst


.. note::

You can also install external licenses through a plugin for
better reproducibility and distribution of those licenses and rules
for use in conjunction with the scancode-toolkit licenses.
See :ref:`install_new_license_plugin`.


``--only-builtin`` Option:
^^^^^^^^^^^^^^^^^^^^^^^^^^

Rebuild the license index with only the builtin ScanCode licenses and rules, excluding any
additional license directory or additional license plugin that was added previously.

This is useful when additional license plugins are already installed and you want to
reindex without the licenses they provide.

.. note::

The ``--only-builtin`` option does not uninstall license plugins. It only rebuilds the
index once without the licenses from these plugins. A later reindex without this option
brings the plugin licenses back into the index.


``--all-languages`` Option:
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Rebuild the license index including texts in all languages (and not only
English) and exit. This is an EXPERIMENTAL option.
2 changes: 1 addition & 1 deletion docs/source/cli-reference/output-format.rst
@@ -183,7 +183,7 @@ following options.
"text_url": "http://fedoraproject.org/wiki/Licensing:MIT#Old_Style",
"reference_url": "https://enterprise.dejacode.com/urn/urn:dje:license:mit-old-style",
"spdx_license_key": null,
"spdx_url": "",
"spdx_url": null,
"start_line": 9,
"end_line": 15,
"matched_rule": {
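
A consumer of the JSON output may want to treat the old empty string and the new ``null`` the same way. This is a minimal sketch, assuming a license match is a plain dict with the ``spdx_url`` key shown above; ``spdx_url_or_none`` is a hypothetical helper and not a ScanCode API:

```python
import json

# Fragments limited to keys that appear in the sample output above.
old_style = json.loads('{"spdx_license_key": null, "spdx_url": ""}')
new_style = json.loads('{"spdx_license_key": null, "spdx_url": null}')


def spdx_url_or_none(license_match):
    """Return the SPDX URL, mapping both "" and null (None) to None."""
    return license_match.get("spdx_url") or None


print(spdx_url_or_none(old_style), spdx_url_or_none(new_style))
```

Relying on truthiness keeps the calling code unchanged whether the scan was produced before or after this change.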
2 changes: 1 addition & 1 deletion docs/source/cli-reference/synopsis.rst
@@ -234,7 +234,7 @@ A sample JSON output for an individual file will look like::
"text_url": "http://fedoraproject.org/wiki/Licensing:MIT#Old_Style",
"reference_url": "https://enterprise.dejacode.com/urn/urn:dje:license:mit-old-style",
"spdx_license_key": null,
"spdx_url": "",
"spdx_url": null,
"start_line": 9,
"end_line": 15,
"matched_rule": {
1 change: 1 addition & 0 deletions docs/source/how-to-guides/index.rst
@@ -8,3 +8,4 @@

add_new_license
add_new_license_detection_rule
install_new_license_plugin