Improve debian license detection #2390 #2518
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
The current test for debian copyright files was wrong and misleading. This corrects the problem by having proper values in plain expected files and in detailed files. There was also a problem of test name masking where both detailed and non-detailed test methods had the same name and therefore were not running correctly at all. As a result all expected YAML files have been regenerated too. Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
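The name-masking problem described above is ordinary Python behavior: a second definition with the same name in a class body silently replaces the first. A minimal sketch, assuming a hypothetical test class and method name (not the actual scancode test code), shows that only one test is collected:

```python
import unittest


class DebianCopyrightTestSketch(unittest.TestCase):
    # Hypothetical stand-in for the non-detailed test.
    def test_copyright_file(self):
        self.fail("never runs: replaced by the definition below")

    # Hypothetical stand-in for the detailed test. It reuses the same name,
    # so it overwrites the method above in the class namespace.
    def test_copyright_file(self):
        pass


# The loader only sees a single test method.
names = list(unittest.TestLoader().getTestCaseNames(DebianCopyrightTestSketch))
print(names)
```

Giving each variant a distinct name (for example, a suffix for the detailed case) is one way to make both run.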
This is the set of files found in a recent debian-unstable-slim Docker image. The expectations have been regenerated as-is but not yet reviewed. See also: - aboutcode-org/scancode.io#128 - aboutcode-org/scancode.io#103 Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Refactor debian copyright detection to add a DebianCopyrightDetector class and make changes that facilitate better copyright file parsing. Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
Fix bug in unstructured copyright file parsing, which always treated copyright files as structured, and regenerate test files. Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
Remove `unique` and `simplify_licenses` to have non-unique and non-simplified copyright and license information. Use with_debian_packaging instead of using with_details and skip_debian_packaging. Regenerate tests to update expectations. Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
The overall approach to properly detect licenses in copyright files is this:
Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
This commit adds new functions for parsing Structured Debian copyrights for license and copyright detections. Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
Adds license detection from comments and other paragraphs, regenerates test files. Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
Adds with_details flag to filter license detections: licenses that are the same as the header/primary license are not reported, and only unique license references are reported in file paragraphs. Fixes bugs related to identifying debian/primary_license paragraphs. Regenerates test expectations. Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
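The filtering described in this commit can be illustrated with a small sketch. The function name and the license expression strings below are hypothetical and only show the idea of dropping expressions equal to the primary license and keeping each remaining expression once; the real implementation may differ.

```python
def filter_paragraph_licenses(primary_license, paragraph_licenses):
    """Return paragraph licenses without the primary license or repeats.

    Hypothetical helper: order of first appearance is preserved.
    """
    seen = set()
    kept = []
    for expression in paragraph_licenses:
        if expression == primary_license or expression in seen:
            continue
        seen.add(expression)
        kept.append(expression)
    return kept


reported = filter_paragraph_licenses(
    "gpl-2.0-plus",
    ["gpl-2.0-plus", "mit", "bsd-new", "mit"],
)
print(reported)
```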
Modify EnhancedDebianCopyright to be a DebianCopyright wrapper function and modify flags used for filtering and reporting. Separate structured and unstructured parsing into different classes having the same base class and main methods. Also modify file to follow black standards. Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
Updates get_installed_packages to directly call parse_copyright_file function and get an object depending on structured/unstructured copyright file and then call functions with filtering flags to get detections as required. Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
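The split into structured and unstructured classes behind a single parse entry point could look roughly like the sketch below. All class and function names here are hypothetical stand-ins, not the code in this PR. The sketch also assumes a simple heuristic: a machine-readable Debian copyright file starts with a `Format:` field.

```python
class BaseCopyrightSketch:
    """Hypothetical shared base: both variants expose the same main method."""

    def __init__(self, text):
        self.text = text

    def get_license_names(self):
        raise NotImplementedError


class StructuredCopyrightSketch(BaseCopyrightSketch):
    def get_license_names(self):
        # Read the short name from each "License:" field line.
        return [
            line.split(":", 1)[1].strip()
            for line in self.text.splitlines()
            if line.startswith("License:")
        ]


class UnstructuredCopyrightSketch(BaseCopyrightSketch):
    def get_license_names(self):
        # A real detector would run license detection on the whole text;
        # this placeholder reports nothing.
        return []


def parse_copyright_text_sketch(text):
    """Return the object matching the file layout (assumed heuristic)."""
    first = next((line for line in text.splitlines() if line.strip()), "")
    if first.lower().startswith("format:"):
        return StructuredCopyrightSketch(text)
    return UnstructuredCopyrightSketch(text)


structured = parse_copyright_text_sketch(
    "Format: https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/\n"
    "\n"
    "Files: *\n"
    "Copyright: 2021 Example Author\n"
    "License: MIT\n"
)
unstructured = parse_copyright_text_sketch("This package was debianized by ...\n")
print(type(structured).__name__, structured.get_license_names())
print(type(unstructured).__name__, unstructured.get_license_names())
```

Callers only depend on the base interface, so the filtering flags can be applied the same way to either kind of file.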
Add tests for EnhancedDebianCopyright class and also modify test functions to adopt the new API. Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
This makes declared_license also report declared license in the license paragraph of debian copyright files. Updates test expectations. Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
Modify get_copyrights to have unique copyrights when the unique_copyrights flag is set to True. Refer to #2390 Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
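As an illustration of the flag's effect, here is a minimal sketch of order-preserving de-duplication. The function below is a hypothetical stand-in that takes a plain list of copyright statements; it is not the actual get_copyrights implementation.

```python
def get_copyrights_sketch(statements, unique_copyrights=False):
    """Return copyright statements, optionally without repeats.

    Hypothetical helper: dict.fromkeys keeps the first occurrence of each
    statement and preserves the original order.
    """
    if not unique_copyrights:
        return list(statements)
    return list(dict.fromkeys(statements))


statements = [
    "Copyright 2020 Jane Doe",
    "Copyright 2021 Example Org",
    "Copyright 2020 Jane Doe",
]
print(get_copyrights_sketch(statements))
print(get_copyrights_sketch(statements, unique_copyrights=True))
```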
Regenerate test expectations after upgrading to latest debian-inspector to parse paragraphs after double empty lines correctly, as the latest version fixes this issue. Refer to #2390 Refer to aboutcode-org/debian-inspector#17 Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
Instead of adding a general `unknown_debian_license` rule, create a synthetic UnknownRule object and a LicenseMatch object out of the unknown license text. Also updates test expectations after reindexing licenses with new rules added from develop branch. Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
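The idea of building a synthetic rule and match from the unknown text, rather than indexing a catch-all rule, can be sketched with plain dataclasses. The classes below are hypothetical stand-ins and do not reproduce the attributes of the real UnknownRule or LicenseMatch objects; the "unknown" expression is a placeholder value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchUnknownRule:
    # Hypothetical: the rule text is the unmatched license text itself.
    text: str
    license_expression: str = "unknown"


@dataclass(frozen=True)
class SketchMatch:
    # Hypothetical: a match that points at a rule and a line span.
    rule: SketchUnknownRule
    start_line: int
    end_line: int


def match_unknown_text(text, start_line=1):
    """Wrap unmatched license text in a synthetic rule and match."""
    rule = SketchUnknownRule(text=text)
    line_count = max(len(text.splitlines()), 1)
    return SketchMatch(
        rule=rule,
        start_line=start_line,
        end_line=start_line + line_count - 1,
    )


match = match_unknown_text("Some custom\nlicense terms", start_line=10)
print(match.rule.license_expression, match.start_line, match.end_line)
```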
For debian packages which have the same copyright, delete one from tests. Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
Update requirements and setup.cfg files to install the latest debian-inspector version 21.5.25 to fix the following issue: aboutcode-org/debian-inspector#17 Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
All green!
Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
All green at last! Merging now.
This PR improves how we handle Debian license detection in copyright files for #2390.