Fallback to the GitHub API to detect a Python dep's license name should be visible to a user #89

Closed
pilosus opened this issue Sep 5, 2021 · 0 comments · Fixed by #128
Labels
enhancement New feature or request
pilosus commented Sep 5, 2021

Currently, we try to detect a Python dep's license name from the following sources, in this order:

  1. Metadata's trove classifier (trove classifiers are recommended for OSI-approved FLOSS licenses)
  2. Metadata's license field (recommended for licenses not available for trove classifiers, e.g. FLOSS license with exceptions or EULA)
  3. GitHub repo license
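
The lookup order above can be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation: the function name `detect_license`, its parameters, and the way a classifier is turned into a name are all assumptions. The source labels are borrowed from the example report further down in this issue.

```python
from typing import Optional, Sequence, Tuple

LICENSE_CLASSIFIER_PREFIX = "License :: "


def detect_license(
    classifiers: Sequence[str],
    license_field: Optional[str],
    github_license: Optional[str],
) -> Tuple[Optional[str], str]:
    """Return (license_name, source), trying the three sources in order.

    Hypothetical helper: all three inputs are assumed to be fetched elsewhere.
    """
    # 1. Trove classifiers from the package metadata.
    for classifier in classifiers:
        if classifier.startswith(LICENSE_CLASSIFIER_PREFIX):
            # Assumption: the last segment is used as the license name.
            return classifier.split(" :: ")[-1], "PythonMetaClassifiers"

    # 2. Free-form license field from the package metadata.
    if license_field and license_field.strip():
        return license_field.strip(), "PythonMetaLicense"

    # 3. Fallback: the license GitHub reports for the repo (HEAD-specific).
    #    A missing name is still attributed to the GitHub source here.
    return github_license, "PythonGitHub"
```

Returning the source together with the name is what makes the fallback visible to the caller, which is the point of this issue.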

The problem with the GitHub API response for a license name is that it is not version-specific, but rather HEAD-specific.
If we want to detect the license name for package:0.1.2 while HEAD points to package:1.0.0, we can easily end up with the wrong verdict if the package has changed its license since version 0.1.2.

What to do?

  1. Try to implement more sophisticated heuristics (e.g. check out the code at the version's branch or tag, trying both v0.1.2 and 0.1.2, then try to parse LICENSE or COPYING)
  2. Use the GitHub API as we do now, add an additional column to the report:
| Package           | License Name                               | License ID                | License Type   | License Source        |
|-------------------|--------------------------------------------|---------------------------|----------------|-----------------------|
| package1:0.1.2    | Apache 2.0 License                         | Apache-2.0                | Permissive     | External              |
| package2:3.141592 | GNU General Public License v2 or any later | GPL-2.0-or-later          | StrongCopyleft | External              |
| package3:21.09    | Other/Proprietary License (EULA)           | NA                        | Other          | PythonMetaClassifiers |
| package4          | GPL-3.0 Linking Exception                  | GPL-3.0-linking-exception | WeakCopyleft   | PythonMetaLicense     |
| package5:2.19.2   | null                                       | NA                        | Error          | PythonGitHub          |
  3. Introduce a flag option --fail-license-source SOURCE_NAME, so that a user who needs stricter checks is always notified when the GitHub API fallback, with its known disadvantages, is triggered.
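
A rough sketch of how such a flag could behave is shown below. Everything here is an assumption made for illustration: the helper names `parse_args` and `failing_packages` are hypothetical, the flag is assumed to be repeatable, and the report is reduced to (package, license source) pairs.

```python
import argparse
from typing import Iterable, List, Sequence, Tuple


def parse_args(argv: Sequence[str]) -> argparse.Namespace:
    """Parse the proposed flag; repeating it is an assumption of this sketch."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--fail-license-source",
        action="append",
        default=[],
        metavar="SOURCE_NAME",
        help="fail the check if a license was detected via this source",
    )
    return parser.parse_args(argv)


def failing_packages(
    rows: Iterable[Tuple[str, str]], fail_sources: Iterable[str]
) -> List[str]:
    """Return the packages whose license source is in the blocked set."""
    blocked = set(fail_sources)
    return [package for package, source in rows if source in blocked]


args = parse_args(["--fail-license-source", "PythonGitHub"])
report = [
    ("package3:21.09", "PythonMetaClassifiers"),
    ("package5:2.19.2", "PythonGitHub"),
]
offenders = failing_packages(report, args.fail_license_source)
```

With the sample rows above, only package5:2.19.2 is flagged, so a strict check could exit with a non-zero status whenever `offenders` is non-empty.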

Step 1 is arguably laborious to implement and error-prone (a dep's version does not necessarily match a branch name or a tag). It may also require adding GitHub API token support: the API allows only 60 requests per hour for unauthenticated clients, so making multiple requests per dep can easily exceed the limit and lead to 403 or 429 responses, especially for checks with longish lists of deps.
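
To tell a rate-limit rejection apart from other failures, a client can inspect the status code together with the rate-limit headers GitHub sends. The sketch below assumes the response has already been fetched; `is_rate_limited` is a hypothetical helper, not part of the tool.

```python
from typing import Mapping


def is_rate_limited(status: int, headers: Mapping[str, str]) -> bool:
    """Guess whether a GitHub API response was rejected due to rate limiting."""
    if status not in (403, 429):
        return False
    # Header names are case-insensitive, so normalize before the lookup.
    normalized = {key.lower(): value for key, value in headers.items()}
    # An exhausted quota shows up as zero remaining requests;
    # a Retry-After header asks the client to back off.
    return (
        normalized.get("x-ratelimit-remaining") == "0"
        or "retry-after" in normalized
    )
```

A 403 with requests still remaining is treated as a different kind of failure (for example, missing permissions), so it is not reported as a rate-limit hit.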

I'd go with steps 2 and 3 and skip step 1.
