For now, we try to detect a Python dependency's license name this way:

1. The metadata's trove classifiers (trove classifiers are recommended for OSI-approved FLOSS licenses)
2. The metadata's license field (recommended for licenses that have no trove classifier, e.g. a FLOSS license with exceptions or an EULA)
3. The GitHub repo license
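
The lookup order above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `metadata` dict shape, the function names, and the `github_lookup` callback are assumptions, and the source labels are borrowed from the report table in this issue.

```python
from typing import Callable, Optional, Tuple


def license_from_classifiers(classifiers) -> Optional[str]:
    """Return the last segment of the first 'License ::' trove classifier."""
    for classifier in classifiers:
        parts = [part.strip() for part in classifier.split("::")]
        # Skip the bare "License :: OSI Approved" classifier: it names no license.
        if parts[0] == "License" and len(parts) > 1 and parts[-1] != "OSI Approved":
            return parts[-1]
    return None


def detect_license(
    metadata: dict,
    github_lookup: Optional[Callable[[dict], Optional[str]]] = None,
) -> Tuple[Optional[str], str]:
    """Try each source in order and report which one produced the name."""
    name = license_from_classifiers(metadata.get("classifiers", []))
    if name:
        return name, "PythonMetaClassifiers"

    name = (metadata.get("license") or "").strip()
    if name and name.upper() != "UNKNOWN":
        return name, "PythonMetaLicense"

    # Last resort: the GitHub repo license (hypothetical callback).
    if github_lookup is not None:
        return github_lookup(metadata), "PythonGitHub"
    return None, "PythonGitHub"
```

Returning the source alongside the name keeps the fallback visible to whatever consumes the result.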
The problem with the GitHub API response for the license name is that it is not version-specific but `HEAD`-specific.
If we want to detect the license name for `package:0.1.2` while `HEAD` points to `package:1.0.0`, we can easily end up with the wrong verdict if the package has changed its license since version 0.1.2.
What to do?
1. Implement more sophisticated heuristics (e.g. check out the code at the version's branch or tag, trying both `v0.1.2` and `0.1.2`, then parse `LICENSE` or `COPYING`).
2. Keep using the GitHub API as we do now, and add an additional column to the report:

   | Package | License Name | License ID | License Type | License Source |
   | --- | --- | --- | --- | --- |
   | package1:0.1.2 | Apache 2.0 License | Apache-2.0 | Permissive | External |
   | package2:3.141592 | GNU General Public License v2 or any later | GPL-2.0-or-later | StrongCopyleft | External |
   | package3:21.09 | Other/Proprietary License (EULA) | NA | Other | PythonMetaClassifiers |
   | package4 | GPL-3.0 Linking Exception | GPL-3.0-linking-exception | WeakCopyleft | PythonMetaLicense |
   | package5:2.19.2 | null | NA | Error | PythonGitHub |

3. Introduce a flag option `--fail-license-source SOURCE_NAME`, so that a user who needs stricter checks is always notified when the GitHub API fallback, with its known disadvantages, is triggered.
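
The proposed flag could be wired up along these lines. This is only a sketch: the flag name comes from this issue, but making it repeatable, the row dict shape, and the helper names are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="License check (sketch)")
    parser.add_argument(
        "--fail-license-source",
        metavar="SOURCE_NAME",
        action="append",  # assumption: the flag may be given more than once
        default=None,
        help="fail if any license was resolved via this source, e.g. PythonGitHub",
    )
    return parser


def offending_packages(rows, fail_sources):
    """Return packages whose license came from a source the user rejected."""
    blocked = set(fail_sources or [])
    return [row["package"] for row in rows if row["source"] in blocked]
```

With `--fail-license-source PythonGitHub`, a report containing `package5:2.19.2` from the table above would be flagged, while rows resolved from the metadata would pass.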
Step 1 is arguably laborious to implement and error-prone (a dependency's version does not necessarily match a branch or tag name). It may also require adding GitHub API token support: unauthenticated requests are limited to 60 per hour, so multiple requests per dependency can easily exceed the limit and return a 403 or 429 status code, especially for checks with longish lists of deps.
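
To illustrate why the tag matching in step 1 is fragile, here is a rough sketch of the guess it would have to make. The function names and the `existing_refs` input are hypothetical; a real implementation would first need to list the repo's tags and branches, which costs API requests.

```python
from typing import Iterable, List, Optional


def candidate_refs(version: str) -> List[str]:
    """Guess ref names for a version: both 'v0.1.2' and '0.1.2'."""
    bare = version.strip().lstrip("vV")
    return [f"v{bare}", bare]


def pick_ref(version: str, existing_refs: Iterable[str]) -> Optional[str]:
    """Return the first guessed ref that exists in the repo, or None."""
    known = set(existing_refs)
    for ref in candidate_refs(version):
        if ref in known:
            return ref
    # Projects that tag releases as e.g. 'release-0.1.2' fall through here.
    return None
```

Any naming scheme outside the two guesses yields `None`, which would still have to fall back to the `HEAD` license.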
I'd go with steps 2 and 3 and not implement step 1.