Support multiple languages for license detection #139
Comments
pombredanne added a commit that referenced this issue on May 3, 2018
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
This was referenced Oct 16, 2019
pombredanne changed the title from "Add license translations and support multiple languages." to "Support multiple languages for license detection" on Feb 2, 2022
pombredanne added a commit that referenced this issue on Feb 8, 2022
Add new language attribute to a Rule, defaulting to English. Use proper language tags and SPDX ids for licenses that were kept back. Reference: #139 Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue on Feb 8, 2022
Add language attributes to rules. Move all non-English licenses to the main directory. Fix duplicate rules and licenses. Add new command line option to index with non-English licenses. This is experimental. Add check for long SPDX license keys. Reference: #139 Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue on Feb 10, 2022
Reference: #139 Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue on Feb 10, 2022
Improve in tests. Add a language tag there too. Fix licenses with incorrect or missing metadata Reference: #139 Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue on Feb 10, 2022
Reference: #139 Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
At this stage, I think this feature is mostly there. We should track some interesting sources of license translations and this should happen in further issues:
All done. Closing.
Some licenses are published in multiple languages (e.g. CECILL), and others have translations (e.g. Creative Commons, GPL) that may or may not be official. While the strict legal equivalence of these translations is undefined, keeping all the language variants would be valuable.
We should collect these and store them with the license data as text files tagged with a language code. For instance, we could store them as .LICENSE_FR_fr using locale-like codes, or use just a short language code (as in DE for German). We can assume only one primary canonical translation is available for a given language. If there are more, we can use rules for these extra translations instead.
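To make the naming idea concrete, here is a minimal sketch of how a language-suffixed license file name could be parsed. Everything in it is an assumption for illustration: the `<key>.LICENSE_<lang>` pattern, the `parse_license_filename` helper, and the `en` default are hypothetical and are not part of the existing code base.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical naming scheme: "<key>.LICENSE" holds the canonical text and
# "<key>.LICENSE_<lang>" holds a translation, where <lang> is either a short
# language code ("fr") or a locale-like code ("fr_FR").
_LICENSE_FILE = re.compile(
    r"^(?P<key>.+)\.LICENSE(?:_(?P<lang>[A-Za-z]{2,3}(?:_[A-Za-z]{2})?))?$"
)


class LicenseFile(NamedTuple):
    key: str
    language: str


def parse_license_filename(
    filename: str, default_language: str = "en"
) -> Optional[LicenseFile]:
    """Split a license file name into a license key and a language code.

    Return None when the name does not follow the assumed scheme.
    The language code is lowercased so "FR_fr" and "fr_FR" compare equal.
    """
    match = _LICENSE_FILE.match(filename)
    if match is None:
        return None
    language = match.group("lang") or default_language
    return LicenseFile(key=match.group("key"), language=language.lower())
```

With this scheme, a file without a suffix falls back to the default language, so existing license files would not need to be renamed.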
Note that where a translation is obviously not legally equivalent to the original, it should become a different license key altogether.
We should eventually also support returning the language of a detected license. This means that the language should be stored for rules too when a rule is for a translation, with a default of US English.
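As a rough illustration of storing a language on a rule and returning it with a detection, the sketch below uses a hypothetical `RuleSketch` class and `describe_match` helper. These names and the `en` default tag are assumptions and do not reflect the actual Rule implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleSketch:
    """Hypothetical stand-in for a detection rule (not the real Rule class)."""

    license_key: str
    text: str
    # Assumed default: rules without an explicit language are English.
    language: str = "en"


def describe_match(rule: RuleSketch) -> dict:
    """Return the data reported for a match, including the rule language."""
    return {"license_key": rule.license_key, "language": rule.language}


english_rule = RuleSketch(license_key="cecill-2.1", text="...")
french_rule = RuleSketch(license_key="cecill-2.1", text="...", language="fr")
```

Both rules point at the same license key, so a translation only changes the reported language and not the detected license.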
Follow ups: