Support multiple languages for license detection #139

Closed
2 tasks
pombredanne opened this issue Nov 30, 2015 · 2 comments

Comments

@pombredanne
Member

pombredanne commented Nov 30, 2015

Some licenses exist in multiple languages (e.g. CECILL), and others have translations, official or not (e.g. Creative Commons, GPL). While the strict legal equivalence of these translations is undefined, keeping all the language variants would be valuable.
We should collect these translations and store them with the license data as text files tagged with a language code. For instance, we could name them .LICENSE_FR_fr using locale-like codes, or use just a short language code (as in GE for German). We can assume that only one primary canonical translation is available for a given language. If there are more, the extra translations can be handled as rules instead.
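
To make the naming idea above concrete, here is a minimal Python sketch of how a language suffix could be read back from a license file name. It only illustrates the `.LICENSE_FR_fr` style proposed in this comment: the `LicenseFile` type, the `parse_license_filename` helper, and the regular expression are hypothetical and are not part of any existing API.

```python
import re
from typing import NamedTuple, Optional


class LicenseFile(NamedTuple):
    """A license key plus the optional language suffix of its text file."""
    key: str
    language: Optional[str]


# Hypothetical convention: "<key>.LICENSE" for the primary text and
# "<key>.LICENSE_<code>" for a translation, where <code> is either a short
# two-letter code or a locale-like pair of two-letter codes.
_LICENSE_NAME = re.compile(
    r"^(?P<key>.+?)\.LICENSE(?:_(?P<lang>[A-Za-z]{2}(?:_[A-Za-z]{2})?))?$"
)


def parse_license_filename(name: str) -> LicenseFile:
    """Split a file name into its license key and language suffix.

    The language is None when the file name carries no suffix.
    Raise ValueError when the name does not follow the convention.
    """
    match = _LICENSE_NAME.match(name)
    if match is None:
        raise ValueError(f"Not a license text file name: {name!r}")
    return LicenseFile(key=match.group("key"), language=match.group("lang"))
```

With this sketch, a name such as `cecill-2.1.LICENSE_FR_fr` would yield the key `cecill-2.1` and the suffix `FR_fr`, while a plain `gpl-2.0.LICENSE` would yield no language at all.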

Note that where a translation is obviously not legally equivalent to the original, it should become a different license key altogether.

We should eventually also support returning the language of a detected license. This means that a rule for a translation should store its language too, with US English as the default.
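
A rough Python sketch of what a per-rule language attribute could look like, assuming a plain dataclass. The `Rule` class, its field names, the `"en"` default tag, and the `select_rules` helper are all made up for illustration and do not describe the actual implementation.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Assumption: English is represented by the short tag "en".
DEFAULT_LANGUAGE = "en"


@dataclass(frozen=True)
class Rule:
    """A hypothetical detection rule that carries the language of its text."""
    identifier: str
    license_key: str
    language: str = DEFAULT_LANGUAGE


def select_rules(rules: Iterable[Rule], include_non_english: bool = False) -> List[Rule]:
    """Return the rules to index, skipping translations unless requested."""
    return [
        rule
        for rule in rules
        if include_non_english or rule.language == DEFAULT_LANGUAGE
    ]
```

Since each rule carries its own language, a detection that matched a given rule could report `rule.language` next to the license key, and an index could be built with or without the translated rules.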

Follow ups:

pombredanne added a commit that referenced this issue May 3, 2018
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne changed the title from "Add license translations and support multiple languages." to "Support multiple languages for license detection" on Feb 2, 2022
pombredanne added a commit that referenced this issue Feb 8, 2022
Add new language attribute to a Rule, defaulting to English.

Use proper language tags and SPDX ids for licenses that were kept back.

Reference: #139
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Feb 8, 2022
Add language attributes to rules

Move all non-English licenses to the main directory

Fix duplicate rules and licenses

Add new command line option to index with non-English licenses. This
is experimental.

Add check for long SPDX license keys

Reference: #139
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Feb 10, 2022
Reference: #139
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Feb 10, 2022
Improve in tests. Add a language tag there too.
Fix licenses with incorrect or missing metadata

Reference: #139
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Feb 10, 2022
Reference: #139
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
@pombredanne
Member Author

pombredanne commented Mar 27, 2024

At this stage, I think this feature is mostly there. We should track some interesting sources of license translations and this should happen in further issues:

@pombredanne
Member Author

All done. Closing
