add new rule - using wildcard for similar headers #1842

Closed
krutztq opened this issue Nov 22, 2019 · 5 comments

Comments

@krutztq

krutztq commented Nov 22, 2019

Description

Is it possible to create a rule with some kind of wildcard to handle license references?

Let's say we have a lot of files which all have a similar header, like:
/*
* <various comments>
* Copyright (c) 1999, Company XYZ
* All rights reserved. For further information see LICENSE file.
* <various authors>
*/
REST_OF_FILE

Is it possible to add one rule for this use case?
What should the RULE and yml files look like?

System configuration

For bug reports, it really helps us to know:

  • What OS are you running on? (Windows/MacOS/Linux)
    Ubuntu 18.04
  • What version of scancode-toolkit was used to generate the scan file?
    3.1.1
  • What installation method was used to install/run scancode? (pip/source download/other)
    download
@pombredanne
Member

@krutztq license detection ultimately uses a full diff and does not use regexes or globs, so there are no wildcards. (Early versions had something more or less like wildcards, but that was neither really useful nor efficient, so it was dropped quickly.)

That said in your example, I would assume that what you want to capture/not capture may be:

  • not captured: * <various comments>
  • captured by copyright detection: * Copyright (c) 1999, Company XYZ
  • not captured: * All rights reserved.
  • captured by license detection: For further information see LICENSE file.
  • not captured: * <various authors> REST_OF_FILE

In this case you want a rule with `For further information see LICENSE file.` as the text in an xxx.RULE file and this data in an xxx.yml file:

license_expression: unknown-license-reference
relevance: 100
is_license_reference: yes
referenced_filenames:
    - LICENSE
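
For anyone who wants to script this, here is a minimal sketch that writes the rule text and its data file side by side. It only uses the Python standard library and does not call any scancode API. The function name `write_rule`, the directory, and the base name are hypothetical; where scancode expects to find private rules is not covered here.

```python
from pathlib import Path

# Rule text and data copied from the comment above.
RULE_TEXT = "For further information see LICENSE file.\n"

RULE_DATA = """\
license_expression: unknown-license-reference
relevance: 100
is_license_reference: yes
referenced_filenames:
    - LICENSE
"""


def write_rule(rules_dir, base_name):
    """Write <base_name>.RULE and <base_name>.yml side by side; return both paths.

    Hypothetical helper: rules_dir and base_name are placeholders chosen by
    the caller, not locations or names mandated by scancode.
    """
    rules_dir = Path(rules_dir)
    rules_dir.mkdir(parents=True, exist_ok=True)
    rule_path = rules_dir / f"{base_name}.RULE"
    data_path = rules_dir / f"{base_name}.yml"
    rule_path.write_text(RULE_TEXT, encoding="utf-8")
    data_path.write_text(RULE_DATA, encoding="utf-8")
    return rule_path, data_path
```

Calling `write_rule("my_rules", "see-license-file")` would create `my_rules/see-license-file.RULE` and `my_rules/see-license-file.yml`; both names are placeholders.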

@krutztq
Author

krutztq commented Nov 25, 2019

Thanks for the quick reply.
Ok, I see. But why license_expression: unknown-license-reference? Isn't it a reference I do know?
Can I just set the referenced license license_expression: my-license here? And what happens when the referenced file is not in the (parent) directory?

@pombredanne
Member

@krutztq

Ok, I see. But why license_expression: unknown-license-reference? Isn't it a reference I do know?

In the general case, For further information see LICENSE file. is not something that's specific to one license (so much so that I started adding new rules for that and several variations).

This:

Company XYZ
* All rights reserved. For further information see LICENSE file.

could make this more specific, if this were always pointing to a specific license. This could work if this is private code and a private detection rule. (From what I can see in public it comes with MIT or LGPL https://www.google.com/search?q="For further information see LICENSE file." and this exact wording is not common https://github.com/search?utf8=%E2%9C%93&q=%22For+further+information+see+LICENSE+file%22&type=Code )

Can I just set the referenced license license_expression: my-license here?
And what happens when the referenced file is not in the (parent) directory?

You sure can use my-license, but you need to add it as a license record in https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/licenses ...
Every license (even the unknown kinds) needs to have a record there.
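
As a rough sanity check before scanning, something like the sketch below could confirm that the key used in a rule's data file has a matching record. It assumes a single-key `license_expression` (no AND/OR) and assumes each record is stored as `<key>.yml` in the licenses directory; both helper names are hypothetical and this is not part of scancode itself.

```python
from pathlib import Path


def rule_license_key(rule_data_text):
    """Return the value of the license_expression line, or None if absent.

    Plain string parsing, good enough for a flat data file like the one
    above; it does not handle compound expressions.
    """
    for line in rule_data_text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip() == "license_expression":
            return value.strip()
    return None


def has_license_record(licenses_dir, key):
    """True if <key>.yml exists in licenses_dir (file naming is an assumption)."""
    return (Path(licenses_dir) / f"{key}.yml").is_file()
```

With `license_expression: my-license` in the rule data, `has_license_record(licenses_dir, "my-license")` stays false until a `my-license.yml` record is added.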

@pombredanne
Member

Note that it's perfectly OK to have proprietary licenses in the public scancode-toolkit too...
And not to be too righteous, but For further information see LICENSE file. is best avoided if you have control over that code... Inlining a notice would be better than a file reference (and even better is using an SPDX license expression and/or the http://reuse.software conventions)
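
To illustrate the SPDX option, here is a small sketch of a header carrying an `SPDX-License-Identifier` tag and a helper that pulls the expression out of it. The `MIT` value is only a placeholder, since the actual license of the Company XYZ code is not stated in this thread, and `spdx_expression` is a hypothetical helper, not a scancode function.

```python
import re

# Matches the standard SPDX tag and captures the rest of that line.
SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*(.+)")

# Example header; "MIT" is a placeholder expression for illustration only.
HEADER = """\
/*
 * Copyright (c) 1999, Company XYZ
 * SPDX-License-Identifier: MIT
 */
"""


def spdx_expression(text):
    """Return the expression after the first SPDX-License-Identifier tag, or None."""
    match = SPDX_TAG.search(text)
    if match is None:
        return None
    value = match.group(1).strip()
    # Drop a closing comment marker if the tag sits on a one-line comment.
    if value.endswith("*/"):
        value = value[:-2].strip()
    return value
```

A tag like this states the license inline, so a reader or a tool does not have to go looking for a separate LICENSE file.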

@krutztq
Author

krutztq commented Nov 26, 2019

I am using your proposal with private detection rules, and this works fine for my use case. I see the advantage of a license notice and will discuss it with my team. Thanks for the explanation!

@krutztq krutztq closed this as completed Nov 26, 2019