Scancode incorrectly classifies a file with MIT license as under a proprietary license too #3532

Closed
omajid opened this issue Sep 29, 2023 · 2 comments
omajid commented Sep 29, 2023

Description

I have a file that begins like this:

$ head CreateTokenTests.cs 
// Copyright (c) Microsoft Corporation. All rights reserved.
// Licensed under the MIT License.

using System.Security.Claims;
using BenchmarkDotNet.Attributes;
using Microsoft.IdentityModel.JsonWebTokens;
using Microsoft.IdentityModel.TestUtils;
using Microsoft.IdentityModel.Tokens;

Scanning this file with scancode reports it as being under mit (which is correct) and also under proprietary-license (which is wrong), matching proprietary-license_709.RULE.

How To Reproduce

$ wget https://raw.githubusercontent.com/dotnet/dotnet/main/src/source-build-externals/src/azure-activedirectory-identitymodel-extensions-for-dotnet/benchmark/Microsoft.IdentityModel.Benchmarks/CreateTokenTests.cs
$ ./scancode --json-pp - --license --unknown-licenses --license-references  CreateTokenTests.cs
Setup plugins...                 
Collect file inventory...                                                      
Scan files for: licenses with 1 process(es)...                                                                                                                
[####################] 2                                                                                                                                      
{                                                                                                                                                             
  "headers": [                                                                                                                                                
    {                                                                                                                                                         
      "tool_name": "scancode-toolkit",                                                                                                                        
      "tool_version": "32.0.7",
      "options": {                                                                                                                                            
        "input": [                                                                                                                                            
          "CreateTokenTests.cs"                                                                                                                               
        ],                                                                                                                                                    
        "--json-pp": "-",                                                                                                                                     
        "--license": true,                                                                                                                                    
        "--license-references": true,                                                                                                                         
        "--unknown-licenses": true                                                                                                                            
      },          
...
...
  "files": [                                                                                                                                                  
    {                                                                                                                                                         
      "path": "CreateTokenTests.cs",                                                                                                                          
      "type": "file",                                                                                                                                         
      "detected_license_expression": "mit AND proprietary-license",                                                                                           
      "detected_license_expression_spdx": "MIT AND LicenseRef-scancode-proprietary-license",                                                                  
      "license_detections": [                                                                                                                                 
        {                                                                                                                                                     
          "license_expression": "mit AND proprietary-license",                                                                                                
          "matches": [                                                                                                                                        
            {                                                                                                                                                 
              "score": 100.0,                                                  
              "start_line": 2,                                                                                                                                
              "end_line": 2,                                                                                                                                  
              "matched_length": 5,                                                                                                                            
              "match_coverage": 100.0,                                         
              "matcher": "2-aho",                                                                                                                             
              "license_expression": "mit",                                                                                                                    
              "rule_identifier": "mit_12.RULE",                                                                                                               
              "rule_relevance": 100,                                                                                                                          
              "rule_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/rules/mit_12.RULE"
            },                                                                 
            {                                                                                                                                                 
              "score": 4.04,                                                                                                                                  
              "start_line": 2,                                                                                                                                
              "end_line": 4,                                                   
              "matched_length": 4,                                                                                                                            
              "match_coverage": 4.04,
              "matcher": "3-seq",                                              
              "license_expression": "proprietary-license",                                                                                                    
              "rule_identifier": "proprietary-license_709.RULE",                                                                                              
              "rule_relevance": 100,                                                                                                                          
              "rule_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/rules/proprietary-license_709.RULE"                    
            }                                                                                                                                                 
          ],                                                                                                                                                  
          "identifier": "mit_and_proprietary_license-3ada3869-ed0a-2a77-e11b-1422d5133430"
        }                                                                                                                                                     
      ],
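
To spot this kind of spurious match in a larger scan, a minimal sketch along these lines could list matches with low coverage. It assumes the JSON layout shown above (`files`, `license_detections`, `matches`, `match_coverage`); the function name `low_coverage_matches` and the 50.0 threshold are hypothetical choices for illustration, not part of scancode.

```python
def low_coverage_matches(scan, max_coverage=50.0):
    """Return (path, rule_identifier, match_coverage) for weak matches.

    `scan` is the parsed JSON output of a scancode --license run.
    The threshold is an arbitrary illustrative value.
    """
    weak = []
    for file_entry in scan.get("files", []):
        for detection in file_entry.get("license_detections", []):
            for match in detection.get("matches", []):
                # float() also accepts the quoted values seen in YAML output
                coverage = float(match["match_coverage"])
                if coverage < max_coverage:
                    weak.append(
                        (file_entry["path"], match["rule_identifier"], coverage)
                    )
    return weak


# Trimmed copy of the scan result above, keeping only the fields used here.
sample = {
    "files": [
        {
            "path": "CreateTokenTests.cs",
            "license_detections": [
                {
                    "license_expression": "mit AND proprietary-license",
                    "matches": [
                        {"rule_identifier": "mit_12.RULE", "match_coverage": 100.0},
                        {
                            "rule_identifier": "proprietary-license_709.RULE",
                            "match_coverage": 4.04,
                        },
                    ],
                }
            ],
        }
    ]
}

print(low_coverage_matches(sample))
# [('CreateTokenTests.cs', 'proprietary-license_709.RULE', 4.04)]
```

With a real scan, `sample` would be replaced by `json.load()` on the file written by `--json-pp`.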

System configuration

For bug reports, it really helps us to know:

  • What OS are you running on? Linux
  • What version of scancode-toolkit was used to generate the scan file? 32.0.7
  • What installation method was used to install/run scancode? pip
@omajid omajid added the bug label Sep 29, 2023
@pombredanne
Member

@omajid Thanks for the report. This is a bug alright.
To help visualize such an issue you can always use these command line options:

  • --license-text Include the detected licenses matched text.
  • --license-text-diagnostics In the matched license text, include diagnostic highlights: words that are not matched are surrounded with square brackets [].

The output comes out this way as YAML:

    -   path: CreateTokenTests-short.cs
        type: file
        detected_license_expression: mit AND proprietary-license
        detected_license_expression_spdx: MIT AND LicenseRef-scancode-proprietary-license
        license_detections:
            -   license_expression: mit AND proprietary-license
                matches:
                    -   score: '100.0'
                        start_line: 2
                        end_line: 2
                        matched_length: 5
                        match_coverage: '100.0'
                        matcher: 2-aho
                        license_expression: mit
                        rule_identifier: mit_12.RULE
                        rule_relevance: 100
                        rule_url: https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/rules/mit_12.RULE
                        matched_text: Licensed under the MIT License.
                    -   score: '4.04'
                        start_line: 2
                        end_line: 4
                        matched_length: 4
                        match_coverage: '4.04'
                        matcher: 3-seq
                        license_expression: proprietary-license
                        rule_identifier: proprietary-license_709.RULE
                        rule_relevance: 100
                        rule_url: https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/rules/proprietary-license_709.RULE
                        matched_text: |
                            Licensed under [the] [MIT] License.

                            [using] [System].[Security].Claims;
                identifier: mit_and_proprietary_license-3ada3869-ed0a-2a77-e11b-1422d5133430
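
As a rough way to read the diagnostics above, the bracketed words can be separated from the matched ones with a short script. This is only a sketch that treats anything inside `[...]` as unmatched, as described for `--license-text-diagnostics`; the helper names `unmatched_words` and `matched_words` are hypothetical.

```python
import re

BRACKETED = re.compile(r"\[([^\[\]]+)\]")


def unmatched_words(matched_text):
    """Words the diagnostics wrapped in square brackets (not matched)."""
    return BRACKETED.findall(matched_text)


def matched_words(matched_text):
    """Remaining words once the bracketed ones are dropped."""
    without_brackets = BRACKETED.sub(" ", matched_text)
    return re.findall(r"\w+", without_brackets)


# matched_text of the proprietary-license_709.RULE match shown above
text = "Licensed under [the] [MIT] License.\n\n[using] [System].[Security].Claims;\n"

print(unmatched_words(text))  # ['the', 'MIT', 'using', 'System', 'Security']
print(matched_words(text))    # ['Licensed', 'under', 'License', 'Claims']
```

Only four words remain as matched, which appears consistent with the `matched_length: 4` reported for that match.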

The resolution will likely be to add curly braces around the parts of https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/rules/proprietary-license_709.RULE that MUST be present for this rule to be matched.
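
The idea of required parts can be illustrated with a toy check. This is not ScanCode's matcher: the `{{...}}` marker syntax, the rule text, and the function names below are all assumptions made up for the example, and the real content of proprietary-license_709.RULE is not reproduced here.

```python
import re

# Assumed marker syntax for this sketch: {{ required phrase }}
REQUIRED = re.compile(r"\{\{(.+?)\}\}")


def normalize(text):
    """Lowercase and reduce text to space-separated word tokens."""
    return " ".join(re.findall(r"\w+", text.lower()))


def required_phrases(rule_text):
    """Phrases of the rule that were marked as mandatory."""
    return [normalize(phrase) for phrase in REQUIRED.findall(rule_text)]


def passes_required_phrases(rule_text, candidate_text):
    """True only if every mandatory phrase occurs in the candidate text."""
    haystack = f" {normalize(candidate_text)} "
    return all(f" {phrase} " in haystack for phrase in required_phrases(rule_text))


# Hypothetical rule text, NOT the actual proprietary-license_709.RULE
rule = "This software is {{proprietary and confidential}} and licensed under a commercial license."

header = "Licensed under the MIT License.\n\nusing System.Security.Claims;"
print(passes_required_phrases(rule, header))  # False
print(passes_required_phrases(rule, "This code is proprietary and confidential."))  # True
```

Under this sketch, a handful of shared common words such as "Licensed under" would no longer be enough for the rule to match, because the mandatory phrase is missing from the file header.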

pombredanne added a commit that referenced this issue Oct 6, 2023
Reference: #3532
Reported-by: Omair Majid @omajid
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Oct 11, 2023
Reference: #3532
Reported-by: Omair Majid @omajid
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Oct 11, 2023
Reference: #3532
Reported-by: Omair Majid @omajid
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
@pombredanne
Member

Fixed in ScanCode 32.0.8. Thanks for the report and keep them coming! What you are doing on the dotnet repos looks mighty cool.
