SCIO does not identify the codebase source (the path) of a license detection #902

Open
DennisClark opened this issue Aug 31, 2023 · 3 comments · May be fixed by #1124

Comments

@DennisClark
Member

A recent scan of an FFmpeg project in SCIO returned a composite license expression that included AND proprietary-license in the various licenses, and that was totally incorrect, as there was no object in the codebase under any proprietary license. Refer to aboutcode-org/scancode-toolkit#3504 for a related problem.

The big issue here is that I could not find any way, either in the SCIO UI, or in the exported scan results, to identify the actual file (complete path name) that triggered the erroneous detections. The exported scan results only include the following:

  {
    "score": 100.0,
    "matcher": "2-aho",
    "end_line": 4182,
    "rule_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/rules/proprietary-license_489.RULE",
    "start_line": 4182,
    "matched_text": "    license=\"nonfree and unredistributable\"",
    "match_coverage": 100.0,
    "matched_length": 4,
    "rule_relevance": 100,
    "rule_identifier": "proprietary-license_489.RULE",
    "license_expression": "proprietary-license"
  },

  {
    "score": 100.0,
    "matcher": "2-aho",
    "end_line": 101,
    "rule_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/rules/proprietary-license_490.RULE",
    "start_line": 101,
    "matched_text": "  --enable-nonfree         allow use of nonfree code, the resulting libs",
    "match_coverage": 100.0,
    "matched_length": 2,
    "rule_relevance": 100,
    "rule_identifier": "proprietary-license_490.RULE",
    "license_expression": "proprietary-license"
  }

There are problems with those rules that are addressed in the SCTK issue, but the only way I could investigate the problem was to download the actual FFmpeg project and search for the files that contained the matched_text myself. That information should have been in the scan results and also presented in some logical way in the SCIO UI. Consider the simple use case of an analyst seeing a generated license expression in SCIO and wondering where in the code the associated licenses were actually detected.
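
For anyone needing the same workaround in the meantime, the manual search described above can be sketched roughly as follows. This is only an illustration, not part of SCIO or SCTK: the function name `find_matched_text` and its parameters are hypothetical, and it assumes the matched text appears on a single line of a text file in a locally downloaded copy of the codebase.

```python
from pathlib import Path


def find_matched_text(root, matched_text, start_line=None):
    """Return (relative_path, line_number) pairs for lines containing matched_text.

    Hypothetical helper: `root` is a local checkout of the scanned codebase,
    `matched_text` and `start_line` come from a match in the exported results.
    """
    needle = matched_text.strip()
    root = Path(root)
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        try:
            # Undecodable bytes are replaced so binary files do not abort the walk.
            lines = path.read_text(encoding="utf-8", errors="replace").splitlines()
        except OSError:
            continue
        for number, line in enumerate(lines, start=1):
            if needle not in line:
                continue
            # Optionally narrow the hits to the line number reported by the scan.
            if start_line is None or number == start_line:
                hits.append((path.relative_to(root).as_posix(), number))
    return hits
```

Passing the `start_line` from the match narrows the candidates considerably, since the same phrase can occur in several files.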

I am assuming that SCTK actually has the path name but it is not being captured by SCIO; if that is not the case, then this issue needs to be raised upstream in SCTK as well.

Initially assigning this to @AyanSinhaMahapatra but feel free to re-assign if appropriate.

@AyanSinhaMahapatra
Member

Ack @DennisClark, note that this would be implemented as part of #733

> Initially assigning this to @AyanSinhaMahapatra

Yup, this is high on the priority list. I'll create the models and update the views. We can improve the UI for the license detections view later, possibly with #450

@DennisClark
Member Author

@AyanSinhaMahapatra if you do not have time for this, perhaps this issue is a candidate for assigning to a student or volunteer.

@AyanSinhaMahapatra
Member

AyanSinhaMahapatra commented Jan 29, 2024

@DennisClark
update: we have added a new attribute from_file in SCTK matches, which was needed to implement this feature correctly wrt. referenced matches: aboutcode-org/scancode-toolkit#3620
I'll take a shot at this soon enough 👍
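
As a rough illustration of how the new attribute could answer the original question, here is a minimal sketch that groups matches by their source path. It assumes each match is a dict shaped like the excerpts in the issue description with an added `from_file` key; the function name `group_matches_by_file` and the `"<unknown>"` fallback are hypothetical, and the exact layout of the exported results may differ.

```python
from collections import defaultdict


def group_matches_by_file(matches):
    """Map each source path to the (expression, start_line, end_line) of its matches.

    Assumes match dicts carry a `from_file` key; matches without one
    (e.g. from older scans) are collected under "<unknown>".
    """
    grouped = defaultdict(list)
    for match in matches:
        path = match.get("from_file") or "<unknown>"
        grouped[path].append(
            (
                match.get("license_expression"),
                match.get("start_line"),
                match.get("end_line"),
            )
        )
    return dict(grouped)
```

An analyst could then look up a license expression and see directly which paths and line ranges produced it.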
