Scan of .deb file finds no license information #3260

Open
sisao opened this issue Feb 16, 2023 · 1 comment
sisao commented Feb 16, 2023

Description

Scanning a Debian package (.deb) file finds no license information.
The scan identifies the file as a Debian package but does not detect the license stored inside it (/usr/share/doc/tar/copyright).

How To Reproduce

Download a Debian package and run scancode:

scancode -plciv --html tar.html --json-pp tar.json tar.deb

scancode -plciv --html tar.html --json-pp tar.json tar.deb
Setup plugin: scan:info...
Setup plugin: scan:packages...
Setup plugin: scan:licenses...
Setup plugin: scan:copyrights...
Setup plugin: post_scan:license-references...
Setup plugin: output:html...
Setup plugin: output:json-pp...
Collect file inventory...
Scan files for: info, packages, licenses, copyrights with 1 process(es)...
Scanned: /sources/test/tar.deb
Scanned: /sources/test/tar.deb
Filter scans...
Filter scan: info...
Filter scan: packages...
Filter scan: licenses...
Filter scan: copyrights...
Run post-scans...
Run post-scan: license-references...
Save scan results...
Save scan results as: html...
Save scan results as: json-pp...
Scanning done.
Summary: info, packages, licenses, copyrights with 1 process(es)
Errors count: 0
Scan Speed: 21.85 files/sec. 17.65 MB/sec.
Initial counts: 1 resource(s): 1 file(s) and 0 directorie(s)
Final counts: 1 resource(s): 1 file(s) and 0 directorie(s) for 826.91 KB
Timings:
scan_start: 2023-02-16T124348.946312
scan_end: 2023-02-16T124350.607741
setup_scan:licenses: 1.59s
setup: 1.59s
total: 1.70s
Removing temporary files...done.

Output json file:

{
"headers": [
{
"tool_name": "scancode-toolkit",
"tool_version": "32.0.0rc1",
"options": {
"input": [
"tar.deb"
],
"--copyright": true,
"--html": "tar.html",
"--info": true,
"--json-pp": "tar.json",
"--license": true,
"--package": true,
"--verbose": true
},
"notice": "Generated with ScanCode and provided on an "AS IS" BASIS, WITHOUT WARRANTIES\nOR CONDITIONS OF ANY KIND, either express or implied. No content created from\nScanCode should be considered or used as legal advice. Consult an Attorney\nfor any legal advice.\nScanCode is a free software code scanning tool from nexB Inc. and others.\nVisit https://github.com/nexB/scancode-toolkit/ for support and download.",
"start_timestamp": "2023-02-16T124348.946312",
"end_timestamp": "2023-02-16T124350.607741",
"output_format_version": "3.0.0",
"duration": 1.661440134048462,
"message": null,
"errors": [],
"warnings": [],
"extra_data": {
"system_environment": {
"operating_system": "linux",
"cpu_architecture": "64",
"platform": "Linux-5.10.0-20-amd64-x86_64-with-glibc2.31",
"platform_version": "#1 SMP Debian 5.10.158-2 (2022-12-13)",
"python_version": "3.9.2 (default, Feb 28 2021, 17:03:44) \n[GCC 10.2.1 20210110]"
},
"spdx_license_list_version": "3.19",
"files_count": 1
}
}
],
"packages": [],
"dependencies": [],
"license_detections": [],
"license_references": [],
"license_rule_references": [],
"files": [
{
"path": "tar.deb",
"type": "file",
"name": "tar.deb",
"base_name": "tar",
"extension": ".deb",
"size": 846756,
"date": "2021-02-17",
"sha1": "cfa5510fec0d212506dac32538a1a761792ba034",
"md5": "483cd0fa2cc7139193af5fdfb9e1d7c1",
"sha256": "bd8e963c6edcf1c806df97cd73560794c347aa94b9aaaf3b88eea585bb2d2f3c",
"mime_type": "application/vnd.debian.binary-package",
"file_type": "Debian binary package (format 2.0), with control.tar.xz, data compression xz",
"programming_language": null,
"is_binary": true,
"is_text": false,
"is_archive": true,
"is_media": false,
"is_source": false,
"is_script": false,
"package_data": [],
"for_packages": [],
"detected_license_expression": null,
"detected_license_expression_spdx": null,
"license_detections": [],
"license_clues": [],
"percentage_of_license_text": 0,
"for_license_detections": [],
"copyrights": [],
"holders": [],
"authors": [],
"files_count": 0,
"dirs_count": 0,
"size_count": 0,
"scan_errors": []
}
]
}
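
The result above can also be checked programmatically. The following is a minimal sketch, not part of ScanCode: it reads only fields that appear in the JSON shown (`files`, `type`, `is_archive`, `package_data`, `license_detections`) and lists archive files for which the scan returned neither package data nor license detections. The function name `unextracted_archives` is hypothetical.

```python
def unextracted_archives(scan):
    """Return paths of archive files that carry no package or license data.

    `scan` is the parsed content of a ScanCode JSON output file.
    Such files were probably scanned as opaque archives.
    """
    flagged = []
    for entry in scan.get("files", []):
        # Only regular files that ScanCode marked as archives are of interest.
        if entry.get("type") != "file" or not entry.get("is_archive"):
            continue
        if not entry.get("package_data") and not entry.get("license_detections"):
            flagged.append(entry["path"])
    return flagged


# A trimmed-down version of the file entry from the report above.
sample = {
    "files": [
        {
            "path": "tar.deb",
            "type": "file",
            "is_archive": True,
            "package_data": [],
            "license_detections": [],
        }
    ]
}
print(unextracted_archives(sample))
```

For a real scan, load the output with `json.load` and pass the resulting dictionary to the function.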

System configuration

OS:
Debian 11

Scancode version:
ScanCode version: 32.0.0rc1
ScanCode Output Format version: 3.0.0
SPDX License list version: 3.19

sisao added the bug label Feb 16, 2023

pombredanne (Member) commented

@sisao Thanks. Typically we extract a .deb with extractcode (a universal extraction utility included with scancode). But since we should also recognize a plain .deb as a package, we should detect this correctly. In the meantime:

wget http://ftp.us.debian.org/debian/pool/main/t/tar/tar_1.34+dfsg-1_amd64.deb
extractcode tar_1.34+dfsg-1_amd64.deb

... then running scancode on the extracted content yields the expected results, including handling of the details of the copyright file.

I am keeping this open, as I would like this to work correctly directly on an unextracted .deb.
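
To illustrate why the copyright file is out of reach until the package is unpacked: a .deb is an `ar` archive whose `data.tar.*` member holds the installed files. The sketch below uses only the Python standard library and is not how ScanCode or extractcode work internally. The function names `iter_ar_members` and `find_copyright_files` are hypothetical, and the code assumes the common `ar` layout with short member names and a compression format that `tarfile` can open (xz, gzip, bzip2).

```python
import io
import tarfile

AR_MAGIC = b"!<arch>\n"
AR_HEADER_SIZE = 60


def iter_ar_members(data):
    """Yield (name, payload) for each member of an ar archive held in bytes."""
    if not data.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    offset = len(AR_MAGIC)
    while offset + AR_HEADER_SIZE <= len(data):
        header = data[offset:offset + AR_HEADER_SIZE]
        # Bytes 0-15 hold the member name, bytes 48-57 the decimal size.
        name = header[0:16].decode("ascii").strip().rstrip("/")
        size = int(header[48:58].decode("ascii").strip())
        start = offset + AR_HEADER_SIZE
        yield name, data[start:start + size]
        # Member data is padded to an even length.
        offset = start + size + (size % 2)


def find_copyright_files(deb_bytes):
    """Return {path: text} for usr/share/doc/<pkg>/copyright in data.tar.*."""
    found = {}
    for name, payload in iter_ar_members(deb_bytes):
        if not name.startswith("data.tar"):
            continue
        # "r:*" lets tarfile pick the decompressor it supports.
        with tarfile.open(fileobj=io.BytesIO(payload), mode="r:*") as tar:
            for member in tar:
                path = member.name.lstrip("./")
                if (
                    member.isfile()
                    and path.startswith("usr/share/doc/")
                    and path.endswith("/copyright")
                ):
                    text = tar.extractfile(member).read()
                    found[path] = text.decode("utf-8", "replace")
    return found
```

Assuming the package downloaded above, calling `find_copyright_files(open("tar_1.34+dfsg-1_amd64.deb", "rb").read())` should return the copyright text that a scan of the unextracted file never sees.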

Related issues: #3259 and aboutcode-org/scancode.io#693
