Feat/groh/enhance spdx lite output #99
Walkthrough
The changes update the software version to 1.20.1 and add a new changelog entry detailing Telco compliance enhancements.
Sequence Diagram(s)
sequenceDiagram
participant Test as Unit Test
participant SpdxLite as SpdxLite Class
participant FS as File System
Test->>SpdxLite: produce_from_file(input_file)
SpdxLite->>SpdxLite: parse(data) (includes new url_hash & download_url)
SpdxLite->>SpdxLite: produce_from_json(data)
SpdxLite->>FS: Write SPDX JSON output
Test->>FS: Read output file for verification
Actionable comments posted: 1
🧹 Nitpick comments (6)
tests/test_spdxlite.py (3)
4-4: Update copyright year.
The copyright year is set to 2025, which is beyond the current year.
Apply this diff to update the copyright year:
- Copyright (c) 2025, SCANOSS
+ Copyright (c) 2024, SCANOSS
32-35: Rename test class to be more descriptive.
The test class name MyTestCase is too generic. Consider renaming it to better reflect what it's testing.
Apply this diff to use a more descriptive name:
-class MyTestCase(unittest.TestCase):
+class TestSpdxLite(unittest.TestCase):
36-70: Enhance test coverage and readability.
The test method verifies key SPDX fields and checksums, but there are several areas for improvement:
- The test method name could be more descriptive
- Missing docstring to explain test purpose
- Missing test cases for error scenarios
- Magic numbers and strings should be constants
Apply this diff to improve the test:
-    def testSpdxLite(self):
+    EXPECTED_SPDX_VERSION = "SPDX-2.2"
+    EXPECTED_SPDX_ID = "SPDXRef-DOCUMENT"
+    EXPECTED_NAME = "SCANOSS-SBOM"
+    EXPECTED_ORGANIZATION = "Organization: SCANOSS"
+    EXPECTED_CREATION_INFO_COMMENT = "SBOM Build information - SBOM Type: Build"
+    EXPECTED_DOCUMENT_DESCRIBES_COUNT = 5
+    EXPECTED_PACKAGES_COUNT = 5
+    EXPECTED_CHECKSUM_ALGORITHM = "MD5"
+    EXPECTED_MD5_LENGTH = 32
+
+    def test_spdx_lite_generates_valid_output(self):
+        """
+        Test that SpdxLite generates a valid SPDX output file with correct fields and checksums.
+        """
         temp_dir = tempfile.gettempdir()
         spdx_lite_output = os.path.join(temp_dir, "spdxlite.json")
         test_data_dir = os.path.dirname(os.path.abspath(__file__))
         file_name = 'result.json'
         input_file_name = os.path.join(test_data_dir, 'data', file_name)
         spdx_lite = SpdxLite(debug = False, output_file=spdx_lite_output)
         spdx_lite.produce_from_file(input_file_name)
-        md5_length = 32
         # Read data using absolute path
         with open(spdx_lite_output, 'r') as f:
             parsed_data = json.load(f)
         spdx_version = parsed_data.get("spdxVersion")
         spdx_id = parsed_data.get("SPDXID")
         name = parsed_data.get("name")
         organization = parsed_data.get("creationInfo",{}).get('creators')[2]
         creation_info_comment = parsed_data.get("creationInfo", {}).get('comment')
         document_describes = parsed_data.get("documentDescribes")
         packages = parsed_data.get("packages")
-        self.assertEqual(spdx_version, "SPDX-2.2")
-        self.assertEqual(spdx_id, "SPDXRef-DOCUMENT")
-        self.assertEqual(name, "SCANOSS-SBOM")
-        self.assertEqual(organization, "Organization: SCANOSS")
-        self.assertEqual(creation_info_comment, "SBOM Build information - SBOM Type: Build")
-        self.assertEqual(len(document_describes), 5)
-        self.assertEqual(len(packages), 5)
+        self.assertEqual(spdx_version, self.EXPECTED_SPDX_VERSION)
+        self.assertEqual(spdx_id, self.EXPECTED_SPDX_ID)
+        self.assertEqual(name, self.EXPECTED_NAME)
+        self.assertEqual(organization, self.EXPECTED_ORGANIZATION)
+        self.assertEqual(creation_info_comment, self.EXPECTED_CREATION_INFO_COMMENT)
+        self.assertEqual(len(document_describes), self.EXPECTED_DOCUMENT_DESCRIBES_COUNT)
+        self.assertEqual(len(packages), self.EXPECTED_PACKAGES_COUNT)
         for package in packages:
             for checksum in package.get("checksums", []):
-                self.assertEqual(checksum.get("algorithm"), "MD5") #Check all algorithms be MD5
-                self.assertEqual(len(checksum.get("checksumValue")), md5_length) #Check checksum length value be 32
+                self.assertEqual(checksum.get("algorithm"), self.EXPECTED_CHECKSUM_ALGORITHM)
+                self.assertEqual(len(checksum.get("checksumValue")), self.EXPECTED_MD5_LENGTH)
         os.remove(spdx_lite_output) #Removes tmp spdxlite.json file
+
+    def test_spdx_lite_handles_invalid_input(self):
+        """
+        Test that SpdxLite handles invalid input gracefully.
+        """
+        temp_dir = tempfile.gettempdir()
+        spdx_lite_output = os.path.join(temp_dir, "spdxlite.json")
+        spdx_lite = SpdxLite(debug=False, output_file=spdx_lite_output)
+
+        # Test with non-existent file
+        with self.assertRaises(FileNotFoundError):
+            spdx_lite.produce_from_file("non_existent_file.json")
+
+        # Test with invalid JSON
+        with tempfile.NamedTemporaryFile(mode='w', suffix='.json', delete=False) as f:
+            f.write("invalid json")
+            invalid_json_file = f.name
+
+        try:
+            with self.assertRaises(json.JSONDecodeError):
+                spdx_lite.produce_from_file(invalid_json_file)
+        finally:
+            os.remove(invalid_json_file)
src/scanoss/spdxlite.py (3)
130-130: Fix line length issue.
The line exceeds the maximum length of 120 characters.
Apply this diff to fix the line length:
-    for field in ['id', 'vendor', 'component', 'version', 'latest', 'url', 'url_hash', 'download_url']:
+    fields = ['id', 'vendor', 'component', 'version',
+              'latest', 'url', 'url_hash', 'download_url']
+    for field in fields:
189-194: Remove extraneous f-string prefix.
The string literal doesn't contain any placeholders, so the f prefix is unnecessary.
Apply this diff to remove the extraneous f-string prefix:
 'creators': [
     f'Tool: SCANOSS-PY: {__version__}',
     f'Person: {getpass.getuser()}',
-    f'Organization: SCANOSS'
+    'Organization: SCANOSS'
 ],
🧰 Tools
🪛 Ruff (0.8.2)
192-192: f-string without any placeholders. Remove extraneous f prefix (F541)
🪛 GitHub Actions: Lint
[error] 192-192: F541 [*] f-string without any placeholders. Help: Remove extraneous f prefix.
229-231: Improve variable naming and documentation.
The code uses a fallback value for url_hash but lacks documentation explaining its purpose.
Apply this diff to improve the code:
- download_location = comp.get('download_url') or comp.get('url')
- url_hash = comp.get('url_hash') or '0' * 32 #Creates a string of 32 zeros to represent an empty MD5 hash for components missing a checksum
+ # Use download_url if available, fallback to url
+ download_location = comp.get('download_url') or comp.get('url')
+
+ # Use url_hash if available, fallback to empty MD5 hash (32 zeros)
+ # This represents components without a checksum while maintaining MD5 format
+ EMPTY_MD5_HASH = '0' * 32
+ url_hash = comp.get('url_hash') or EMPTY_MD5_HASH
🧰 Tools
🪛 Ruff (0.8.2)
230-230: Line too long (150 > 120) (E501)
🪛 GitHub Actions: Lint
[error] 230-230: E501 Line too long (150 > 120).
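The fallback behaviour discussed in this comment can be tried in isolation. The following is a minimal sketch, not the project's implementation: the helper name `resolve_download_info` and the sample component dicts are hypothetical, and only the `download_url`, `url`, and `url_hash` keys and the 32-zero placeholder are taken from the diff in this comment.

```python
# Placeholder for components without a checksum: 32 zeros, the length of an MD5 hex digest.
EMPTY_MD5_HASH = '0' * 32


def resolve_download_info(comp: dict) -> tuple:
    """Hypothetical helper: return (download_location, url_hash) for one component."""
    # Prefer download_url; fall back to url when it is missing or empty.
    download_location = comp.get('download_url') or comp.get('url')
    # Prefer url_hash; fall back to the all-zero placeholder.
    url_hash = comp.get('url_hash') or EMPTY_MD5_HASH
    return download_location, url_hash


if __name__ == '__main__':
    # Made-up component entries, for illustration only.
    with_hash = {'download_url': 'https://example.com/a.tgz',
                 'url': 'https://example.com/a',
                 'url_hash': 'a' * 32}
    without_hash = {'url': 'https://example.com/b'}
    print(resolve_download_info(with_hash))
    print(resolve_download_info(without_hash))
```

Because `or` is used rather than a `dict.get` default, an empty string is treated the same as a missing key, which seems to be the intent of the original line.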
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
CHANGELOG.md (2 hunks)
src/scanoss/__init__.py (1 hunks)
src/scanoss/spdxlite.py (5 hunks)
tests/test_spdxlite.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- src/scanoss/__init__.py
🧰 Additional context used
🪛 Ruff (0.8.2)
src/scanoss/spdxlite.py
33-33: commoncode.fetch.download_url imported but unused. Remove unused import: commoncode.fetch.download_url (F401)
192-192: f-string without any placeholders. Remove extraneous f prefix (F541)
230-230: Line too long (150 > 120) (E501)
🪛 GitHub Actions: Lint
src/scanoss/spdxlite.py
[error] 25-25: I001 [*] Import block is un-sorted or un-formatted. Help: Organize imports.
[error] 33-33: F401 [*] commoncode.fetch.download_url imported but unused. Help: Remove unused import: commoncode.fetch.download_url.
[error] 67-67: PLR0912 Too many branches (22 > 12).
[error] 67-67: PLR0915 Too many statements (60 > 50).
[error] 76-76: F541 [*] f-string without any placeholders. Help: Remove extraneous f prefix.
[error] 162-162: PLR0912 Too many branches (15 > 12).
[error] 162-162: PLR0915 Too many statements (55 > 50).
[error] 185-185: F541 [*] f-string without any placeholders. Help: Remove extraneous f prefix.
[error] 192-192: F541 [*] f-string without any placeholders. Help: Remove extraneous f prefix.
[error] 230-230: E501 Line too long (150 > 120).
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: build
🔇 Additional comments (1)
CHANGELOG.md (1)
12-14: LGTM!
The changelog entry follows the Keep a Changelog format and includes all necessary information:
- Version and date
- Added section with clear description
- Reference link for version comparison
Also applies to: 463-464
src/scanoss/spdxlite.py
Outdated
@@ -30,6 +30,7 @@
import getpass
import re
import importlib_resources
from commoncode.fetch import download_url
Remove unused import.
The download_url import is not used in the code.
Apply this diff to remove the unused import:
-from commoncode.fetch import download_url
🧰 Tools
🪛 Ruff (0.8.2)
33-33: commoncode.fetch.download_url imported but unused. Remove unused import: commoncode.fetch.download_url (F401)
🪛 GitHub Actions: Lint
[error] 33-33: F401 [*] commoncode.fetch.download_url imported but unused. Help: Remove unused import: commoncode.fetch.download_url.
Actionable comments posted: 1
🔭 Outside diff range comments (2)
src/scanoss/spdxlite.py (2)
25-34: 🛠️ Refactor suggestion
Sort and format import block.
The import block needs to be organized according to the linting rules.
Apply this diff to sort and format the imports:
-import json
-import os.path
-import sys
-import hashlib
-import datetime
-import getpass
-import re
-import importlib_resources
-
-from . import __version__
+import datetime
+import getpass
+import hashlib
+import importlib_resources
+import json
+import os.path
+import re
+import sys
+
+from . import __version__
🧰 Tools
🪛 GitHub Actions: Lint
[error] 25-25: I001 [*] Import block is un-sorted or un-formatted. Organize imports.
75-75: 🛠️ Refactor suggestion
Remove extraneous f-string prefixes.
Several string literals don't contain any placeholders.
Apply this diff to fix the f-strings:
- self.print_debug(f'Processing raw results into summary format...')
+ self.print_debug('Processing raw results into summary format...')
- 'SPDXID': f'SPDXRef-DOCUMENT',
+ 'SPDXID': 'SPDXRef-DOCUMENT',
Also applies to: 184-184
🧰 Tools
🪛 Ruff (0.8.2)
75-75: f-string without any placeholders. Remove extraneous f prefix (F541)
🪛 GitHub Actions: Lint
[error] 75-75: F541 [*] f-string without any placeholders. Remove extraneous f prefix.
🧹 Nitpick comments (1)
src/scanoss/spdxlite.py (1)
66-66: Consider refactoring complex methods.
The parse and produce_from_json methods exceed the recommended complexity limits:
- Too many branches (22/15 > 12)
- Too many statements (60/55 > 50)
Consider breaking down these methods into smaller, more focused functions to improve maintainability. I can help generate a refactored implementation if you'd like.
Also applies to: 161-161
🧰 Tools
🪛 Ruff (0.8.2)
66-66: Too many branches (22 > 12) (PLR0912)
66-66: Too many statements (60 > 50) (PLR0915)
🪛 GitHub Actions: Lint
[error] 66-66: PLR0912 Too many branches (22 > 12).
[error] 66-66: PLR0915 Too many statements (60 > 50).
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/scanoss/spdxlite.py (4 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
src/scanoss/spdxlite.py
191-191: f-string without any placeholders. Remove extraneous f prefix (F541)
🪛 GitHub Actions: Lint
src/scanoss/spdxlite.py
[error] 25-25: I001 [*] Import block is un-sorted or un-formatted. Organize imports.
[error] 66-66: PLR0912 Too many branches (22 > 12).
[error] 66-66: PLR0915 Too many statements (60 > 50).
[error] 75-75: F541 [*] f-string without any placeholders. Remove extraneous f prefix.
[error] 161-161: PLR0912 Too many branches (15 > 12).
[error] 161-161: PLR0915 Too many statements (55 > 50).
[error] 184-184: F541 [*] f-string without any placeholders. Remove extraneous f prefix.
[error] 191-191: F541 [*] f-string without any placeholders. Remove extraneous f prefix.
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: build
🔇 Additional comments (2)
src/scanoss/spdxlite.py (2)
129-130: LGTM! New fields added for enhanced SPDX output.
The addition of url_hash and download_url fields aligns with the PR objective of enhancing SPDXLite output for Telco standards compliance.
228-250: LGTM! Enhanced download location and checksum handling.
The changes improve the SPDX output by:
- Using download_url with fallback to url
- Adding MD5 checksum structure with proper fallback to empty hash
src/scanoss/spdxlite.py
Outdated
'creators': [
    f'Tool: SCANOSS-PY: {__version__}',
    f'Person: {getpass.getuser()}',
    f'Organization: SCANOSS'
],
"comment": "SBOM Build information - SBOM Type: Build",
🛠️ Refactor suggestion
Remove extraneous f-string prefix.
The string literal for the organization doesn't contain any placeholders.
Apply this diff to fix the f-string:
- f'Organization: SCANOSS'
+ 'Organization: SCANOSS'
🧰 Tools
🪛 Ruff (0.8.2)
191-191: f-string without any placeholders. Remove extraneous f prefix (F541)
🪛 GitHub Actions: Lint
[error] 191-191: F541 [*] f-string without any placeholders. Remove extraneous f prefix.
df426b8 to 50dd78a
8901fb3 to f550774
Actionable comments posted: 0
🧹 Nitpick comments (3)
src/scanoss/spdxlite.py (3)
67-143: Consider refactoring to reduce complexity.
The parse method has high cyclomatic complexity as indicated by the noqa comments. Consider breaking it down into smaller, more focused methods:
- Extract dependency processing logic
- Extract file ID processing logic
- Create helper methods for common operations
Example refactor:
def parse(self, data: dict):
    if not self._validate_input(data):
        return None
    summary = {}
    for file_path, file_details in data.items():
        self._process_file(file_path, file_details, summary)
    return summary

def _process_file(self, file_path: str, file_details: list, summary: dict):
    # Each file maps to a list of match details; route each one by its id
    for detail in file_details:
        if detail.get('id') == 'dependency':
            self._process_dependency(file_path, detail, summary)
        else:
            self._process_file_id(file_path, detail, summary)

def _process_dependency(self, file_path: str, detail: dict, summary: dict):
    # Extract dependency processing logic...
    ...

def _process_file_id(self, file_path: str, detail: dict, summary: dict):
    # Extract file ID processing logic...
    ...
246-251: Consider using a more secure hashing algorithm.
MD5 is cryptographically broken and not recommended for security purposes. Consider using SHA-256 or SHA-3 for checksums.
- "algorithm": "MD5",
+ "algorithm": "SHA256",
229-230: Consider using a more descriptive fallback value.
Instead of using a string of zeros for the fallback hash, consider using a more descriptive value like "NO-HASH-AVAILABLE" or implement proper hash calculation.
- url_hash = comp.get('url_hash') or '0' * 32 #Creates a string of 32 zeros to represent an empty MD5 hash
+ url_hash = comp.get('url_hash') or 'NO-HASH-AVAILABLE'
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/scanoss/spdxlite.py (8 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: build
🔇 Additional comments (2)
src/scanoss/spdxlite.py (2)
191-193: Remove extraneous f-string prefix.
The string literal for the organization doesn't contain any placeholders.
187-195: LGTM! Enhanced metadata improves compliance.
The addition of organization information and build type metadata in creationInfo improves compliance with Telco standards.
f550774 to 35a1f15
What
Summary by CodeRabbit