Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: Make asset classes fields dataclass fields #863

Merged
merged 7 commits into from
Feb 17, 2025

Conversation

nassermohamedit0
Copy link
Contributor

@nassermohamedit0 nassermohamedit0 commented Feb 13, 2025

This PR fixes an issue in some asset classes as defined in the package ostorlab/assets.

Issue:

Every asset class is a dataclass (e.g., File, AndroidStore). This causes the __eq__ method to be implicitly overridden.

The dataclass implementation of __eq__ compares only the fields explicitly defined as data fields. However, fields that are initialized inside __init__ but not declared as data fields are ignored in equality comparisons.

Example: ostorlab.asset.IOSStore

@dataclasses.dataclass
@asset.selector("v3.asset.store.ios_store")
class IOSStore(asset.Asset):
    """Ios bundle target asset."""

    def __init__(self, bundle_id: str):
        self.bundle_id = bundle_id # Would not be used in ==

This causes two instances of IOSStore to be compared as equal (==) even if the bundle_id field is different. As a result, some tests may falsely succeed.

Example in agent_cloud_persist_vulnz:

def testCloudPersistVulnzAgent_whenVulnerableTargetIsIOS_addVulnzCalled(
    mocker,
    persist_vulnz_agent_no_max_queue_obj,
    minimalist_report_ios_vulnerability_message,
):
    """Ensure that vulnerability location information is set correctly for an iOS store application."""
    call_request = mocker.patch(
        "agent.apis.call_add_vulnerabilities", return_value=None
    )
    vulnz = [
        apis.Vulnerability(
            ...
            vulnerability_location=vuln_mixin.VulnerabilityLocation(
                asset=ios_store.IOSStore(bundle_id="a.b.c"), # Test does not fail if you change the bundle_id value
                metadata=[
                    vuln_mixin.VulnerabilityLocationMetadata(
                        metadata_type=vuln_mixin.MetadataType.FILE_PATH,
                        value="dir/dummy.swift",
                    )
                ],
            ),
        )
    ]

    persist_vulnz_agent_no_max_queue_obj.process(
        minimalist_report_ios_vulnerability_message
    )

    call_request.assert_called_with("https://ostorlab.co/api", "my_token", 1, vulnz) # Would eventually copare two IOSStore instances

Fix:

Make all fields of an asset class data fields.

Update:

This PR fixes an existing unit test: tests/runtimes/definitions_test::testAssetGroupDefinitionFromYaml_whenYamlIsValid_returnsValidAssetGroupDefinition, which has a bug and started failing against the current PR (see diff).

@nassermohamedit0 nassermohamedit0 requested a review from a team as a code owner February 13, 2025 18:31
Copy link

@ostorlab-ai-pr-review ostorlab-ai-pr-review bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review Summary

Total Issues Found: 14
Critical Issues: 3
Suggestions: 11

Key Findings:
The code review identified several issues related to dataclass usage, particularly around redundant initializations in __init__ methods and missing docstrings. A recurring theme is the unnecessary redefinition of dataclass fields within __init__, which should be handled automatically by the dataclass decorator. Several suggestions also address the lack of default values and docstrings for class attributes, impacting code clarity and maintainability. The overall code quality can be improved by adhering more closely to dataclass best practices, reducing redundancy, and enhancing documentation.

src/ostorlab/assets/agent.py Show resolved Hide resolved
src/ostorlab/assets/android_aab.py Show resolved Hide resolved
src/ostorlab/assets/android_apk.py Show resolved Hide resolved
src/ostorlab/assets/android_store.py Show resolved Hide resolved
src/ostorlab/assets/api_schema.py Show resolved Hide resolved
src/ostorlab/assets/domain_name.py Show resolved Hide resolved
src/ostorlab/assets/file.py Show resolved Hide resolved
src/ostorlab/assets/ios_ipa.py Show resolved Hide resolved
src/ostorlab/assets/ios_store.py Show resolved Hide resolved
src/ostorlab/assets/ios_testflight.py Show resolved Hide resolved
@nassermohamedit0 nassermohamedit0 marked this pull request as draft February 13, 2025 18:53
Copy link

codecov bot commented Feb 13, 2025

Codecov Report

Attention: Patch coverage is 5.12821% with 37 lines in your changes missing coverage. Please review.

Project coverage is 62.52%. Comparing base (19a1686) to head (4e3f442).
Report is 9 commits behind head on main.

Files with missing lines Patch % Lines
src/ostorlab/assets/agent.py 0.00% 5 Missing ⚠️
src/ostorlab/assets/api_schema.py 0.00% 4 Missing ⚠️
src/ostorlab/assets/ip.py 33.33% 4 Missing ⚠️
src/ostorlab/assets/android_aab.py 0.00% 3 Missing ⚠️
src/ostorlab/assets/android_apk.py 0.00% 3 Missing ⚠️
src/ostorlab/assets/file.py 0.00% 3 Missing ⚠️
src/ostorlab/assets/ios_ipa.py 0.00% 3 Missing ⚠️
src/ostorlab/assets/ipv4.py 0.00% 3 Missing ⚠️
src/ostorlab/assets/ipv6.py 0.00% 3 Missing ⚠️
src/ostorlab/assets/link.py 0.00% 2 Missing ⚠️
... and 4 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #863      +/-   ##
==========================================
- Coverage   62.68%   62.52%   -0.17%     
==========================================
  Files         342      342              
  Lines       14551    14540      -11     
==========================================
- Hits         9122     9091      -31     
- Misses       5429     5449      +20     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@nassermohamedit0 nassermohamedit0 marked this pull request as ready for review February 13, 2025 19:34
Copy link
Member

@elyousfi5 elyousfi5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs more testing
image

@nassermohamedit0
Copy link
Contributor Author

This needs more testing image

I just added data fields in dataclasses. No new behavior.

Copy link
Member

@benyissa benyissa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The @DataClass decorator automatically generates the init method, repr, eq

@nassermohamedit0 could u please delete the __init__ method in this case.

@nassermohamedit0
Copy link
Contributor Author

nassermohamedit0 commented Feb 14, 2025

The @DataClass decorator automatically generates the init method, repr, eq

@nassermohamedit0 could u please delete the __init__ method in this case.

I wanted to be super sure not to break anything so I left the __init__ methods. But I think they are redundant.

Copy link

@ostorlab-ai-pr-review ostorlab-ai-pr-review bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review Summary

Total Issues Found: 15
Critical Issues: 1
Suggestions: 14

Key Findings:
The code review revealed several areas for improvement, primarily focused on enhancing code documentation and consistency. A key theme is the lack of docstrings for dataclass fields across multiple asset classes, which impacts readability and maintainability. Another pattern is the inconsistent use of default values and Optional annotations. One critical issue involves potential security implications with file handling in the IOSIpa class. The overall code quality is good but can be improved by addressing the identified issues. General recommendations include adding docstrings to all dataclass fields, ensuring consistent use of default values and Optional annotations, and carefully reviewing the security implications of file handling, especially in the IOSIpa class.

src/ostorlab/assets/agent.py Show resolved Hide resolved
src/ostorlab/assets/android_aab.py Show resolved Hide resolved
src/ostorlab/assets/android_apk.py Show resolved Hide resolved
src/ostorlab/assets/android_store.py Show resolved Hide resolved
src/ostorlab/assets/api_schema.py Show resolved Hide resolved
src/ostorlab/assets/domain_name.py Show resolved Hide resolved
src/ostorlab/assets/file.py Show resolved Hide resolved
src/ostorlab/assets/ios_ipa.py Show resolved Hide resolved
src/ostorlab/assets/ios_store.py Show resolved Hide resolved
src/ostorlab/assets/ios_testflight.py Show resolved Hide resolved
Copy link

@ostorlab-ai-pr-review ostorlab-ai-pr-review bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review Summary

Total Issues Found: 15
Critical Issues: 0
Suggestions: 15

Key Findings:
The code review findings reveal a pattern of missing or insufficient docstrings across multiple dataclasses in the ostorlab.assets module. This impacts code readability and maintainability. Several suggestions also focus on improving the initialization of fields, particularly with Optional types, to avoid potential NoneType errors and simplify code usage. Additionally, there's a suggestion to enhance a test case with more specific positive feedback. Overall, the code quality is good, but the identified issues should be addressed to improve documentation, robustness, and code clarity.

src/ostorlab/assets/agent.py Show resolved Hide resolved
src/ostorlab/assets/android_aab.py Show resolved Hide resolved
src/ostorlab/assets/android_apk.py Show resolved Hide resolved
src/ostorlab/assets/android_store.py Show resolved Hide resolved
src/ostorlab/assets/api_schema.py Show resolved Hide resolved
src/ostorlab/assets/domain_name.py Show resolved Hide resolved
src/ostorlab/assets/file.py Show resolved Hide resolved
src/ostorlab/assets/ios_ipa.py Show resolved Hide resolved
src/ostorlab/assets/ios_store.py Show resolved Hide resolved
src/ostorlab/assets/ios_testflight.py Show resolved Hide resolved
Copy link

@ostorlab-ai-pr-review ostorlab-ai-pr-review bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review Summary

Total Issues Found: 15
Critical Issues: 1
Suggestions: 14

Key Findings:
The code review identified several areas for improvement across the asset definitions. The primary theme is the lack of docstrings for attributes, which impacts readability and maintainability. Several suggestions involve adding docstrings to fields like key, content, package_name, endpoint_url, name, url, and host in various asset classes. Another recurring suggestion involves clarifying type hints, particularly using explicit type unions (e.g., bytes | None) instead of Optional[bytes] and ensuring default values are set appropriately (e.g., None for optional content). A critical issue was identified in the IOSStore class related to equality comparisons due to a dataclasses bug fix, recommending a unit test to verify correct behavior and a review of other asset classes for similar issues. Finally, a suggestion was made to use f-strings for improved readability in tests. Overall, the code quality is good but can be significantly improved by addressing the missing docstrings and the equality comparison issue. Adding the recommended docstrings and tests will enhance maintainability and reduce the risk of future bugs.

src/ostorlab/assets/agent.py Show resolved Hide resolved
src/ostorlab/assets/android_aab.py Show resolved Hide resolved
src/ostorlab/assets/android_apk.py Show resolved Hide resolved
src/ostorlab/assets/android_store.py Show resolved Hide resolved
src/ostorlab/assets/api_schema.py Show resolved Hide resolved
src/ostorlab/assets/domain_name.py Show resolved Hide resolved
src/ostorlab/assets/file.py Show resolved Hide resolved
src/ostorlab/assets/ios_ipa.py Show resolved Hide resolved
src/ostorlab/assets/ios_store.py Show resolved Hide resolved
src/ostorlab/assets/ios_testflight.py Show resolved Hide resolved
Copy link

@ostorlab-ai-pr-review ostorlab-ai-pr-review bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review Summary

Total Issues Found: 15
Critical Issues: 0
Suggestions: 15

Key Findings:
The code review identified several areas for improvement across the project, primarily focusing on enhancing code readability, maintainability, and robustness. A key theme is the consistent need for docstrings to clarify the purpose of dataclass fields, which applies to almost every file reviewed in src/ostorlab/assets/. Another theme is the handling of file content in assets like Android AAB/APK/IPA, with suggestions to use file-like objects for better memory management and to clarify the use of None vs b'' for empty content. Additionally, there are recommendations for default values for optional fields, input validation, addressing redundancy, and removing unnecessary comments. Overall, the code quality is good but can be significantly improved by addressing these suggestions.

self.git_location = git_location
self.docker_location = docker_location
self.yaml_file_location = yaml_file_location
key: str

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a docstring to describe the purpose of the key field. This will improve code readability and maintainability. For example: 'key: str # Agent key.' Also, apply the same principle to the 'name', 'version', 'docker_image', and 'description' fields by adding specific docstrings to each.

Suggested change
key: str
key: str # Agent key.

self.content = content
self.path = path
self.content_url = content_url
content: Optional[bytes] = None

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a docstring to explain the purpose of the content field. Consider using an empty bytes object (b"") as the default value instead of None, which might simplify handling cases where AAB content is expected. If None is retained, ensure all code accessing content includes a check for None to prevent potential errors. Provide clear guidance on how the None value will be handled, versus an empty bytes object, in different parts of the system using this asset. For example, if an empty AAB is a valid case, then b'' is more appropriate than None .

Suggested change
content: Optional[bytes] = None
content: bytes | None = b""
"""The content of the AAB file."""

self.content = content
self.path = path
self.content_url = content_url
content: Optional[bytes] = None

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding a docstring to explain the purpose of the content field. Suggestion: """The APK file content as bytes.""" Also, for better flexibility with large files, consider using a file-like object (e.g., io.BytesIO) instead of bytes. This would allow reading the APK content in chunks, reducing memory usage. Example: content: typing.BinaryIO and update the constructor accordingly to accept and handle file-like objects. If you switch to file-like object, make sure to update any code that reads this content, to read from the stream instead of assuming it's all in memory. Update the docstring accordingly if a file-like object is used. Suggestion: """The APK file content as a binary stream."""

Suggested change
content: Optional[bytes] = None
content: bytes = dataclasses.field(init=False, repr=False)
"""The APK file content as bytes."""

@@ -10,8 +10,7 @@
class AndroidStore(asset.Asset):
"""Android store package target asset."""

def __init__(self, package_name: str):
self.package_name = package_name
package_name: str

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a docstring to describe the purpose of 'package_name' and its usage to improve code readability and maintainability. For example: package_name: str = dataclasses.field(metadata={'description': 'The package name of the Android application.'}). Place this docstring within the AndroidStore dataclass definition, ideally directly above the package_name field declaration.

Suggested change
package_name: str
package_name: str = dataclasses.field(metadata={'description': 'The package name of the Android application.'})

self.endpoint_url = endpoint_url
self.schema_type = schema_type
self.content_url = content_url
endpoint_url: str

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a docstring to the name field to describe its purpose. This will improve code clarity and maintainability. For example: name: str = Field(description='The name of the API schema.')

Suggested change
endpoint_url: str
name: str = Field(description='The name of the API schema.')

@@ -10,11 +10,10 @@
class DomainName(asset.Asset):
"""Domain Name target asset per RFC 1034 and 1035."""

def __init__(self, name: str):
self.name = name
name: str

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding a default value for the name field or using typing.Optional[str] if the field can be None. This enhances code clarity and prevents potential errors if the value is not provided during object instantiation. For example: name: str = "" or name: typing.Optional[str] = None depending on your specific use case. Only apply this fix if the field is intended to be optional. If the field is required, do not apply the suggested fix. Also, this applies to other dataclasses in the project if they lack default values and might be None during runtime.

Suggested change
name: str
name: str = ""

self.content = content
self.path = path
self.content_url = content_url
content: Optional[bytes] = None

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency, either add default values for path and content_url as well, or remove the default value from content to ensure uniform class initialization.

Suggested change
content: Optional[bytes] = None

self.content = content
self.path = path
self.content_url = content_url
content: Optional[bytes] = None

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a docstring to explain the purpose of the content field. This improves readability and maintainability. For example: """The content of the IPA file, if available."""

Suggested change
content: Optional[bytes] = None
content: Optional[bytes] = None
"""The content of the IPA file, if available."""

@@ -10,8 +10,7 @@
class IOSStore(asset.Asset):
"""Ios bundle target asset."""

def __init__(self, bundle_id: str):
self.bundle_id = bundle_id
bundle_id: str

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the comment. It's redundant since the code change itself should be clear enough. If the fix is complex, a more detailed explanation of the fix or a link to the issue it resolves would be more helpful than a simple confirmation that it's correct.

Suggested change
bundle_id: str

@@ -10,8 +10,7 @@
class IOSTestflight(asset.Asset):
"""iOS testflight target asset."""

def __init__(self, application_url: str):
self.application_url = application_url
application_url: str

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment to describe the purpose of the application_url field. This improves code readability and maintainability by providing context. For example: application_url: str # The URL of the iOS Testflight application.

Suggested change
application_url: str
application_url: str # The URL of the iOS Testflight application.

@nassermohamedit0 nassermohamedit0 added the bug Something isn't working label Feb 17, 2025
@3asm 3asm merged commit 54a8a8c into main Feb 17, 2025
10 of 12 checks passed
@3asm 3asm deleted the fix/asset-class-use-non-data-field branch February 17, 2025 11:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working readability
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants