-
Notifications
You must be signed in to change notification settings - Fork 55
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix: Make asset classes fields dataclass fields #863
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review Summary
Total Issues Found: 14
Critical Issues: 3
Suggestions: 11
Key Findings:
The code review identified several issues related to dataclass usage, particularly around redundant initializations in __init__
methods and missing docstrings. A recurring theme is the unnecessary redefinition of dataclass fields within __init__
, which should be handled automatically by the dataclass decorator. Several suggestions also address the lack of default values and docstrings for class attributes, impacting code clarity and maintainability. The overall code quality can be improved by adhering more closely to dataclass best practices, reducing redundancy, and enhancing documentation.
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #863 +/- ##
==========================================
- Coverage 62.68% 62.52% -0.17%
==========================================
Files 342 342
Lines 14551 14540 -11
==========================================
- Hits 9122 9091 -31
- Misses 5429 5449 +20 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The @DataClass decorator automatically generates the init method, repr, eq
@nassermohamedit0 could u please delete the __init__
method in this case.
I wanted to be super sure not to break anything so I left the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review Summary
Total Issues Found: 15
Critical Issues: 1
Suggestions: 14
Key Findings:
The code review revealed several areas for improvement, primarily focused on enhancing code documentation and consistency. A key theme is the lack of docstrings for dataclass fields across multiple asset classes, which impacts readability and maintainability. Another pattern is the inconsistent use of default values and Optional annotations. One critical issue involves potential security implications with file handling in the IOSIpa class. The overall code quality is good but can be improved by addressing the identified issues. General recommendations include adding docstrings to all dataclass fields, ensuring consistent use of default values and Optional annotations, and carefully reviewing the security implications of file handling, especially in the IOSIpa class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review Summary
Total Issues Found: 15
Critical Issues: 0
Suggestions: 15
Key Findings:
The code review findings reveal a pattern of missing or insufficient docstrings across multiple dataclasses in the ostorlab.assets
module. This impacts code readability and maintainability. Several suggestions also focus on improving the initialization of fields, particularly with Optional types, to avoid potential NoneType errors and simplify code usage. Additionally, there's a suggestion to enhance a test case with more specific positive feedback. Overall, the code quality is good, but the identified issues should be addressed to improve documentation, robustness, and code clarity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review Summary
Total Issues Found: 15
Critical Issues: 1
Suggestions: 14
Key Findings:
The code review identified several areas for improvement across the asset definitions. The primary theme is the lack of docstrings for attributes, which impacts readability and maintainability. Several suggestions involve adding docstrings to fields like key
, content
, package_name
, endpoint_url
, name
, url
, and host
in various asset classes. Another recurring suggestion involves clarifying type hints, particularly using explicit type unions (e.g., bytes | None
) instead of Optional[bytes]
and ensuring default values are set appropriately (e.g., None
for optional content). A critical issue was identified in the IOSStore
class related to equality comparisons due to a dataclasses bug fix, recommending a unit test to verify correct behavior and a review of other asset classes for similar issues. Finally, a suggestion was made to use f-strings for improved readability in tests. Overall, the code quality is good but can be significantly improved by addressing the missing docstrings and the equality comparison issue. Adding the recommended docstrings and tests will enhance maintainability and reduce the risk of future bugs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review Summary
Total Issues Found: 15
Critical Issues: 0
Suggestions: 15
Key Findings:
The code review identified several areas for improvement across the project, primarily focusing on enhancing code readability, maintainability, and robustness. A key theme is the consistent need for docstrings to clarify the purpose of dataclass fields, which applies to almost every file reviewed in src/ostorlab/assets/
. Another theme is the handling of file content in assets like Android AAB/APK/IPA, with suggestions to use file-like objects for better memory management and to clarify the use of None
vs b''
for empty content. Additionally, there are recommendations for default values for optional fields, input validation, addressing redundancy, and removing unnecessary comments. Overall, the code quality is good but can be significantly improved by addressing these suggestions.
self.git_location = git_location | ||
self.docker_location = docker_location | ||
self.yaml_file_location = yaml_file_location | ||
key: str |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a docstring to describe the purpose of the key field. This will improve code readability and maintainability. For example: 'key: str # Agent key.' Also, apply the same principle to the 'name', 'version', 'docker_image', and 'description' fields by adding specific docstrings to each.
key: str | |
key: str # Agent key. |
self.content = content | ||
self.path = path | ||
self.content_url = content_url | ||
content: Optional[bytes] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a docstring to explain the purpose of the content
field. Consider using an empty bytes object (b"") as the default value instead of None
, which might simplify handling cases where AAB content is expected. If None
is retained, ensure all code accessing content
includes a check for None
to prevent potential errors. Provide clear guidance on how the None
value will be handled, versus an empty bytes object, in different parts of the system using this asset. For example, if an empty AAB is a valid case, then b''
is more appropriate than None
.
content: Optional[bytes] = None | |
content: bytes | None = b"" | |
"""The content of the AAB file.""" |
self.content = content | ||
self.path = path | ||
self.content_url = content_url | ||
content: Optional[bytes] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding a docstring to explain the purpose of the content
field. Suggestion: """The APK file content as bytes.""" Also, for better flexibility with large files, consider using a file-like object (e.g., io.BytesIO) instead of bytes. This would allow reading the APK content in chunks, reducing memory usage. Example: content: typing.BinaryIO
and update the constructor accordingly to accept and handle file-like objects. If you switch to file-like object, make sure to update any code that reads this content, to read from the stream instead of assuming it's all in memory. Update the docstring accordingly if a file-like object is used. Suggestion: """The APK file content as a binary stream."""
content: Optional[bytes] = None | |
content: bytes = dataclasses.field(init=False, repr=False) | |
"""The APK file content as bytes.""" |
@@ -10,8 +10,7 @@ | |||
class AndroidStore(asset.Asset): | |||
"""Android store package target asset.""" | |||
|
|||
def __init__(self, package_name: str): | |||
self.package_name = package_name | |||
package_name: str |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a docstring to describe the purpose of 'package_name' and its usage to improve code readability and maintainability. For example: package_name: str = dataclasses.field(metadata={'description': 'The package name of the Android application.'})
. Place this docstring within the AndroidStore
dataclass definition, ideally directly above the package_name
field declaration.
package_name: str | |
package_name: str = dataclasses.field(metadata={'description': 'The package name of the Android application.'}) |
self.endpoint_url = endpoint_url | ||
self.schema_type = schema_type | ||
self.content_url = content_url | ||
endpoint_url: str |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a docstring to the name
field to describe its purpose. This will improve code clarity and maintainability. For example: name: str = Field(description='The name of the API schema.')
endpoint_url: str | |
name: str = Field(description='The name of the API schema.') |
@@ -10,11 +10,10 @@ | |||
class DomainName(asset.Asset): | |||
"""Domain Name target asset per RFC 1034 and 1035.""" | |||
|
|||
def __init__(self, name: str): | |||
self.name = name | |||
name: str |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding a default value for the name
field or using typing.Optional[str]
if the field can be None. This enhances code clarity and prevents potential errors if the value is not provided during object instantiation. For example: name: str = ""
or name: typing.Optional[str] = None
depending on your specific use case. Only apply this fix if the field is intended to be optional. If the field is required, do not apply the suggested fix. Also, this applies to other dataclasses in the project if they lack default values and might be None during runtime.
name: str | |
name: str = "" |
self.content = content | ||
self.path = path | ||
self.content_url = content_url | ||
content: Optional[bytes] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency, either add default values for path
and content_url
as well, or remove the default value from content
to ensure uniform class initialization.
content: Optional[bytes] = None |
self.content = content | ||
self.path = path | ||
self.content_url = content_url | ||
content: Optional[bytes] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a docstring to explain the purpose of the content field. This improves readability and maintainability. For example: """The content of the IPA file, if available."""
content: Optional[bytes] = None | |
content: Optional[bytes] = None | |
"""The content of the IPA file, if available.""" |
@@ -10,8 +10,7 @@ | |||
class IOSStore(asset.Asset): | |||
"""Ios bundle target asset.""" | |||
|
|||
def __init__(self, bundle_id: str): | |||
self.bundle_id = bundle_id | |||
bundle_id: str |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove the comment. It's redundant since the code change itself should be clear enough. If the fix is complex, a more detailed explanation of the fix or a link to the issue it resolves would be more helpful than a simple confirmation that it's correct.
bundle_id: str |
@@ -10,8 +10,7 @@ | |||
class IOSTestflight(asset.Asset): | |||
"""iOS testflight target asset.""" | |||
|
|||
def __init__(self, application_url: str): | |||
self.application_url = application_url | |||
application_url: str |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a comment to describe the purpose of the application_url field. This improves code readability and maintainability by providing context. For example: application_url: str # The URL of the iOS Testflight application.
application_url: str | |
application_url: str # The URL of the iOS Testflight application. |
This PR fixes an issue in some asset classes as defined in the package ostorlab/assets.
Issue:
Every asset class is a
dataclass
(e.g.,File
,AndroidStore
). This causes the__eq__
method to be implicitly overridden.The
dataclass
implementation of__eq__
compares only the fields explicitly defined as data fields. However, fields that are initialized inside__init__
but not declared as data fields are ignored in equality comparisons.Example:
ostorlab.asset.IOSStore
This causes two instances of
IOSStore
to be compared as equal (==
) even if thebundle_id
field is different. As a result, some tests may falsely succeed.Example in
agent_cloud_persist_vulnz
:Fix:
Make all fields of an asset class data fields.
Update:
This PR fixes an existing unit test:
tests/runtimes/definitions_test::testAssetGroupDefinitionFromYaml_whenYamlIsValid_returnsValidAssetGroupDefinition
, which has a bug and started failing against the current PR (see diff).