Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix several accidentally quadratic functions #800

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

jonjohnsonjr
Copy link

No description provided.

Signed-off-by: Jon Johnson <jon.johnson@chainguard.dev>
@@ -73,3 +73,6 @@ def __init__(
comment: Optional[str] = None,
):
check_types_and_set_values(self, locals())

def __hash__(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not a correct hash of the relationship. One might instead return the triple and use it for lookups in the cache

Copy link

@billie-alsup billie-alsup left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few questions and concerns provided inline. Also, it is unclear how spdx_none and spdx_no_assertion are handled in your set. Are they automatically translated to the strings "NONE" and "NOASSERTION" ?

@@ -73,3 +73,6 @@ def __init__(
comment: Optional[str] = None,
):
check_types_and_set_values(self, locals())

def __hash__(self):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From python docs:

If a class does not define an eq() method it should not define a hash() operation either;

(Note I am not a python expert.)

@@ -52,9 +52,6 @@ def parse_all_relationships(self, input_doc_dict: Dict) -> List[Relationship]:
)

package_dicts: List[Dict] = input_doc_dict.get("packages", [])
existing_relationships_without_comments: List[Relationship] = self.get_all_relationships_without_comments(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused by the removal of this code. The semantics will be different between this and previous call right? This one would include the additional relationships found from the relationships.extend call above. To preserve the previous behavior, it seems that you would instead want to simply retrieve the set identified by parser_field_or_log_error, and then extend relationships with that set. But then you could also update the existing_relationships_without_comments from that new set as well. The question is whether you are purposefully changing the semantics here or not.

@@ -44,7 +48,7 @@ def is_external_doc_ref_present_in_document(external_ref_id: str, document: Docu


def validate_spdx_id(
spdx_id: str, document: Document, check_document: bool = False, check_files: bool = False
spdx_id: str, document: Document, all_spdx_ids: Set[str], check_document: bool = False, check_files: bool = False

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should you not also have an files_spdx_ids: Set[str] argument for the check_files case? My SPDX file has over 70K entries. The check_document seems redundant here, since you could simply make all_spdx_ids: Optional[Set[str]] instead and go by whether it was passed as None or not.

from spdx_tools.spdx.validation.validation_message import SpdxElementType, ValidationContext, ValidationMessage
from tests.spdx.fixtures import document_fixture, relationship_fixture


@pytest.mark.parametrize("related_spdx_element", ["SPDXRef-Package", SpdxNoAssertion(), SpdxNone()])
def test_valid_relationship(related_spdx_element):
relationship = Relationship(DOCUMENT_SPDX_ID, RelationshipType.DESCRIBES, related_spdx_element, comment="comment")
validation_messages: List[ValidationMessage] = validate_relationship(relationship, "SPDX-2.3", document_fixture())
validation_messages: List[ValidationMessage] = validate_relationship(relationship, "SPDX-2.3", document_fixture(), get_all_spdx_ids(document_fixture()))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, these are two different documents that have the same content. I would rather see a single document used, as was done for other functions, i.e. save document_fixture() into a local variable, and then use that as an argument to validate_relationship, and to get_all_spdx_ids.

@@ -40,7 +41,7 @@ def test_unknown_spdx_id(spdx_element_id, related_spdx_element_id, expected_mess
relationship: Relationship = relationship_fixture(
spdx_element_id=spdx_element_id, related_spdx_element_id=related_spdx_element_id
)
validation_messages: List[ValidationMessage] = validate_relationship(relationship, "SPDX-2.3", document_fixture())
validation_messages: List[ValidationMessage] = validate_relationship(relationship, "SPDX-2.3", document_fixture(), get_all_spdx_ids(document_fixture()))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer a single document object.

@billie-alsup
Copy link

Are there any requirements for backward compatibility of any of the functions whose API has changed?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants