Speedup create_list_without_duplicates function #795

jspeed-meyers
Contributor

Fix #794.

The current implementation checks whether each element is already in a
list, which is a relatively expensive (linear-time) operation. This
commit introduces a set membership check, which is constant time on
average.

Signed-off-by: John Speed Meyers <jsmeyers@chainguard.dev>
@jspeed-meyers
Contributor Author

It looks like CI is failing with TypeError: unhashable type: 'Actor'. This seems similar to issue #792.

The tests passed locally on my machine (macOS 14.2.1, Python 3.11.7). Do the CI tests differ from the local tests in some way? I apologize. I wouldn't have submitted this PR if I had known my local tests weren't a good representation of the CI tests.

Any ideas?
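
For reference, a minimal sketch of how this error can arise. It assumes Actor is a plain dataclass, which the thread does not confirm; the class and its fields below are hypothetical stand-ins, not the project's real model. A dataclass declared with the default eq=True and without frozen=True has its __hash__ set to None, so its instances cannot be added to a set.

```python
from dataclasses import dataclass


@dataclass  # default eq=True without frozen=True sets __hash__ to None
class Actor:  # hypothetical stand-in, not the project's real class
    name: str
    email: str


def try_add_to_set(element):
    """Return the TypeError message raised when adding element to a set, or None."""
    seen_elements = set()
    try:
        seen_elements.add(element)  # requires element to be hashable
    except TypeError as error:
        return str(error)
    return None


error_message = try_add_to_set(Actor("Jane", "jane@example.com"))
print(error_message)
```

Hashable values such as strings go through the same helper without an error, which is one way a test suite that never feeds dataclass instances to the function could still pass.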

     for element in list_with_potential_duplicates:
-        if element not in list_without_duplicates:
+        if element not in seen_elements:
+            seen_elements.add(element)
Member

Instead of storing the element itself, you could store the element's ID or, similar to #792, store astuple(element). This could solve the current CI failure.
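
A rough sketch of the astuple variant, under stated assumptions: the elements are dataclass instances whose fields are themselves hashable (astuple leaves list and dict fields as lists and dicts, which would still fail), and the Actor class here is a hypothetical stand-in. The function name matches the PR title, but the body is an illustration, not the project's actual code.

```python
from dataclasses import astuple, dataclass, is_dataclass


@dataclass
class Actor:  # hypothetical stand-in, not the project's real class
    name: str
    email: str


def create_list_without_duplicates(list_with_potential_duplicates):
    """Remove duplicates while keeping first-seen order."""
    list_without_duplicates = []
    seen_keys = set()
    for element in list_with_potential_duplicates:
        if is_dataclass(element) and not isinstance(element, type):
            # Use a hashable key; the type is included so that two different
            # dataclasses with equal field values are not treated as equal.
            key = (type(element), astuple(element))
        else:
            # Assumption: non-dataclass elements are already hashable.
            key = element
        if key not in seen_keys:
            seen_keys.add(key)
            list_without_duplicates.append(element)
    return list_without_duplicates


actors = [
    Actor("Jane", "jane@example.com"),
    Actor("Bob", "bob@example.com"),
    Actor("Jane", "jane@example.com"),
]
unique_actors = create_list_without_duplicates(actors)
```

The set holds only the keys, while the result list holds the original objects, so callers still get Actor instances back in their original order.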

Contributor Author

Cool! I'll look into this soon.

@jspeed-meyers
Contributor Author

On second thought: I'm not sure this change is worth it. Sorry for the noise!

Successfully merging this pull request may close these issues.

create_list_without_duplicates Function Can be Sped Up By Using Set
2 participants