Creating a helper function dassert_no_duplicates_dict_keys in hpandas #1091

Open
smitpatel49 opened this issue Jul 22, 2024 · 6 comments

Comments

@smitpatel49
Contributor

smitpatel49 commented Jul 22, 2024

Follow-up on #1075.

We want to add a new helper function dassert_no_duplicates_dict_keys here.

We also need to add unit tests for the new function. In addition, we want to remove the hdbg.dassert_no_duplicates calls in the codebase that take dictionary keys as input.

FYI @samarth9008

@smitpatel49
Contributor Author

I just wanted some clarity on creating this function. Are we trying to prevent a key entry from being overwritten if it already exists, or do we want to keep the latest entry, as Python does by default? @gpsaggese, @samarth9008

@samarth9008
Collaborator

We want to check for duplicate entries and assert if there are any.

@smitpatel49
Contributor Author

For dictionary keys, that check can be done with dassert_no_duplicates, but it will always pass because of how Python works: by the time the dictionary exists, its keys are already unique.

@samarth9008
Collaborator

samarth9008 commented Jul 25, 2024

Could there be any other way to check for duplicate keys? Like a hack, or something out of the box?

@smitpatel49
Contributor Author

smitpatel49 commented Jul 25, 2024

I don't think there is a way to check for a duplicate key entry if the duplication happens in the dictionary literal itself, i.e., while defining it. For example:

dict_ = {
    "dummy_value_1": "1, 2",
    "dummy_value_2": "A, B",
    "dummy_value_1": "4, 5",
}

The first "dummy_value_1" entry will simply be overwritten, and we will get:

dict_ = {
    "dummy_value_1": "4, 5",
    "dummy_value_2": "A, B",
}
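
The overwrite can be confirmed with a short check. This is only a minimal sketch in plain Python: has_duplicates is a hypothetical stand-in for a duplicate check on the keys, not the actual hdbg.dassert_no_duplicates implementation.

```python
# A dict literal with a repeated key: Python keeps the first position of the
# key but the last value assigned to it.
dict_ = {
    "dummy_value_1": "1, 2",
    "dummy_value_2": "A, B",
    "dummy_value_1": "4, 5",
}


def has_duplicates(values) -> bool:
    """Hypothetical helper: return True if any value occurs more than once."""
    values = list(values)
    return len(values) != len(set(values))


# Only two entries survive the literal, so a duplicate check on the keys
# has nothing left to detect.
print(len(dict_))
print(dict_["dummy_value_1"])
print(has_duplicates(dict_.keys()))
```

Since the duplicate is gone before any function receives the dictionary, a check that takes the dictionary (or its keys) as input cannot see it.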

If we want to update the dictionary, we can add a condition so that a new entry is accepted only if its key does not already exist. Alternatively, if we are creating the dictionary from something like a list that contains repeated keys, and we want to append/update rather than overwrite, we can use something like defaultdict.
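
Those two options could look roughly like the sketch below. It rests on assumptions: add_new_key and group_pairs are hypothetical names used for illustration and are not part of hpandas or hdbg. Only collections.defaultdict comes from the standard library.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Tuple


def add_new_key(dict_: Dict[str, Any], key: str, value: Any) -> None:
    """Hypothetical helper: insert `key` only if it is not already present."""
    if key in dict_:
        raise AssertionError(f"Duplicate key: '{key}'")
    dict_[key] = value


def group_pairs(pairs: Iterable[Tuple[str, Any]]) -> Dict[str, List[Any]]:
    """Hypothetical helper: collect all values per key instead of overwriting."""
    grouped: Dict[str, List[Any]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


# The input is a list of (key, value) pairs, so the repeated key is still visible.
pairs = [
    ("dummy_value_1", "1, 2"),
    ("dummy_value_2", "A, B"),
    ("dummy_value_1", "4, 5"),
]
grouped = group_pairs(pairs)
# Keys that received more than one value are the duplicates.
duplicates = [key for key, values in grouped.items() if len(values) > 1]
print(duplicates)
```

Both approaches only work while the entries arrive one at a time or as a list of pairs. Neither can recover a duplicate from a dictionary literal that has already been evaluated.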

@smitpatel49
Contributor Author

Any word on this one, @samarth9008?

@smitpatel49 smitpatel49 removed their assignment Aug 31, 2024