Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Equivalent comparison operation in membership test operations for set container type is not 100% accurate. #116557

Closed
ThePhar opened this issue Mar 10, 2024 · 5 comments
Labels
docs Documentation in the Doc dir

Comments

@ThePhar
Copy link

ThePhar commented Mar 10, 2024

Documentation

The documentation for how the in and not in operator tests work describes the following:

For container types such as list, tuple, set, frozenset, dict, or collections.deque, the expression x in y is equivalent to any(x is e or x == e for e in y).

https://docs.python.org/3/reference/expressions.html#membership-test-operations

However, for sets this is not 100% the case as it will first attempt to hash the value to see if it exists in the set, and if not, skips over any __eq__ comparisons so the equivalent expression: any(x is e or x == e for e in y) never happens.

Putting aside me jumping down a rabbit hole trying to find out why this doesn't work, it would be nice if the docs could be updated to clarify this additional check that is done for "hashable containers" like sets.

Simplified example:

class Foo:
    def __init__(self, value: bool):
        self.value = value

    def __eq__(self, other):
        return self.value == bool(other)

    def __hash__(self):
        return hash(self.value)

a = Foo(True)

print(a in ('True',))                           # True  (__eq__ is called this works fine for lists/tuples)
print(a in {'True'})                            # False (__eq__ is never called)
print(a in {True})                              # True  (__eq__ is called after __hash__ match)
print(any(a is e or a == e for e in {'True'}))  # True  (__eq__ is called)
@ThePhar ThePhar added the docs Documentation in the Doc dir label Mar 10, 2024
@sobolevn sobolevn added the easy label Mar 10, 2024
@sobolevn
Copy link
Member

@ThePhar indeed! Would you like to send a PR? I think it would be something like:

For several container types such as list, tuple, or collections.deque

@sobolevn
Copy link
Member

wait a second :)

@pochmann3
Copy link
Contributor

print(a == 'True')
print(hash(a) == hash('True'))

prints:

True
False

Your class is broken.

From the __hash__ doc:

The only required property is that objects which compare equal have the same hash value

@sobolevn
Copy link
Member

I've checked your example once again and found that you have both __eq__ and __hash__, but they break the __eq__ logic. By Python's rules:

  • If objects are equal
  • their hashes should be equal too

and this is not the case for your Foo class!

class Foo:
     def __init__(self, value: bool):
         self.value = value
     def __eq__(self, other):
         return self.value == bool(other)

f = Foo(1)

print(any(f is e or f == e for e in {1, 2}))  # True
print(any(f is e or f == e for e in {'a', 'b'}))  # True
print(any(f is e or f == e for e in {''}))  # False

For a regular class without __hash__ it works just fine.

@sobolevn sobolevn removed the easy label Mar 10, 2024
@terryjreedy
Copy link
Member

It is documented that a == b should imply that hash(a) == hash(b). Python works hard to make this so for builtins.

hash(1) == hash(1.0) == hash(Decimal(1)) == hash(Fraction(1,1)) == 1
True

When so, hash(a) != hash(b) implies a != b. The doc intentionally says 'equivalent', not 'exactly equal'. The speed shortcut is an CPython implementation detail. not a language requirement. Closing.

Please ask questions about CPython detail on, for instance, https://discuss.python.org/c/users/7, where many more people are likely to see answers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs Documentation in the Doc dir
Projects
None yet
Development

No branches or pull requests

4 participants