py2/py3: mutable objects and hashes #268
Comments
It is also possible to make the objects immutable, but that would break backward compatibility, and I'm not sure the result would be usable.
I just got bitten by this... Having said that, I vote for the "Combine the best of both worlds by implementing `__hash__` for objects that have some immutable attributes" approach.
There's another problem too, which is that we've got things like rdata which are currently mutable but which have non-trivial hashes. For these, it might be best to just make them immutable. I don't think that's much of an inconvenience in most cases. It might be annoying for people processing zones though. Imagine something supplying a "change every MX target name equal to X to Y" operation. You could write this by direct mutation today, but that code would need to change to do a replace instead of a mutation. I think this is probably OK, but we should definitely agree this is the way to go before anyone writes anything :)
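The "replace instead of mutate" style mentioned above could look roughly like the sketch below. `MX` here is a hypothetical frozen dataclass standing in for an immutable rdata type (it is not dnspython's class), and `retarget` is an illustrative helper name.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class MX:
    """Hypothetical immutable stand-in for an MX rdata (not dnspython's class)."""
    preference: int
    exchange: str


def retarget(records, old, new):
    """Return a new list in which every MX pointing at `old` points at `new`."""
    return [replace(r, exchange=new) if r.exchange == old else r for r in records]


zone_mx = [MX(10, "mail.example."), MX(20, "backup.example.")]
updated = retarget(zone_mx, "mail.example.", "mx1.example.")

# Direct mutation is rejected on a frozen dataclass.
try:
    zone_mx[0].exchange = "mx1.example."
    mutation_allowed = True
except FrozenInstanceError:
    mutation_allowed = False
```

The original records are left untouched, so anything already holding them in a set or as dictionary keys keeps working.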
I have no idea how much trouble it would be to make rdata & rrsets immutable... In general I'm in favor of this change, but I really do not have enough data to judge the impact on users.
I'll ask a question about it on dnspython-users. Another alternative is to leave things as-is, define hashes for other structures (like rdatasets and rrsets), and simply say "if you're using an object in a dictionary, you'd better not be changing it". This is (possibly!) pragmatic, in that it allows both the "treat as immutable" and "manipulate rdata and rdatasets by modifying them" usage, but is also dangerous in that inevitably someone will mutate an object that is in a dictionary, changing its hash.
rdata is now immutable
Almost all objects in dnspython that implement `__eq__` have no `__hash__` method, and the majority of them are mutable. This has several consequences.

Let's assume the following class:
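A minimal sketch of such a class, assuming a hypothetical `Record` with a single mutable attribute `value` (the name and the attribute are illustrative and not taken from dnspython):

```python
class Record:
    """Hypothetical mutable class with custom equality and no __hash__."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, Record):
            return NotImplemented
        return self.value == other.value


a = Record(1)
b = Record(1)
same_value = a == b       # compares by value
same_object = a is b      # two separate objects
```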
Example 1. `__hash__` not implemented

- Py2: uses `id()` as hash (hash may be randomized in newer py versions)
- Py3: `unhashable type` error if `__eq__` is implemented without `__hash__`
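Both behaviours can be checked on Python 3 with a small sketch. `Record` is a hypothetical class, and `Py2StyleRecord` is an assumption used only to imitate the Python 2 default by restoring the identity-based `object.__hash__`.

```python
class Record:
    """Hypothetical class: custom __eq__, no __hash__ (illustrative only)."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, Record) and self.value == other.value


class Py2StyleRecord(Record):
    # Assumption: restoring object.__hash__ imitates Python 2's identity-based default.
    __hash__ = object.__hash__


# Python 3: defining __eq__ alone sets __hash__ to None, so hashing fails.
try:
    hash(Record(1))
    py3_hashable = True
except TypeError:
    py3_hashable = False

# Python 2 style: the objects are equal, yet their identity-based hashes differ.
a, b = Py2StyleRecord(1), Py2StyleRecord(1)
equal = a == b
same_hash = hash(a) == hash(b)
distinct_members = len({a, b})  # both objects are kept by the set
```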
However, because `id()` is used as the hash, Py2 behaves unexpectedly: objects that compare equal still get different hashes.

Example 2. `__hash__` implemented

Python 2/3:
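A sketch of the well-behaved case, assuming the same hypothetical `Record` class with a `__hash__` derived from the attribute that `__eq__` compares:

```python
class Record:
    """Hypothetical class whose hash agrees with its equality."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, Record) and self.value == other.value

    def __hash__(self):
        # Hash exactly what __eq__ compares, so a == b implies hash(a) == hash(b).
        return hash(self.value)


a, b = Record(1), Record(1)
members = {a, b}                  # equal objects collapse into one member
lookup_works = Record(1) in members
```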
However, as said before, the majority of objects in dnspython are mutable, and changing an attribute may cause undesired side effects wherever `hash()` has been used.

Python 2/3:
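The side effect can be sketched as follows, again with the hypothetical `Record` class. The rebuild step goes through a list on the assumption that copying a set directly may reuse the stale stored hashes.

```python
class Record:
    """Hypothetical mutable class whose hash follows its value."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, Record) and self.value == other.value

    def __hash__(self):
        return hash(self.value)


r = Record(1)
seen = {r}

r.value = 2  # the hash of r changes while it sits in the set

still_found = r in seen                      # lookup uses the new hash and misses
old_value_found = Record(1) in seen          # the stored object no longer equals Record(1)
present_by_scan = any(x is r for x in seen)  # yet the object is still stored

# Re-adding every element through a list forces fresh hashing.
rebuilt = set(list(seen))
found_after_rebuild = r in rebuilt
```

The set still contains the object, but neither the old nor the new value can find it until the set is rebuilt.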
Summary

- Py3 disables `hash()` based on `id()` when `__eq__` is implemented ==> safe.
- Py2 allows a custom `__eq__`, but because by default it uses the `id()`-based `hash()`, it violates the basic assumption `a == b --> hash(a) == hash(b)` and may give incorrect results when sets are used (or any functions which rely on hashing) [Example 1].
- A `__hash__` method that respects `a == b --> hash(a) == hash(b)` solves [Example 1] for immutable objects, but because dnspython uses mainly mutable objects we hit [Example 2]. If an object used in a set (for example) is changed, the set is incorrect and must be recreated to work again.

Possible solutions

- Use `id()` for `__hash__` => may result in unexpected behavior.
- Combine the best of both worlds by implementing `__hash__` for objects that have some immutable attributes (using only these attributes to compute the hash) and disabling `__hash__` for objects that have no immutable attributes. [This can be done with deprecation warnings, as mentioned before.]
- [...] `id()`.

It is good practice to implement a custom `__hash__` together with a custom `__eq__` so that the results satisfy `a == b --> hash(a) == hash(b)`, or to disable hashing of such objects altogether. I think that py2 and py3 should produce the same result, and IMO it is safer to disable `__hash__` for objects that are mutable.
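The "best of both worlds" option could be sketched like this. `Named` and `Bag` are hypothetical classes, not dnspython types: the first hashes only its read-only attribute, the second has no immutable attributes and opts out of hashing with `__hash__ = None`.

```python
class Named:
    """Hypothetical object: `name` is fixed at construction, `ttl` stays mutable."""

    def __init__(self, name, ttl):
        self._name = name
        self.ttl = ttl

    @property
    def name(self):
        return self._name  # read-only: no setter is defined

    def __eq__(self, other):
        if not isinstance(other, Named):
            return NotImplemented
        return self._name == other._name and self.ttl == other.ttl

    def __hash__(self):
        # Only the immutable attribute feeds the hash, so mutating ttl is harmless.
        return hash(self._name)


class Bag:
    """Hypothetical fully mutable object: hashing is disabled explicitly."""

    __hash__ = None

    def __init__(self, items):
        self.items = list(items)

    def __eq__(self, other):
        return isinstance(other, Bag) and self.items == other.items


n = Named("example.", 300)
seen = {n}
n.ttl = 60                 # allowed, and the hash is unchanged
still_found = n in seen

try:
    hash(Bag([1, 2]))
    bag_hashable = True
except TypeError:
    bag_hashable = False
```

Equal `Named` objects always share a name, so `a == b --> hash(a) == hash(b)` still holds, at the cost of more hash collisions between objects that differ only in mutable attributes.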