WIP: Include some tests of store equality #357
Thanks @jakirkham. FWIW we might need to think a bit about what it should mean for two stores to compare equal. Currently, for example, two |
So I'm pretty sure that the |
OK, Struggling to find where |
Reviewing the current tests, I think the current implementations of Is there a reason for considering a different behaviour for the store |
FWIW if we switch to using DictStore as the default store, then #348 should go away. I.e., we will get consistent behaviour for array equals comparison across all stores, which is essentially that two arrays will compare equal iff they share the same storage. |
Hello @jakirkham! Thanks for updating the PR.
Comment last updated on February 13, 2019 at 08:41 Hours UTC |
Provide a few tests that check to see if different stores compare equal or not.
As the check against `root` will turn into a recursive check throughout the `DictStore`'s contents, make it the last thing checked. That way, if any of the fast checks invalidate equality, we can short-circuit before the content check.
Make sure to compare `DirectoryStore`'s contents if their paths differ instead of just returning `False`.
Instead of trying to implement a custom `__eq__` for `NestedDirectoryStore`, fall back to the `DirectoryStore` implementation once we have verified they are the same type.
Ensure that if the two `ZipStore`s are in two different places, we still check their contents.
Make sure that if none of the simple checks immediately passes or fails `__eq__`, we fall back to comparing the contents of the two stores.
This reverts commit 9d01fb6.
This reverts commit d481d9d.
This reverts commit f6c7fd4.
This reverts commit d425945.
The equality test for stores requires not only that stores have the same content, but that they actually be views onto the same data. Hence this test was previously incorrect. So we now force this case of two stores with identical data to be not equal.
As the `==` comparison here results in a recursive check through the `DictStore`'s contents and we want `__eq__` to test identity instead, ensure that the two `root`s are in fact the same object as opposed to testing their equality.
Ensure that two stores are only equal if they view the same data.
Make sure that equality means they view the same data.
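A minimal sketch of the identity-based equality these commits describe, assuming a hypothetical dict-backed store. `SketchDictStore` and its attributes are illustrative names, not zarr's actual `DictStore`. The cheap type checks run first, and `root` is compared with `is`, so no recursive walk through the contents is triggered:

```python
class SketchDictStore:
    """Hypothetical dict-backed store, for illustration only."""

    def __init__(self, root=None, cls=dict):
        self.cls = cls
        self.root = cls() if root is None else root

    def __eq__(self, other):
        # Cheap checks first; `and` short-circuits on the first failure.
        return (
            isinstance(other, SketchDictStore)
            and self.cls is other.cls
            # Identity, not ==: equal only if both view the same mapping.
            and self.root is other.root
        )


shared = {"foo": b"bar"}
a = SketchDictStore(shared)
b = SketchDictStore(shared)        # second view onto the same mapping
c = SketchDictStore(dict(shared))  # separate copy with identical data
```

Under this sketch `a == b` holds because both wrap the same `shared` mapping, while `a == c` does not, even though `a.root == c.root`.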
So I've made some changes in line with the intent. There will still be a failure for the |
I'm not quite following this. Wouldn't we want to ensure that two Arrays have the same content regardless of where they are stored? Would also be curious why we wouldn't want to compare Groups' contents as well. Thoughts? |
On Wed, 13 Feb 2019 at 08:47, jakirkham ***@***.***> wrote:
So I'm pretty sure that the __eq__() implementations on the store classes
are not currently tested directly, however they will be used inside the
Array.__eq__() implementation
<https://github.com/zarr-developers/zarr/blob/master/zarr/core.py#L416>
and in the Group.__eq__() implementation
<https://github.com/zarr-developers/zarr/blob/master/zarr/hierarchy.py#L184>
.
I'm not quite following this. Wouldn't we want to ensure that two Arrays
have the same content regardless of where they are stored?
The tests I linked to currently use __eq__ to verify that two arrays draw
their data from the same store (or two groups draw their data from the same
store). They are not trying to compare the actual data. Part of that
behaviour involves testing that two store objects are "equal" in the sense
that they both draw their data from the same source (e.g., the same
directory on a file system). Again that doesn't involve comparing any
actual data. Those tests all need some way of making this kind of
comparison, although perhaps __eq__ is not the right method for it.
I could imagine in other situations you might want to compare the actual
data contents of two arrays (or two groups or two stores) and return true
if the data are the same, regardless if drawn from the same data source or
not. That's a separate requirement. Maybe __eq__ should behave like that.
In this case, should zarr arrays behave like numpy arrays, and return an
array of boolean values with broadcasting etc.?
Bottom line, I think there are two separate requirements for arrays: (1)
compare two arrays and return True if they are "the same" in the sense that
they both draw their data from the same source; and (2) compare the data
contents of two arrays and return something.
Similarly, I think there are two separate requirements for groups: (1)
return True if two groups draw their data from the same source; (2) compare
the contents of two groups.
Similarly for stores: (1) return True if two store objects draw their data
from the same underlying source; (2) compare the contents of two stores.
Now the question is, what should __eq__ do? And what other methods do we
need to cover any remaining requirements not covered by __eq__?
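The two store-level requirements above can be sketched as separate helpers. This is only an illustration under stated assumptions: `same_source` and `same_contents` are hypothetical names, not zarr functions, the stores are plain in-memory mappings, and "same source" is approximated by object identity (a directory-backed store would presumably compare paths instead):

```python
def same_source(a, b):
    """Requirement (1): both names refer to the same underlying mapping.

    Assumption: identity stands in for "same data source" here.
    """
    return a is b


def same_contents(a, b):
    """Requirement (2): both mappings hold the same keys and values.

    Assumption: values are plain bytes, so == yields a single bool.
    """
    if a.keys() != b.keys():
        return False
    return all(a[key] == b[key] for key in a)


s1 = {"x/.zarray": b"{}", "x/0": b"\x00\x01"}
s2 = dict(s1)  # independent copy with identical data
```

Here `s1` and `s2` satisfy `same_contents` but not `same_source`, which is exactly the case where the two requirements give different answers and `__eq__` can honour only one of them.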
Provide a few tests that check to see if different stores compare equal or not.
Note: Currently this fails for a few stores. Hopefully this will be a good point to debug and fix those.
TODO:
tox -e docs