It would be useful (especially for testing purposes) to be able to compare xarray datasets containing ManifestArrays for equality.
It's clear when two ManifestArray objects should be considered equivalent: they have exactly the same chunkmanifest and zarray info. However, simply returning this condition from .__eq__ has a few issues:
The array API standard expects .__eq__ to return an entire array of booleans rather than a single scalar. I could make that array, but it seems like a weird thing to do, as you can't actually access the individual elements (only indirect representations of individual chunks). Actually, I could even go further and define equality on a per-chunk basis, returning a numpy array of booleans that would have False at any locations corresponding to non-equivalent ChunkEntry objects in the manifest... But you can't then use this mask to index the array. It also seems like the array API standard expects the returned array type to be Self, i.e. another ManifestArray, not np.ndarray.
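To make the two options concrete, here is a minimal sketch in plain Python. `ChunkEntrySketch`, `whole_array_equal`, and `per_chunk_equal` are hypothetical names used only for illustration, not the library's API, and nested lists stand in for the numpy array of booleans described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkEntrySketch:
    """Hypothetical stand-in for a manifest entry: a byte range in a file."""
    path: str
    offset: int
    length: int


# Assumption: a manifest is modelled as a mapping from chunk grid index to entry.
Manifest = dict[tuple[int, ...], ChunkEntrySketch]


def whole_array_equal(m1: Manifest, z1: dict, m2: Manifest, z2: dict) -> bool:
    """Scalar equivalence: identical manifest and identical zarray metadata."""
    return m1 == m2 and z1 == z2


def per_chunk_equal(m1: Manifest, m2: Manifest, grid_shape: tuple[int, int]) -> list[list[bool]]:
    """Per-chunk equivalence for a 2D chunk grid: one boolean per chunk.

    A chunk missing from both manifests compares as equal (None == None).
    """
    rows, cols = grid_shape
    return [
        [m1.get((i, j)) == m2.get((i, j)) for j in range(cols)]
        for i in range(rows)
    ]


a = {(0, 0): ChunkEntrySketch("f.nc", 0, 100), (0, 1): ChunkEntrySketch("f.nc", 100, 100)}
b = {(0, 0): ChunkEntrySketch("f.nc", 0, 100), (0, 1): ChunkEntrySketch("g.nc", 100, 100)}
zarray = {"shape": (1, 20), "chunks": (1, 10), "dtype": "float32"}

print(whole_array_equal(a, zarray, b, zarray))   # False
print(per_chunk_equal(a, b, grid_shape=(1, 2)))  # [[True, False]]
```

The per-chunk mask has the shape of the chunk grid, not of the array itself, which is why it can't be used to index the array.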
Xarray checks equivalence of wrapped arrays in a complicated, lazy, NaN-aware way that gets tripped up by ManifestArray objects. Following the stack down from xarray.testing.assert_equal, it goes through Variable.equals and duck_array_ops.array_equiv, which calls duck_array_ops.isnull. At this point, because ManifestArray doesn't define any form of isnull, an AttributeError is raised. That error is caught in Variable.equals, so the equality assertion always gets False as its result.
If the condition in the problematic section were just (arr1 == arr2) then we would be fine...
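The failure mode can be reproduced without xarray. The toy below is only loosely modelled on the call chain described above; every name in it (`NoIsnullArray`, `isnull_sketch`, `array_equiv_sketch`, `equals_sketch`) is hypothetical, and the exact exception types caught are an assumption.

```python
class NoIsnullArray:
    """Hypothetical duck array that supports == but has no null check."""

    def __init__(self, values):
        self.values = list(values)

    def __eq__(self, other):
        return self.values == other.values


def isnull_sketch(arr):
    # Stand-in for a null check that relies on something the wrapper lacks.
    return arr.isnull()  # raises AttributeError for NoIsnullArray


def array_equiv_sketch(arr1, arr2):
    # Both operands are evaluated eagerly, so the null check always runs,
    # even when arr1 == arr2 is already True.
    equal = arr1 == arr2
    both_null = isnull_sketch(arr1) & isnull_sketch(arr2)
    return equal | both_null


def equals_sketch(arr1, arr2):
    try:
        return array_equiv_sketch(arr1, arr2)
    except (TypeError, AttributeError):
        # The error is swallowed, so the caller only ever sees False.
        return False


x = NoIsnullArray([1, 2, 3])
print(x == x)               # True
print(equals_sketch(x, x))  # False
```

Because the exception is swallowed, nothing tells the caller that the comparison never actually ran.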
Inside duck_array_ops.isnull there's a weird part that I don't understand. Copied verbatim:
    scalar_type = data.dtype.type
    ...
    elif issubclass(scalar_type, (np.bool_, np.integer, np.character, np.void)):
        # these types cannot represent missing values
        return full_like(data, dtype=bool, fill_value=False)
    else:
        # at this point, array should have dtype=object
        if isinstance(data, np.ndarray):
            return pandas_isnull(data)
        else:
            # Not reachable yet, but intended for use with other duck array
            # types. For full consistency with pandas, we should accept None as
            # a null value as well as NaN, but it isn't clear how to do this
            # with duck typing.
            return data != data
This is going to behave differently for float vs integer dtypes. To support both of these cases, we presumably want to implement both full_like and isnan, and have them both return arrays of False on all ManifestArrays...
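As a rough sketch of that idea, assuming both code paths should simply report "nothing is null": the class and functions below are hypothetical stand-ins, not xarray's or the library's actual API, and flat Python lists replace real arrays.

```python
from dataclasses import dataclass
from math import prod


@dataclass
class ManifestArraySketch:
    """Hypothetical stand-in that knows its shape and dtype kind only."""
    shape: tuple[int, ...]
    dtype_kind: str  # "f" float, "i"/"u" integer, "b" bool (NumPy kind codes)


def full_like_sketch(arr, fill_value):
    """Integer-dtype path: one `fill_value` per element, as a flat list."""
    return [fill_value] * prod(arr.shape)


def isnan_sketch(arr):
    """Float-dtype path: no element can be inspected, so report no NaNs."""
    return [False] * prod(arr.shape)


def isnull_sketch(arr):
    # Dispatch on dtype kind, mirroring the float vs integer split.
    if arr.dtype_kind == "f":
        return isnan_sketch(arr)
    if arr.dtype_kind in ("b", "i", "u"):
        return full_like_sketch(arr, fill_value=False)
    raise NotImplementedError(f"unsupported dtype kind: {arr.dtype_kind!r}")


print(isnull_sketch(ManifestArraySketch((2, 3), "f")))  # six False values
print(isnull_sketch(ManifestArraySketch((2, 3), "i")))  # six False values
```

Whether an all-False answer is actually correct for float data that may contain NaN in the referenced chunks remains the open question here.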