Exceptions when creating an array that has an object #806
Comments
@abergou: and just to confirm, does passing |
Thanks for the quick reply @joshmoore! Where do I pass in
>>> import numcodecs
>>> import numpy
>>> import zarr
>>> foo = zarr.open('foo')
>>> foo.create('bar', dtype=[('x', float), ('y',object)], shape=(10, 20), object_codec=numcodecs.Pickle(), data=numpy.zeros((10, 20), dtype=[('x', float), ('y', object)]))
TypeError Traceback (most recent call last)
...
MetadataError: error decoding metadata: Cannot change data-type for object array.
Quick note: I fixed up the second reproducer I had above. I had a copy-paste error that I didn't notice before. |
This was an example that was working for me:
import numpy as np
import zarr
# Structured dtype with only fixed-width fields (no object fields).
t = [
    ("label-value", int),
    ("r", int),
    ("g", int),
    ("b", int),
    ("a", int),
    ("object-type", "U20"),
    ("object-id", int),
    ("description", "U200")]
data = list()
for x in range(100):
    data.append((1, 100, 100, 100, 100, "Mask", 123456, "some text here"))
    data.append((2, 200, 200, 200, 200, "Mask", 567896, "some more text"))
a = np.array(data, dtype=t)
z = zarr.open("s.zarr")
z.array(name="a", data=a, chunks=(10,)) |
Ah thanks! I missed the
>>> import numcodecs
>>> import numpy
>>> import zarr
>>> data = numpy.zeros((10, 20), dtype=[('x', float), ('y',object)])
>>> foo = zarr.open('foo')
>>> foo.array('bar', data, object_codec=numcodecs.Pickle())
MetadataError Traceback (most recent call last)
...
MetadataError: error decoding metadata: Cannot change data-type for object array. |
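The error message in the reproducer above matches what NumPy raises when raw bytes are reinterpreted as a dtype that contains objects. The following is a minimal sketch of that behaviour using NumPy only; it assumes, without confirmation from the zarr source, that the metadata decoding path does something similar with `.view()`. The variable names are illustrative.

```python
import numpy as np

# Same structured dtype as in the reproducer: one float field, one object field.
dtype = np.dtype([("x", float), ("y", object)])

# A structured dtype reports kind 'V' even when it holds Python objects;
# hasobject is what distinguishes it from a plain fixed-width record.
print(dtype.kind, dtype.hasobject)

# Raw bytes of the right size, as they would look after base64 decoding.
raw = np.zeros((), dtype=f"V{dtype.itemsize}")

try:
    raw.view(dtype)  # reinterpreting raw bytes as object pointers is refused
    view_failed = False
except TypeError as exc:
    view_failed = True
    print("view failed:", exc)
```

The exact wording of the exception differs between NumPy versions, so only the exception type should be relied on.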
One more issue that is also related to the above: for some object arrays zarr can silently change the object type in the array:
>>> import collections
>>> import numcodecs
>>> import zarr
>>> x = zarr.open('x')
>>> y = x.create('y', shape=(2, 2), dtype='O', fill_value=collections.Counter(), object_codec=numcodecs.Pickle())
>>> y[0, 0]
{}
>>> type(y[0, 0])
dict |
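One plausible explanation for the `Counter` coming back as a plain `dict` is that the fill value is written to the JSON array metadata without going through `object_codec`. This is an assumption, not something confirmed in the thread. The sketch below uses only the standard library to show that a JSON round trip loses the subclass:

```python
import collections
import json

fill_value = collections.Counter()

# Counter is a dict subclass, so json serialises it as a plain JSON object.
encoded = json.dumps(fill_value)

# Nothing in the JSON text records the original class, so decoding yields dict.
decoded = json.loads(encoded)

print(encoded, type(decoded).__name__)
```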
@joshmoore I actually have a patch for this that I'll submit a pull request for imminently. |
* Ensures that the fill value of structured arrays that contain objects is encoded using object_codec.
💯 |
* Fix structured arrays that contain objects #806
* Ensures that the fill value of structured arrays that contain objects is encoded using object_codec.
* Add test and fix-up to ensure compatibility
* Update docs/release.rst
* Fixup unit testss
  Don't specify protocol: makes unit tests pass in python3.7
  N5 doesn't support object codecs
* Fixup linting error
  Explicitly handle an error condition that can only happen if encode_fill_value or decode_fill_value are directly called.
* Add encode/decode tests for codecov
* Explicitly import Pickle from numcodecs for mypy
* Migrate test from #702
  With thanks to @ombschervister
* Install types-setuptools for CI
Co-authored-by: Attila Bergou <attila@alumni.cmu.edu>
Co-authored-by: Josh Moore <j.a.moore@dundee.ac.uk>
Co-authored-by: jmoore <josh@glencoesoftware.com>
Closed by #813 (v2.9.4) |
Thanks @joshmoore ! |
I noticed two issues:
I think that the issue is in the functions encode_fill_value and decode_fill_value. A structured dtype that contains an object reports its kind as 'V', so zarr encodes it using standard_b64encode, but if dtype.hasobject is true then it should first pickle the fill_value and only then encode it.
None for object arrays:
zarr.__version__: 2.8.3
numcodecs.__version__: 0.6.4 -- 0.8.0
Edit: I cleaned up the second example (I copied and pasted an incorrect reproducer here).
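The approach suggested in the issue description (pickle the fill value first, then base64-encode the result) can be sketched roughly as follows. This is not the code from the actual fix: the function names `encode_object_fill_value` and `decode_object_fill_value` are hypothetical, and the standard-library `pickle` module stands in for whatever `object_codec` the array was created with.

```python
import base64
import collections
import pickle


def encode_object_fill_value(value):
    """Hypothetical helper: serialise the object, then make it JSON-safe text."""
    payload = pickle.dumps(value)  # stand-in for object_codec.encode
    return base64.standard_b64encode(payload).decode("ascii")


def decode_object_fill_value(text):
    """Hypothetical helper: reverse the two steps in the opposite order."""
    payload = base64.standard_b64decode(text)
    return pickle.loads(payload)  # stand-in for object_codec.decode


original = collections.Counter(a=2)
encoded_text = encode_object_fill_value(original)
restored = decode_object_fill_value(encoded_text)

print(type(restored).__name__, restored == original)
```

Because the pickled payload records the class of the object, the restored value keeps its original type, which a plain JSON representation would not.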