Zarr version
3.1.3
Numcodecs version
0.16.3
Python Version
3.13.4
Operating System
Windows 11 24H2
Installation
using pip into virtual environment
Description
I am encountering an issue when writing to a Zarr array from multiple processes. (More specifically: I misuse Dask's multiprocessing by writing into a Zarr array inside a map_blocks function.) The problem occurs when the array chunks contain multiple elements: writing works correctly when each chunk contains only a single element, but fails when the chunk size is larger.
Example:
Here group is a Zarr group and zarr_array is another Zarr array. My code that writes to b_arr from multiple processes works when the array is created like this:
b_arr = group.create_array(
    name=name,
    shape=zarr_array.cdata_shape,
    chunks=(1, 1),
    dtype=np.uint64,
    fill_value=0,
)
but fails when it is created like this:
b_arr = group.create_array(
    name=name,
    shape=zarr_array.cdata_shape,
    chunks=zarr_array.cdata_shape,
    dtype=np.uint64,
    fill_value=0,
)
I believe that Zarr should work when multiple processes concurrently write into an array, so I consider this a bug.
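The behavior described above is consistent with a lost-update race: to change a single element, the writer must read the whole containing chunk, modify it, and write the whole chunk back, so two processes touching different elements of the same chunk can silently overwrite each other. A minimal sketch of that race, simulating the chunk read-modify-write with plain NumPy (this is an illustration of the failure mode, not Zarr's actual code path):

```python
import numpy as np

# One "chunk" of 4 elements, standing in for the on-disk chunk.
store = {"chunk": np.zeros(4, dtype=np.uint64)}

def partial_write(snapshot, index, value):
    # To update one element, the full chunk is materialized,
    # modified, and returned for write-back.
    chunk = snapshot.copy()
    chunk[index] = value
    return chunk

# Two workers read the same on-disk state before either commits
# (the interleaving that can occur with multiple processes):
seen_by_a = store["chunk"].copy()
seen_by_b = store["chunk"].copy()

store["chunk"] = partial_write(seen_by_a, 0, 1)  # worker A commits
store["chunk"] = partial_write(seen_by_b, 1, 2)  # worker B overwrites A

print(store["chunk"])  # worker A's write to index 0 is lost: [0 2 0 0]
```

With chunks=(1, 1) each element lives in its own chunk, so no two writers share a chunk and the race disappears, matching the working case above.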
Steps to reproduce
# /// script
# requires-python = ">=3.11"
# dependencies = [
# "zarr@git+https://github.com/zarr-developers/zarr-python.git@main",
# ]
# ///
#
# This script automatically imports the development branch of zarr to check for issues
import zarr
# your reproducer code
# zarr.print_debug_info()
Additional output
No response