@keewis keewis commented Oct 27, 2025

  • Closes #xxxx
  • Tests added
  • User visible changes (including notable bug fixes) are documented in whats-new.rst

Building on top of zarr-developers/zarr-python#3534, this is a draft PR that allows writing variable-sized chunks to zarr.

To see this in action, try:

# /// script
# requires-python = ">=3.11"
# dependencies = [
#   "xarray @ git+https://github.com/keewis/xarray.git@variable-chunking",
#   "zarr @ git+https://github.com/jhamman/zarr-python.git@feature/rectilinear-chunk-grid",
#   "dask",  # needed for ds.chunk(...)
# ]
# ///

import numpy as np
import xarray as xr

rng = np.random.default_rng(seed=0)
values = rng.normal(size=(365, 20))

ds = xr.Dataset(
    {"a": (["time", "x"], values)},
    coords={"time": xr.date_range("2025-01-01", freq="D", periods=365)},
)
chunked = ds.chunk({"time": xr.groupers.TimeResampler(freq="ME"), "x": 10})

chunked.to_zarr(
    "variable_chunks.zarr",
    mode="w",
    safe_chunks=False,
    zarr_format=3,
    consolidated=False,
)

ds = xr.open_dataset("variable_chunks.zarr", engine="zarr", chunks={})
print(ds.chunksizes)
# Frozen({'time': (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31), 'x': (10, 10)})

At the moment, this requires safe_chunks=False because I haven't changed the chunk alignment machinery yet.
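
For reference, the "time" chunk sizes in the output above are the month lengths of 2025, which is what chunking by a month-end ("ME") resampler should give for a daily time axis. A minimal standard-library sketch to cross-check the expected tuple (the variable names are only illustrative and not part of the PR):

```python
import calendar

# Days per month in 2025. With one value per day and month-end ("ME")
# chunking, each "time" chunk should hold exactly one month of data.
year = 2025
month_lengths = tuple(
    calendar.monthrange(year, month)[1] for month in range(1, 13)
)

print(month_lengths)
print(sum(month_lengths))  # total number of daily time steps
```

The tuple should line up with the 'time' entry of ds.chunksizes, and its sum with the 365 time steps in the example.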

cc @d-v-b, @jhamman, @dcherian

@github-actions github-actions bot added topic-backends topic-zarr Related to zarr storage library io labels Oct 27, 2025
# while dask chunks can be variable sized
# https://dask.pydata.org/en/latest/array-design.html#chunks
if var_chunks and not enc_chunks:
    if zarr_format == 3:

@keewis keewis Oct 27, 2025

this check is probably not sufficient
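
One thing a fuller check might look at, purely as an assumption and not what the PR does, is whether the dask chunks are actually irregular. A regular zarr chunk grid can represent uniform chunks with a smaller final chunk, so only other layouts would need the variable-sized grid. A rough sketch, with hypothetical helper names `is_regular` and `needs_variable_grid`:

```python
def is_regular(dim_chunks):
    """Return True if the chunks along one dimension fit a regular grid.

    Regular here means: every chunk except the last has the same size,
    and the last chunk is no larger than the others.
    """
    if len(dim_chunks) <= 1:
        return True
    first, *middle, last = dim_chunks
    return all(size == first for size in middle) and last <= first


def needs_variable_grid(var_chunks):
    """Hypothetical helper: True if any dimension has irregular chunks."""
    return any(not is_regular(dim_chunks) for dim_chunks in var_chunks)


# Monthly chunks along "time" are irregular; (10, 10) along "x" is regular.
monthly = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)
print(needs_variable_grid((monthly, (10, 10))))
print(needs_variable_grid(((100, 100, 100, 65), (10, 10))))
```

Whether the zarr format version and the store's capabilities also need to be taken into account is left open here.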

@keewis keewis marked this pull request as draft October 27, 2025 16:34

jhamman commented Oct 27, 2025

> We need zarr-python>=3, which doesn't work with @jhamman's fork because it doesn't have tags for versions above 3.0.0b2

I just pushed tags to my fork!

keewis commented Oct 27, 2025

thanks, I've changed the example back to using your fork
