
Opening Icechunk using Zarr 3.0 in a flask context #3463

@adanb13

I am trying to open a local Icechunk repository (a .icechunk directory) with the following code:

import logging
from pathlib import Path

import icechunk
import xarray as xr
import zarr

logger = logging.getLogger(__name__)


def open_icechunk_dataset(file_path: Path, **kwargs) -> xr.Dataset:
    """
    Load an Icechunk dataset using the Icechunk Python API,
    which returns a Zarr v3-compatible store for xarray.

    Args:
        file_path: Path to .icechunk directory
        **kwargs: Additional kwargs for xr.open_zarr (e.g. chunks={});
            zarr_format and consolidated are already set below
    Returns:
        xarray.Dataset
    """
    storage = icechunk.local_filesystem_storage(str(file_path))
    # Icechunk API (as per https://icechunk.io/en/latest/overview/#icechunk-overview)
    repo = icechunk.Repository.open(storage)
    print("repo:", dir(repo))
    session = repo.readonly_session("main")
    print("session: ", dir(session))
    ds = xr.open_zarr(
        session.store,
        zarr_format=3,
        consolidated=False,
        **kwargs,
    )
    print(f"DS: {ds}")
    return ds

When this runs within a Flask context, I get the following traceback:

  File "/apps/msc-ip-api-nightly/venv/lib/python3.11/site-packages/pygeoapi/icechunk/icechunk_engine.py", line 33, in open_icechunk_dataset
    ds = xr.open_zarr(
         ^^^^^^^^^^^^^
  File "/apps/msc-ip-api-nightly/venv/lib/python3.11/site-packages/xarray/backends/zarr.py", line 1436, in open_zarr
    ds = open_dataset(
         ^^^^^^^^^^^^^
  File "/apps/msc-ip-api-nightly/venv/lib/python3.11/site-packages/xarray/backends/api.py", line 670, in open_dataset
    backend_ds = backend.open_dataset(
                 ^^^^^^^^^^^^^^^^^^^^^
  File "/apps/msc-ip-api-nightly/venv/lib/python3.11/site-packages/xarray/backends/zarr.py", line 1508, in open_dataset
    store = ZarrStore.open_group(
            ^^^^^^^^^^^^^^^^^^^^^
  File "/apps/msc-ip-api-nightly/venv/lib/python3.11/site-packages/xarray/backends/zarr.py", line 693, in open_group
    ) = _get_open_params(
        ^^^^^^^^^^^^^^^^^
  File "/apps/msc-ip-api-nightly/venv/lib/python3.11/site-packages/xarray/backends/zarr.py", line 1735, in _get_open_params
    zarr_group = zarr.open_group(store, **open_kwargs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/apps/msc-ip-api-nightly/venv/lib/python3.11/site-packages/zarr/api/synchronous.py", line 531, in open_group
    sync(
  File "/apps/msc-ip-api-nightly/venv/lib/python3.11/site-packages/zarr/core/sync.py", line 150, in sync
    raise SyncError("Calling sync() from within a running loop")
zarr.core.sync.SyncError: Calling sync() from within a running loop

I can confirm that the function being called (open_icechunk_dataset) works as expected in an isolated environment. I am not sure what is going wrong in this case.
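
The error message suggests that zarr's synchronous wrapper found an asyncio event loop already running in the calling thread. A minimal diagnostic sketch, using only the standard library, could be logged right before the xr.open_zarr call to compare the isolated run with the Flask run. The helper name describe_loop_context is hypothetical and not part of zarr, xarray, or icechunk.

```python
import asyncio
import threading


def describe_loop_context() -> str:
    """Report whether an asyncio event loop is running in the current thread."""
    thread_name = threading.current_thread().name
    try:
        loop = asyncio.get_running_loop()
    except RuntimeError:
        # get_running_loop() raises RuntimeError when no loop runs in this thread.
        return f"thread={thread_name} running_loop=None"
    return f"thread={thread_name} running_loop={type(loop).__name__}"
```

If the Flask run reports a running loop where the isolated run reports none, that would point at the server or worker setup rather than at open_icechunk_dataset itself.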

The zarr requirement in use is: "zarr!=3.0.3,>=3"
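
One possible workaround, offered as a sketch and not as a confirmed fix, is to run the blocking open in a separate worker thread that has no running event loop of its own. The helper call_in_worker_thread and the executor name below are hypothetical. The sketch assumes the server uses real OS threads, so it may not help if threading is patched by a green-thread worker class.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, TypeVar

T = TypeVar("T")

# One shared worker thread; it starts without a running asyncio event loop.
_executor = ThreadPoolExecutor(max_workers=1, thread_name_prefix="blocking-open")


def call_in_worker_thread(func: Callable[..., T], *args: Any, **kwargs: Any) -> T:
    """Run a blocking callable in the worker thread and wait for its result."""
    # .result() blocks the caller until the callable returns, and re-raises
    # any exception that the callable raised.
    return _executor.submit(func, *args, **kwargs).result()
```

With the function from this report, usage would look like ds = call_in_worker_thread(open_icechunk_dataset, file_path). Note that the caller still blocks until the dataset is opened, so this only changes which thread the zarr call runs in.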
