Provide better support for the rechunking use case #101

Open
forman opened this issue Sep 13, 2024 · 0 comments
Labels
enhancement New feature or request

forman commented Sep 13, 2024

Is your feature request related to a problem? Please describe.

zappend is often used to rechunk a potentially large data cube. In this use case, the slices originate from an existing Zarr cube, whereas in other use cases the slices are individual datasets, e.g., from NetCDF files or GeoTIFFs.

For new users it is not obvious how to configure zappend for the rechunking use case.

Describe the solution you'd like

  • Describe rechunking in the user guide and in the "how do I" section.
  • Provide a helper generator class or function that splits a source dataset into slices, so that its result can be passed as the first argument to zappend.

Helper class example:

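import xarray as xr


class DatasetSlices:
    """Iterator that yields one-step time slices of a source dataset."""

    def __init__(self, ds: xr.Dataset, time_index: int = 0):
        self.ds = ds
        self.time_size = ds.time.size
        self.time_index = time_index  # index of the next slice to yield

    def __iter__(self):
        return self

    def __next__(self) -> xr.Dataset:
        if self.time_index >= self.time_size:
            raise StopIteration
        time_index = self.time_index
        # Use slice(...) rather than an integer so the time dimension is kept.
        slice_ds = self.ds.isel(time=slice(time_index, time_index + 1))
        self.time_index += 1
        return slice_ds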

Example usage:

import xarray as xr
from zappend.api import zappend

source_ds = xr.open_zarr(source_path)
zappend(DatasetSlices(source_ds), target_dir=target_path, ...)
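
The request above mentions a helper generator class or function. For comparison, a function variant might look like the following sketch. The name `iter_dataset_slices` and its `dim` and `step` parameters are hypothetical and not part of zappend; the sketch only assumes that the dataset offers `sizes` and `isel()`, as `xr.Dataset` does.

```python
from typing import TYPE_CHECKING, Iterator

if TYPE_CHECKING:  # xarray is only needed for the type hints here
    import xarray as xr


def iter_dataset_slices(
    ds: "xr.Dataset", dim: str = "time", step: int = 1
) -> Iterator["xr.Dataset"]:
    """Yield consecutive slices of `ds` along `dim` (hypothetical helper).

    Each slice holds at most `step` elements along `dim`;
    the last slice may be shorter.
    """
    if step < 1:
        raise ValueError("step must be a positive integer")
    size = ds.sizes[dim]
    for start in range(0, size, step):
        # Index with slice(...) so that `dim` is kept as a dimension.
        yield ds.isel({dim: slice(start, start + step)})
```

With this, the call from the usage example would presumably become zappend(iter_dataset_slices(source_ds), target_dir=target_path, ...). A `step` greater than 1 would pass several time steps per slice, which may or may not be desirable depending on the target chunking.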