Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch NDArrayImage dimension order to (band, y, x) #32

Merged
merged 4 commits into from
Aug 6, 2024
Merged

Conversation

aazuspan
Copy link
Contributor

Closes #29 by making NDArrayImage use shape (band, y, x), which is consistent with multi-band images loaded by rasterio or rioxarray. Dataset loading and tests were updated to work with the new dimension order.

It's worth noting that internally, _ImageChunk still handles images in (y, x, band) order because xr.apply_ufunc always moves the core dimension (band) to the last axis. This means that NDArrayImage also has to transpose arrays before they go into the ufunc, and both image types need to transpose the ufunc outputs back to (band, y, x).

To reduce some duplication introduced by that change, I refactored apply_ufunc_across_bands from the Image subclasses into the base class, with the type-specific functionality now handled by preprocessing and postprocessing functions. This means that both image types are now handled identically by xr.apply_ufunc, which further simplifies things. It does feel a little weird calling xr.apply_ufunc with Numpy arrays, but it's a documented feature and I'm sure the overhead of passing it through is negligible.

There was a decent amount of duplication between Image subclasses
in handling ufuncs and postprocessing results that might be tuples.
To simplify things, apply_ufunc_across_bands is now defined by
Image, with preprocessing and postprocessing handled by subclasses.
This switch means that Numpy arrays are now handled through
xr.apply_ufunc, which is supported by xarray even though it's
slightly odd.
@aazuspan aazuspan added the enhancement New feature or request label Jul 26, 2024
@aazuspan aazuspan requested a review from grovduck July 26, 2024 22:21
@aazuspan aazuspan self-assigned this Jul 26, 2024
Copy link
Member

@grovduck grovduck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aazuspan, this looks really good to me. Thanks for the expanded documentation about expected shapes, which should be helpful for the user. I have one minor typing question, but otherwise this looks good to me.

with rasterio.open(path) as src:
band = src.read(1)
arr = band if arr is None else np.dstack((arr, band))
if arr is None:
arr = np.empty((len(file_paths), *band.shape), dtype=band.dtype)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any possibility we're going to run into typing issues if the first image is, say, uint8 and subsequent bands are float? Previously, I think np.dstack would take care of that automatic promotion to float, but this may not?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you're absolutely right! It looks like this will down-cast subsequent band arrays to make them fit in the arr dtype, so floats would get rounded. That would've been a tough bug to track down!

I rewrote this closer to the original but using concatenate instead of dstack, and added a test to make sure that loading an int and float raster gives you a float raster back.

Copy link
Member

@grovduck grovduck Aug 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfect! And of course you went the extra step and wrote an awesome test for this as well! OK to merge ...

src/sknnr_spatial/image.py Show resolved Hide resolved
@aazuspan
Copy link
Contributor Author

aazuspan commented Aug 6, 2024

Thanks for the review @grovduck! Good catch on the dtype issue!

@aazuspan aazuspan merged commit 20cc387 into main Aug 6, 2024
5 checks passed
@aazuspan aazuspan deleted the dim-order branch August 6, 2024 17:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make dimension order consistent with rasterio
2 participants