If you stack a bunch of items into an xarray DataArray (say, 10,000 scenes covering all of CONUS), then spatially slice out just a tiny area, your `time` dimension will still contain all 10k items, even though most of those items probably don't intersect your AOI and will be all NaNs when loaded. Besides bloated Dask graphs, this isn't that big of a performance deal (we have fastpath logic for this case), but it's annoying from a UX perspective: it would be nice to know how many actual scenes you have without computing anything.
If we store the items' bounding boxes as a coordinate variable (along the `time` dim), we could then easily drop any items that don't overlap. There could be a convenience function that does the spatial indexing for you, a `stackstac.drop_non_overlapping` function, or both. The latter would look at the current bounds of a DataArray (based on the min/max of its `x` and `y` dims) and drop the items that fall outside those bounds.
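To make the overlap check concrete, here is a minimal sketch in plain Python. It is not stackstac code: the helper names `overlaps` and `overlapping_indices` are hypothetical, and it assumes each bounding box is a `(minx, miny, maxx, maxy)` tuple in the same CRS as the array's `x` and `y` coordinates.

```python
from typing import List, Sequence, Tuple

# Assumed layout: (minx, miny, maxx, maxy), same CRS as the array's x/y coords.
BBox = Tuple[float, float, float, float]


def overlaps(a: BBox, b: BBox) -> bool:
    """Return True if two boxes share any area or touch along an edge."""
    return a[0] <= b[2] and a[2] >= b[0] and a[1] <= b[3] and a[3] >= b[1]


def overlapping_indices(item_bboxes: Sequence[BBox], array_bounds: BBox) -> List[int]:
    """Hypothetical helper: positions along `time` whose bbox intersects the array."""
    return [i for i, bbox in enumerate(item_bboxes) if overlaps(bbox, array_bounds)]


# One bbox per item along the time dimension (made-up values for illustration).
item_bboxes = [
    (0.0, 0.0, 10.0, 10.0),    # covers the AOI
    (20.0, 20.0, 30.0, 30.0),  # far away from the AOI
    (4.0, 4.0, 6.0, 6.0),      # partially overlaps the AOI
]
# Bounds of the sliced array, e.g. taken from the min/max of its x and y coords.
array_bounds = (5.0, 5.0, 7.0, 7.0)

keep = overlapping_indices(item_bboxes, array_bounds)
print(keep)
```

The resulting list of integer positions could then be passed to something like `DataArray.isel(time=keep)`, so nothing needs to be computed to find out how many scenes actually intersect the AOI.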
The only annoying thing is we'll need to force NumPy to make a 1-D array of 4-tuples (`object` dtype), since xarray won't allow us to have a coordinate variable with extraneous dimensions. Or maybe a record array could work?
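A rough sketch of the two NumPy options, for comparison. The field names (`minx`, `miny`, `maxx`, `maxy`) and the sample values are made up for illustration, and whether xarray handles either array well as a coordinate variable is exactly the open question here; this only shows how to get a 1-D array out of NumPy.

```python
import numpy as np

bboxes = [(0.0, 0.0, 10.0, 10.0), (20.0, 20.0, 30.0, 30.0)]

# The naive conversion produces a 2-D float array, i.e. an extra dimension.
as_2d = np.array(bboxes)

# Option 1: preallocate a 1-D object array and fill it element by element,
# so NumPy stores each tuple as a single object instead of unpacking it.
as_objects = np.empty(len(bboxes), dtype=object)
for i, bbox in enumerate(bboxes):
    as_objects[i] = bbox

# Option 2: a structured (record-style) dtype with one named field per bound.
# A list of tuples converts directly into a 1-D structured array.
bbox_dtype = np.dtype(
    [("minx", "f8"), ("miny", "f8"), ("maxx", "f8"), ("maxy", "f8")]
)
as_records = np.array(bboxes, dtype=bbox_dtype)

print(as_2d.shape, as_objects.shape, as_records.shape)
```

The structured array keeps the bounds numeric and allows vectorized access by field (for example `as_records["maxx"]`), whereas the object array needs a Python loop for any comparison.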