[Bug]: cannot chunk a DataArray that originated as a coordinate #6204
Comments
I've run into this before. The underlying variable object is the one defined at `xarray/xarray/core/variable.py`, lines 2707 to 2709, in 95bb9ae.
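The referenced lines presumably correspond to `IndexVariable.chunk`, which simply returns an un-chunked copy. A minimal sketch demonstrating that behaviour (an illustration, not the quoted snippet itself):

```python
import numpy as np
import xarray as xr

idx = xr.IndexVariable("x", np.arange(5))

# Chunking the plain (base) Variable yields a dask-backed array...
print(type(idx.to_base_variable().chunk({"x": 2}).data))  # dask.array.core.Array

# ...but IndexVariable.chunk is a no-op, so the data stays NumPy-backed.
print(type(idx.chunk({"x": 2}).data))  # numpy.ndarray
```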
…#1542)

### Pull Request Checklist:
- [x] This PR addresses an already opened issue (for bug fixes / features)
  - This PR fixes #1536
- [x] Tests for the changes have been added (for bug fixes / features)
- [ ] (If applicable) Documentation has been added / updated (for bug fixes / features)
- [x] CHANGES.rst has been updated (with summary of main changes)
- [x] Link to issue (:issue:`number`) and pull request (:pull:`number`) has been added

### What kind of change does this PR introduce?
* New function `xc.core.utils._chunk_like` to chunk a list of inputs according to one chunk dictionary. It also circumvents pydata/xarray#6204 by recreating DataArrays that were obtained from dimension coordinates.
* Generalization of `uses_dask` so it can accept a list of objects.
* Usage of `_chunk_like` to ensure the inputs of `cosine_of_solar_zenith_angle` are chunked when needed, in `mean_radiant_temperature` and `potential_evapotranspiration`. The effect of this is simply that `cosine_of_solar_zenith_angle` will be performed on blocks of the same size as in the original data, even though its inputs (the dimension coordinates) did not carry that information. Before this PR, the calculation was done as a single block of the same size as the full array.

### Does this PR introduce a breaking change?
No.

### Other information:
Dask might warn something like `PerformanceWarning: Increasing number of chunks by factor of NN`, where NN should be the number of chunks along the `lat` dimension, if present. That is exactly what we want, so it's OK.
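A rough sketch of what such a helper could look like (an illustrative assumption, not the actual xclim implementation; the rebuild-and-chunk logic mirrors the description above):

```python
import xarray as xr


def _chunk_like(*inputs, chunks=None):
    """Chunk each input according to a single chunk dictionary (sketch).

    DataArrays that wrap a dimension coordinate are rebuilt as plain
    DataArrays first, because chunking an IndexVariable is silently
    ignored (pydata/xarray#6204).
    """
    if not chunks:
        return tuple(inputs)

    out = []
    for da in inputs:
        if isinstance(da.variable, xr.IndexVariable):
            # Rebuild from the raw values so the variable is no longer an index.
            da = xr.DataArray(da.values, dims=da.dims, attrs=da.attrs, name=da.name)
        # Only pass the chunk sizes relevant to this array's dimensions.
        out.append(da.chunk({d: s for d, s in chunks.items() if d in da.dims}))
    return tuple(out)
```

Passing something like `ds.chunksizes` works here because `DataArray.chunk` accepts per-dimension tuples of block sizes as well as single integers.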
I encountered a similar issue when I tried to create a new array from a set of index coordinates, e.g.:

```python
# ds.z, ds.y, ds.x are fully loaded into memory, and so is ds.coords.r.
ds.coords['r'] = np.sqrt((ds.z - z0)**2 + (ds.y - y0)**2 + (ds.x - x0)**2)
```

I'm currently circumventing the problem by using:

```python
x, y, z = _chunk_like(ds.x, ds.y, ds.z, chunks=ds.chunksizes)
ds.coords['r'] = np.sqrt((z - z0)**2 + (y - y0)**2 + (x - x0)**2)
# Now, ds.coords.r carries a dask array!
```

As @dcherian noted, the underlying cause of the issue seems to be that the coordinates are backed by `IndexVariable` objects, whose `chunk` method is a no-op.
What happened?
If I construct the following DataArray and try to chunk its "x" coordinate, I get back a NumPy-backed DataArray. If I construct a copy of the "x" coordinate, things work as I would expect.
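A minimal sketch of the behaviour described above, assuming a simple 1-D DataArray with an "x" dimension coordinate (not the reporter's original code):

```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.arange(4.0), dims="x", coords={"x": np.arange(4)})

# Chunking the "x" coordinate directly is silently ignored: the result
# stays NumPy-backed because the coordinate's IndexVariable is never chunked.
print(type(da["x"].chunk({"x": 2}).data))  # <class 'numpy.ndarray'>

# Rebuilding the coordinate as a plain DataArray first makes chunking work.
rebuilt = xr.DataArray(da["x"].values, dims="x", name="x")
print(type(rebuilt.chunk({"x": 2}).data))  # <class 'dask.array.core.Array'>
```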
What did you expect to happen?
I would expect chunking the "x" coordinate to return a dask-backed DataArray, just like chunking the copy does.
Minimal Complete Verifiable Example
No response
Relevant log output
No response
Anything else we need to know?
No response
Environment
INSTALLED VERSIONS
commit: None
python: 3.7.10 | packaged by conda-forge | (default, Feb 19 2021, 15:59:12)
[Clang 11.0.1 ]
python-bits: 64
OS: Darwin
OS-release: 21.2.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.10.5
libnetcdf: 4.6.3
xarray: 0.20.1
pandas: 1.3.5
numpy: 1.19.4
scipy: 1.5.4
netCDF4: 1.5.5
pydap: None
h5netcdf: 0.8.1
h5py: 2.10.0
Nio: None
zarr: 2.7.0
cftime: 1.2.1
nc_time_axis: 1.2.0
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.22.0
distributed: None
matplotlib: 3.2.2
cartopy: 0.19.0.post1
seaborn: None
numbagg: None
fsspec: 2021.06.0
cupy: None
pint: 0.15
sparse: None
setuptools: 49.6.0.post20210108
pip: 20.2.4
conda: 4.10.1
pytest: 6.0.1
IPython: 7.27.0
sphinx: 3.2.1