Inconsistent behavior in groupby depending on the dimension order #5361
That's surprising indeed. I confirmed the bug was present in 0.17.0. It also seems unrelated to uniqueness in the non-grouped-by dimension:
In [7]: data['y'].values = [0,1,2,3]
In [8]: data['y'].values
Out[8]: array([0, 1, 2, 3])
In [9]: data
Out[9]:
<xarray.DataArray (x: 4, z: 2)>
array([[ 0.13156972, 0.13986012],
[ 1.61815504, 0.11421297],
[ 0.15819393, -0.5183183 ],
[ 0.30672251, 0.34373302]])
Coordinates:
* x (x) <U1 'a' 'b' 'a' 'c'
y (x) int64 0 1 2 3
Dimensions without coordinates: z
In [10]: data.T.groupby('x').mean()
Out[10]:
<xarray.DataArray (z: 2, x: 3)>
array([[ 0.14488182, 1.61815504, 0.30672251],
[-0.18922909, 0.11421297, 0.34373302]])
Coordinates:
* x (x) object 'a' 'b' 'c'
y (x) int64 0 1 2 3 # <-- the size must be 3!!
Dimensions without coordinates: z
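For reference, the grouped means in the session above can be cross-checked without xarray. The following is a minimal NumPy sketch, not xarray's implementation: `grouped_mean` is a hypothetical helper introduced here, and the values and labels are copied from the output above. It illustrates the expected invariant: grouping along `x` should give the same numbers whether `x` is the first or the second axis, and there are only three groups, so any coordinate along `x` in the result can have at most three entries.

```python
import numpy as np

# Values and labels copied from the IPython session above.
values = np.array([
    [0.13156972, 0.13986012],
    [1.61815504, 0.11421297],
    [0.15819393, -0.5183183],
    [0.30672251, 0.34373302],
])
labels = np.array(["a", "b", "a", "c"])


def grouped_mean(arr, labels, axis):
    """Hypothetical helper: mean of `arr` over equal `labels` along `axis`."""
    groups = np.unique(labels)
    # Select each group's slices, average them along the grouped axis,
    # then stack the per-group means back along that same axis.
    means = [arr.compress(labels == g, axis=axis).mean(axis=axis) for g in groups]
    return groups, np.stack(means, axis=axis)


groups, by_first_axis = grouped_mean(values, labels, axis=0)   # layout (x, z)
_, by_second_axis = grouped_mean(values.T, labels, axis=1)     # layout (z, x)

# The two layouts should agree up to a transpose.
print(np.allclose(by_first_axis, by_second_axis.T))
print(groups.tolist(), by_second_axis.shape)
```

The `(z, x)` result from this sketch can be compared directly against the array printed in `Out[10]`.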
Update: this is still an issue, though it now raises an error rather than returning a corrupt object.
Quite surprising...
Works on main and the latest release for me:
import xarray as xr
import numpy as np
data = xr.DataArray(
np.random.randn(4, 2),
dims=['x', 'z'],
coords={'x': ['a', 'b', 'a', 'c'], 'y': ('x', [0, 1, 0, 2])}
)
data.T.groupby('x').mean()  # drops y
Ah, with flox installed it works ("Parents: tell your kids to use flox!")
lol, my bad.
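Since the outcome apparently depends on whether flox is installed, it helps to state that when reporting results. A small standard-library sketch for checking the current environment (the variable name `flox_available` is just illustrative):

```python
import importlib.util

# True when the optional `flox` package can be imported in this environment.
flox_available = importlib.util.find_spec("flox") is not None
print(f"flox installed: {flox_available}")
```

If flox is present, my understanding is that xarray's `use_flox` option (on by default) can be switched off with `xr.set_options(use_flox=False)` to exercise the non-flox code path; treat that as an assumption to verify against the installed xarray version.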
dcherian added a commit to dcherian/xarray that referenced this issue on Sep 18, 2024
dcherian added a commit that referenced this issue on Sep 19, 2024
* Make _replace more lenient. Closes #5361
* review comments
hollymandel pushed a commit to hollymandel/xarray that referenced this issue on Sep 23, 2024
* Make _replace more lenient. Closes pydata#5361
* review comments
groupby works inconsistently depending on the dimension order of a DataArray. Furthermore, in some cases, this causes a corrupted object.
On the first dimension, groupby works fine (although this drops the non-dimensional coordinate y, related to #3745).
However, groupby does not give a correct result if we work on the second dimension.
The bug has been discussed in #2944 and solved, but I found it is still present.
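The description can be turned into a self-checking script. This is a sketch rather than the reporter's original snippet: the array is built the same way as in the comment thread, `coords_match_dims` is a hypothetical helper, and the seed is arbitrary. It groups over `x` in both dimension orders and reports whether the transposed case agrees, disagrees, or raises, since the thread indicates the outcome varies by xarray version and by whether flox is installed.

```python
import numpy as np
import xarray as xr

rng = np.random.default_rng(0)  # arbitrary seed, only for repeatability
data = xr.DataArray(
    rng.standard_normal((4, 2)),
    dims=["x", "z"],
    coords={"x": ["a", "b", "a", "c"], "y": ("x", [0, 1, 0, 2])},
)


def coords_match_dims(obj):
    """Hypothetical check: each coordinate's shape agrees with obj.sizes."""
    return all(
        obj.sizes[dim] == size
        for coord in obj.coords.values()
        for dim, size in zip(coord.dims, coord.shape)
    )


# Grouping with x as the first dimension: the case reported as working.
first = data.groupby("x").mean().transpose("x", "z")

# Grouping with x as the second dimension: the case reported as broken.
try:
    second = data.T.groupby("x").mean().transpose("x", "z")
    agrees = coords_match_dims(second) and np.allclose(first.values, second.values)
    outcome = "consistent" if agrees else "inconsistent"
except Exception as err:  # some versions raise instead of returning a result
    second, outcome = None, f"raised {type(err).__name__}"

print(outcome)
```

A result of `inconsistent` or `raised ...` would indicate the behavior described in this issue.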
Output of xr.show_versions()
INSTALLED VERSIONS
commit: 09d8a4a
python: 3.7.7 (default, Mar 23 2020, 22:36:06)
[GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 5.4.0-72-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.10.4
libnetcdf: 4.6.1
xarray: 0.16.1.dev30+g1d3dee08.d20200808
pandas: 1.1.3
numpy: 1.18.1
scipy: 1.5.2
netCDF4: 1.4.2
pydap: None
h5netcdf: 0.8.0
h5py: 2.10.0
Nio: None
zarr: None
cftime: 1.2.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.6.0
distributed: 2.7.0
matplotlib: 3.2.2
cartopy: None
seaborn: 0.10.1
numbagg: None
pint: None
setuptools: 46.1.1.post20200323
pip: 20.0.2
conda: None
pytest: 5.2.1
IPython: 7.13.0
sphinx: None