Description
MCVE Code Sample
import xarray as xr
import numpy as np
sample_idx = xr.IndexVariable("sample_id", ["a", "b", "c"])
da = xr.DataArray(np.eye(3), coords=(sample_idx, sample_idx))
da.shape
# (3, 3)
da[1, :].shape
# (3, 3)
da.loc["a", :].shape
# (3, 3)
da.loc[:, "a"].shape
# ()
da[:, 0].shape
# ()
da[:, 1]
# <xarray.DataArray ()>
# array(1.)
# Coordinates:
# sample_id <U1 'b'
Expected Output
I had expected:
da.shape
# (3, 3)
da[1, :].shape
# (3,)
da.loc["a", :].shape
# (3,)
da.loc[:, "a"].shape
# (3,)
da[:, 1]
# <xarray.DataArray (sample_id: 3)>
# array([0., 1., 0.])
# Coordinates:
# sample_id <U1 'a' 'b' 'c'
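For comparison, here is a minimal sketch of the same selections on the underlying plain NumPy array, which has no named dimensions. It only shows where the expected shapes come from; it says nothing about how xarray handles this internally.

```python
import numpy as np

# The same 3x3 identity matrix, without any labelled dimensions.
arr = np.eye(3)

# Mixing an integer with a full slice drops exactly one axis.
row = arr[1, :]
col = arr[:, 1]

print(arr.shape)
print(row.shape)
print(col.shape)
print(col)
```

Each selection drops only the axis that received the integer, which is the behaviour I expected from the labelled array as well.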
Problem Description
When the same coordinate is shared between dimensions (as happens when a pairwise measurement is taken), indexing behaves strangely. Indexing along the leading dimensions appears to do nothing, while indexing along the last dimension applies the selection across all dimensions. The same thing happens with three dimensions:
da3d = xr.DataArray(
    np.arange(27).reshape((3, 3, 3)),
    coords=(sample_idx, sample_idx, sample_idx),
)
print(da3d.loc["a"].shape)
print(da3d.loc["a", "a"].shape)
print(da3d.loc[:, :, "a"].shape)
# (3, 3, 3)
# (3, 3, 3)
# ()
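A possible workaround, not confirmed as the recommended approach, is to give each axis its own dimension name while reusing the same labels. The sketch below assumes that the dimension names `sample_id_row` and `sample_id_col` are acceptable; both names are hypothetical and chosen only for illustration.

```python
import numpy as np
import xarray as xr

labels = ["a", "b", "c"]

# Hypothetical dimension names: one per axis, both carrying the same labels.
da = xr.DataArray(
    np.eye(3),
    dims=("sample_id_row", "sample_id_col"),
    coords={"sample_id_row": labels, "sample_id_col": labels},
)

# Label-based selection now drops only the indexed dimension.
row = da.loc["a", :]
col = da.loc[:, "a"]

print(row.dims, row.shape)
print(col.dims, col.shape)
```

With distinct names, each selection keeps the other dimension, so both results are one-dimensional. The cost is that the two axes no longer share a single dimension name, which is what I was hoping to express for pairwise data.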
Output of xr.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.6 (default, Jan 4 2020, 12:18:30)
[Clang 11.0.0 (clang-1100.0.33.16)]
python-bits: 64
OS: Darwin
OS-release: 19.2.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.2
libnetcdf: 4.6.3
xarray: 0.14.1+5.gb0064b25
pandas: 0.25.3
numpy: 1.18.1
scipy: 1.4.1
netCDF4: 1.5.2
pydap: None
h5netcdf: 0.7.4
h5py: 2.9.0
Nio: None
zarr: 2.4.0
cftime: 1.0.3.4
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.9.2
distributed: 2.9.3
matplotlib: 3.0.3
cartopy: None
seaborn: 0.10.0
numbagg: None
setuptools: 44.0.0
pip: 20.0.2
conda: None
pytest: 5.3.4
IPython: 7.11.1
sphinx: 2.3.1
Update, adding link to related issue: