When assigning a pd.MultiIndex to the coords, the behavior differs between a "Dimension coordinate" and a "Non-dimension coordinate" #5214

Closed
weipeng1999 opened this issue Apr 24, 2021 · 3 comments

Comments

@weipeng1999

weipeng1999 commented Apr 24, 2021


What happened:

When I assign an instance of pd.MultiIndex to the coords, the result depends on the kind of coordinate:
a "Dimension coordinate" keeps the multi-index, so I can use the multi-index levels directly as keyword arguments (for example in .sel),
while a "Non-dimension coordinate" is converted to an np.ndarray of tuples with dtype "object", so selecting by level name fails, even after the coordinate is promoted with set_index.

What you expected to happen:

I would like a "Non-dimension coordinate" to keep the multi-index as well.

Minimal Complete Verifiable Example:

>>> import xarray as xr
>>> import numpy as np
>>> import pandas as pd

# create a DataArray named 'arr' with a single dimension 'z'
>>> arr = xr.DataArray(np.arange(6), dims='z')
# add a coordinate 'z1' along dimension 'z';
# at this point 'z1' is a "Non-dimension coordinate"
>>> arr.coords['z1'] = 'z', pd.MultiIndex.from_product(
...     ([1, 2, 3], ['a', 'b']), names=['i', 'n'])
# make 'z1' the coordinate of dimension 'z' (set_index returns a new object)
>>> arr = arr.set_index(z='z1')
>>> arr.sel(n='a')  # fails: level 'n' cannot be used on 'z'
ValueError: dimensions or multi-index levels ['n'] do not exist
# inspect what the coordinate 'z' looks like
>>> arr.z
<xarray.DataArray 'z' (z: 6)>
array([(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b'), (3, 'a'), (3, 'b')],
      dtype=object)
Coordinates:
  * z        (z) object (1, 'a') (1, 'b') (2, 'a') (2, 'b') (3, 'a') (3, 'b')
# why is 'z' not a MultiIndex?

# now assign the MultiIndex directly to 'z',
# so that it is a "Dimension coordinate"
>>> arr.coords['z'] = pd.MultiIndex.from_product(
...     ([1, 2, 3], ['a', 'b']), names=['i', 'n'])
>>> arr.sel(n='a')  # OK
<xarray.DataArray (i: 3)>
array([0, 2, 4])
Coordinates:
  * i        (i) int64 1 2 3
# inspect what the coordinate 'z' looks like
>>> arr.z
<xarray.DataArray 'z' (z: 6)>
array([(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b'), (3, 'a'), (3, 'b')],
      dtype=object)
Coordinates:
  * z        (z) MultiIndex
  - i        (z) int64 1 1 2 2 3 3
  - n        (z) object 'a' 'b' 'a' 'b' 'a' 'b'
# 'z' was successfully set to a MultiIndex

Anything else we need to know?:

Environment:

Output of xr.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.8.2 (default, Mar 25 2020, 17:03:02)
[GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 5.10.27-gentoo-dist
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: C.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: None

xarray: 0.17.0
pandas: 1.2.4
numpy: 1.20.2
scipy: 1.6.2
netCDF4: None
pydap: None
h5netcdf: None
h5py: 2.10.0
Nio: None
zarr: None
cftime: 1.4.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: 0.9.8.5
iris: None
bottleneck: None
dask: 2021.04.0
distributed: 2021.04.0
matplotlib: 3.4.1
cartopy: 0.18.0
seaborn: 0.11.1
numbagg: None
pint: None
setuptools: 49.6.0.post20210108
pip: 21.0.1
conda: 4.10.1
pytest: None
IPython: 7.22.0
sphinx: None

@keewis
Collaborator

keewis commented Apr 24, 2021

Right now, the main difference between dimension and non-dimension coordinates is exactly that: dimension coordinates have an index, while non-dimension coordinates don't. This means that an index assigned to a non-dimension coordinate is first converted to a NumPy array, which is why the multi-index levels are lost.

@benbovy is currently working on changing the data model, which will probably allow the behavior you're expecting. See #1603 and #4979.
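
The conversion described above can be reproduced with pandas and NumPy alone. The following is a minimal sketch, not xarray's actual code path: it assumes that the conversion is equivalent to calling `np.asarray` on the index, and the variable names `as_numpy` and `rebuilt` are only illustrative. It shows that the result is a flat object array of tuples without level names, and that the levels can be rebuilt by hand if the names are known.

```python
import numpy as np
import pandas as pd

# Same MultiIndex as in the report: levels 'i' and 'n'.
midx = pd.MultiIndex.from_product(([1, 2, 3], ["a", "b"]), names=["i", "n"])

# Converting to NumPy gives a 1-D object array whose elements are tuples.
# The array carries no level names, so 'i' and 'n' are no longer addressable.
as_numpy = np.asarray(midx)
print(as_numpy.dtype, as_numpy.shape)

# If the level names are known, the MultiIndex can be rebuilt from the tuples.
rebuilt = pd.MultiIndex.from_tuples(as_numpy.tolist(), names=["i", "n"])
print(rebuilt.get_level_values("n").tolist())
```

A rebuilt index like this could then be assigned directly to the dimension coordinate, as in the second half of the report's example, though whether that is acceptable depends on the xarray version in use.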

@benbovy
Member

benbovy commented Sep 15, 2021

This is a duplicate of #4791, which I just commented on: #4791 (comment).

@benbovy
Member

benbovy commented Sep 15, 2021

Let's do updates in #4791 (closing this one).

@benbovy benbovy closed this as completed Sep 15, 2021