Assigning non-dimension coordinates does not conserve MultiIndex type #4791

Open
carlos-rpg opened this issue Jan 11, 2021 · 3 comments

@carlos-rpg

What happened:
Assigning a pandas MultiIndex as a new non-dimension coordinate of a Dataset/DataArray returns an object in which the new coordinate has dtype object instead of remaining a MultiIndex.

What you expected to happen:
Adding a MultiIndex as a dimension coordinate to a Dataset keeps the type. The same should hold for non-dimension coordinates.

Minimal Complete Verifiable Example:

import numpy as np
import pandas as pd
import xarray as xr

ds = xr.Dataset(
    data_vars={'foo': (['x', 'y'], np.random.rand(2, 2))},
    coords={'x': [1, 2], 'y': [1, 2]}
)
idx = pd.MultiIndex.from_arrays([['a', 'b'], [5, 6]])

ds.assign_coords({'z': idx}) # Conserves MultiIndex type
ds.assign_coords({'z': ('x', idx)}) # Does not conserve MultiIndex type

Anything else we need to know?:

Environment:

Output of xr.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.7.9 (default, Aug 31 2020, 17:10:11) [MSC v.1916 64 bit (AMD64)]
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 142 Stepping 12, GenuineIntel
byteorder: little
LC_ALL: None
LANG: en
LOCALE: None.None
libhdf5: 1.10.4
libnetcdf: 4.7.3

xarray: 0.16.1
pandas: 1.1.3
numpy: 1.19.1
scipy: 1.5.2
netCDF4: 1.5.3
pydap: installed
h5netcdf: None
h5py: None
Nio: None
zarr: 2.6.1
cftime: 1.3.0
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.30.0
distributed: 2.30.1
matplotlib: 3.3.3
cartopy: 0.18.0
seaborn: 0.11.0
numbagg: None
pint: None
setuptools: 50.3.0.post20201006
pip: 20.2.4
conda: 4.9.2
pytest: None
IPython: 7.18.1
sphinx: 3.2.1

@keewis
Collaborator

keewis commented Jan 11, 2021

Unfortunately, that's expected behavior right now. We might change that with the index refactor (see #1603), but until then it is not possible to have an index on non-dimension coordinates (that's one of the differences between dimension coordinates and non-dimension coordinates).
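A minimal sketch of the difference described above, using a plain list rather than a MultiIndex so the result does not depend on how a given xarray version converts the index. The coordinate name `z` and its values are illustrative only, and the check assumes default behavior, with no index set explicitly on `z`.

```python
import numpy as np
import xarray as xr

ds = xr.Dataset(
    data_vars={"foo": (["x", "y"], np.random.rand(2, 2))},
    coords={"x": [1, 2], "y": [1, 2]},
)

# "z" lives along dimension "x" but is not named after a dimension,
# so it is a non-dimension coordinate.
ds_z = ds.assign_coords({"z": ("x", ["a", "b"])})

# Only the dimension coordinates are backed by an index by default.
indexed_names = sorted(ds_z.indexes)
z_has_index = "z" in ds_z.indexes

print(indexed_names)
print(z_has_index)
```

Label-based selection with `.sel` relies on these indexes, which is why it is available for `x` and `y` here but not for `z`.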

@benbovy
Member

benbovy commented Sep 15, 2021

Note that during the index refactor we will probably deprecate the implicit creation of multi-index level coordinates when assigning a pandas MultiIndex as a new coordinate, so eventually both ds.assign_coords({'z': idx}) and ds.assign_coords({'z': ('x', idx)}) will treat idx as a non-indexed 1-dimensional array with tuple (index levels) values.

At the same time we should provide alternative ways of reusing existing pandas multi-indexes more explicitly in xarray.
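To illustrate what "a 1-dimensional array with tuple values" looks like, here is a small pandas-only sketch. It shows only the array that pandas exposes for the MultiIndex from the original report, not what any particular xarray version does with it.

```python
import pandas as pd

idx = pd.MultiIndex.from_arrays([["a", "b"], [5, 6]])

# A MultiIndex viewed as a plain array: one tuple per row,
# each tuple holding that row's level values.
values = idx.values

print(values.dtype)
print(values.shape)
print(values[0])
```

Each element combines the level values for one position, so the per-level structure is no longer available for indexing once the MultiIndex is reduced to this form.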

@benbovy
Member

benbovy commented Aug 23, 2023

With the latest release, v2023.8.0, it is now possible to assign coordinates created from a pd.MultiIndex like this (the old way, via a plain dict, is going to be deprecated):

ds = xr.Dataset(
    data_vars={'foo': (['x', 'y'], np.random.rand(2, 2))},
    coords={'x': [1, 2], 'y': [1, 2]}
)

idx = pd.MultiIndex.from_arrays([['a', 'b'], [5, 6]])
idx_coords = xr.Coordinates.from_pandas_multiindex(idx, "x")

ds.assign_coords(idx_coords)

This will overwrite the pre-existing ds.x coordinate, though (as expected once #8094 is merged). Until we drop the dimension coordinate created for any PandasMultiIndex, it will not be possible to have both a PandasMultiIndex and another dimension coordinate along the same dimension.

The following is possible today, though (PandasMultiIndex + indexed non-dimension coordinate):

ds.rename_vars(x="x_alt").assign_coords(idx_coords)

ds.assign_coords({'z': ('x', idx)}) doesn't make much sense as "z" does not correspond to either the dimension name or a level of the multi-index.
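As a rough check of the explicit approach above, the following sketch inspects the result of assigning the coordinates. It assumes xarray v2023.8.0 or later, starts from a dataset without a pre-existing `x` coordinate to sidestep the overwrite caveat, and uses the level names `letter` and `number`, which are made up for this example and do not appear in the original report.

```python
import numpy as np
import pandas as pd
import xarray as xr

# No "x" coordinate up front, so nothing gets overwritten.
ds = xr.Dataset(
    data_vars={"foo": (["x", "y"], np.random.rand(2, 2))},
    coords={"y": [1, 2]},
)

# Hypothetical level names, chosen only for illustration.
idx = pd.MultiIndex.from_arrays(
    [["a", "b"], [5, 6]], names=["letter", "number"]
)
idx_coords = xr.Coordinates.from_pandas_multiindex(idx, "x")

ds_multi = ds.assign_coords(idx_coords)

# Each level becomes a coordinate backed by the multi-index.
level_index = ds_multi.indexes["letter"]

print(sorted(ds_multi.coords))
print(type(level_index).__name__)
```

Here the coordinate names come from the dimension name passed to `from_pandas_multiindex` and from the level names of the index, which is why a name like "z" has no place in the result.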
