changes made to coords using groupby and apply do not persist #1428

Closed
d-chambers opened this issue May 26, 2017 · 2 comments

Comments

@d-chambers

I am running Ubuntu 16 with xarray 0.9.1 on Python 3.6.0.

I have found that any changes made to coordinates in a function that is called by a groupby object's apply method do not persist. The following code illustrates the problem:

import numpy as np
import xarray as xr


def change_new_coord(dar):
    """ change the new_coord coord from 1 to 0 """
    dar.coords['new_coord'] = 0
    return dar


# setup data array
data = np.ones((10, 10, 1000))
time = np.linspace(0, 10, 1000)
coords = {'time': time, 'd2': range(10), 'd3': range(10)}
dims = ['d2', 'd3', 'time']
dar = xr.DataArray(data, coords=coords, dims=dims)

# attach coordinate based on d2 and d3
dar.coords['new_coord'] = (('d2', 'd3'), np.ones((10, 10)))

# stack
stacked = dar.stack(z=('d2', 'd3'))

# groupby
gr = stacked.groupby('z')

# apply
out = gr.apply(change_new_coord).unstack('z')

# raises AssertionError: every value in new_coord should be 0, but they are all still 1
assert np.all(out.coords['new_coord'] == 0)
@keewis
Collaborator

keewis commented Mar 29, 2020

This is a special case for a coordinate along the groupby dimension. Within each group, the groupby dimension coordinate is a scalar, so it is not treated as a dimension. new_coord is therefore 0D inside the applied function and is not concatenated when the groups are combined (the result has a correct z, but a 0D new_coord). To compensate, the code adds back the dimension that is missing from the groups (z) together with all coords along that dimension, which overwrites the 0D new_coord with the original 1D new_coord.
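
A rough way to see the 0D coordinate described above is to pick a single position along z by hand. This is only a sketch: it uses isel(z=0) as a stand-in for one group (an assumption; how groupby slices its groups may differ between xarray versions) and smaller array shapes than the original report.

```python
import numpy as np
import xarray as xr

# Smaller shapes than in the report, chosen only to keep the sketch quick.
data = np.ones((3, 4, 5))
coords = {'time': np.linspace(0, 10, 5), 'd2': range(3), 'd3': range(4)}
dar = xr.DataArray(data, coords=coords, dims=['d2', 'd3', 'time'])
dar.coords['new_coord'] = (('d2', 'd3'), np.ones((3, 4)))

stacked = dar.stack(z=('d2', 'd3'))

# Assumption: integer selection along z stands in for a single group.
one_group = stacked.isel(z=0)

stacked_dims = stacked.coords['new_coord'].dims
group_dims = one_group.coords['new_coord'].dims
print(stacked_dims)  # new_coord is 1D along z before selecting
print(group_dims)    # new_coord has no dimensions left after selecting
```

With no z dimension left on new_coord, there is nothing to concatenate along when the pieces are put back together.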

If we wanted to do something about that, I guess we'd need to modify the concatenation code so that it knows about the groupby dimension and its coords?

As a workaround, we can modify the original coord in place (note the [...]):

def change_new_coord(dar):
    """ change the new_coord coord from 1 to 0 """
    dar.coords['new_coord'][...] = 0
    return dar
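
The difference the [...] makes can be sketched with plain NumPy. This is an analogy rather than xarray's internals: it assumes the coordinate seen inside the function shares memory with the original, which is what "modify the original coord" suggests. The names original, element and rebound are made up for illustration.

```python
import numpy as np

original = np.ones(3)

# A trailing Ellipsis makes NumPy return a 0-d view instead of a scalar copy.
element = original[1, ...]

# Rebinding: the name now refers to the int 0; `original` is untouched.
rebound = element
rebound = 0

# In-place write: the value is stored through the view into `original`.
element[...] = 0

print(original)
```

Only the in-place write reaches the original array, so only the second entry changes.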

@max-sixty
Collaborator

The repro now passes, so closing.
