DataArray creation prone to errors when data shares dimension shapes #727

Closed
@choldgraf

Description


It seems like there would be unexpected behavior whenever someone creates a DataArray from data in which two or more dimensions have the same length, while supplying coords as a dictionary of coord_name: coords pairs. Since the dictionary isn't ordered, won't it be unclear which dimension each entry refers to?

Maybe that's not such a good description, so here's an example:

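import numpy as np
import xarray

# One axis of length 20 and two axes that share length 5
data = np.random.randn(20, 5, 5)

# Coordinates keyed by '1', '2', '3'
da = xarray.DataArray(data, {'1': np.arange(20),
                             '2': np.arange(15, 20),
                             '3': np.arange(5, 10)})
print(da.coords)

# Same coordinate values, keyed by '10', '11', '12' instead
da = xarray.DataArray(data, {'10': np.arange(20),
                             '11': np.arange(15, 20),
                             '12': np.arange(5, 10)})
print(da.coords)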

For the first dimension it's no problem, since it's the only one with length 20. The 2nd and 3rd dimensions have the same length, so their coordinates are assigned in whatever order Python iterates over the keys, which means they could sometimes be flipped depending on something as arbitrary as what you named a dimension.
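One way to sidestep the ambiguity is to name the dimensions explicitly with the `dims` argument, so the axis order no longer depends on how the dictionary keys are iterated. This is only a sketch of a possible workaround, not necessarily the fix the library should adopt:

```python
import numpy as np
import xarray

data = np.random.randn(20, 5, 5)

# dims pins each name to an axis position: '1' -> axis 0, '2' -> axis 1,
# '3' -> axis 2. The coords dictionary is then matched by name, not by order.
da = xarray.DataArray(
    data,
    coords={'1': np.arange(20),
            '2': np.arange(15, 20),
            '3': np.arange(5, 10)},
    dims=['1', '2', '3'],
)
print(da.dims)
print(da.coords)
```

With `dims` given, the two length-5 coordinates can no longer swap places, whatever the dimensions are called.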

It seems like this could be addressed with a quick check (when coordinates are supplied in this fashion) for whether any of the coordinates have the same length, followed by a warning that behavior might be unstable. Or maybe it should raise an error that says "if data coordinates are same length, please supply coordinate values in a tuple or list instead"?
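The check suggested above could look roughly like the following. The helper names `ambiguous_coord_names` and `warn_if_ambiguous` are hypothetical and not part of xarray; this is a standalone sketch that assumes coords is a plain dictionary mapping names to one-dimensional sequences:

```python
import warnings
from collections import defaultdict


def ambiguous_coord_names(coords):
    """Return groups of coordinate names whose values share a length.

    Hypothetical helper for illustration only; not an xarray function.
    """
    by_length = defaultdict(list)
    for name, values in coords.items():
        by_length[len(values)].append(name)
    # Only lengths used by more than one coordinate are ambiguous.
    return [sorted(names) for names in by_length.values() if len(names) > 1]


def warn_if_ambiguous(coords):
    """Warn when dictionary coords cannot be matched to axes by length alone."""
    groups = ambiguous_coord_names(coords)
    if groups:
        warnings.warn(
            "coordinates %s have the same length; supply coordinate values "
            "in a tuple or list instead" % groups
        )
    return groups


coords = {'1': range(20), '2': range(15, 20), '3': range(5, 10)}
print(ambiguous_coord_names(coords))
```

Swapping `warnings.warn` for `raise ValueError(...)` would give the stricter variant that refuses ambiguous input outright.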
