Removing dimensions from Dataset objects #1949
Comments
I think SO is the best place for user Qs, so the answers can be searchable for future generations. To respond immediately though, have you tried `squeeze`?
In [1]: import xarray as xr
In [2]: test_dataset = xr.Dataset(dict(
...: empty_array=xr.DataArray([], dims='a'),
...: populated_array=xr.DataArray([1], {'b':['1']}, 'b')
...: ))
In [3]: test_dataset
Out[3]:
<xarray.Dataset>
Dimensions: (a: 0, b: 1)
Coordinates:
* b (b) <U1 '1'
Dimensions without coordinates: a
Data variables:
empty_array (a) float64
populated_array (b) int64 1
In [4]: test_dataset.squeeze()
Out[4]:
<xarray.Dataset>
Dimensions: (a: 0)
Coordinates:
b <U1 '1'
Dimensions without coordinates: a
Data variables:
empty_array (a) float64
populated_array int64 1 |
I don't think it's actually possible to purge |
Hmmm, this is harder than I originally expected. I imagine someone will comment with an easy solution, otherwise I'll have another look |
If you're OK creating a new Dataset, it works to remove any variables using a dimension, e.g.,
You're right that this doesn't work to remove dimensions from existing datasets (e.g., with
This used to be possible in the xarray data model prior to v0.9.0. When we made coordinates optional, I updated I'd like to suggest two possible fixes:
|
The drop technique seems reasonable, if a bit long-winded for the programmatic case (loop over all dimensions, find any that are empty -> loop over all variables, drop any that contain those empty dimensions). As an addition, if the empty dimension also has an associated empty coordinate then it requires an extra step to get rid of it:
In [21]: test_dataset = xr.Dataset(dict(
...: empty_array=xr.DataArray([], dims='a', coords={'a':[]}),
...: populated_array=xr.DataArray([1], {'b':['1']}, 'b')
...: ))
In [22]: test_dataset
Out[22]:
<xarray.Dataset>
Dimensions: (a: 0, b: 1)
Coordinates:
* a (a) float64
* b (b) <U1 '1'
Data variables:
empty_array (a) float64
populated_array (b) int32 1
In [23]: test_dataset.drop('empty_array')
Out[23]:
<xarray.Dataset>
Dimensions: (a: 0, b: 1)
Coordinates:
* a (a) float64
* b (b) <U1 '1'
Data variables:
populated_array (b) int32 1
In [24]: del test_dataset['a']
In [25]: test_dataset.drop('empty_array')
Out[25]:
<xarray.Dataset>
Dimensions: (b: 1)
Coordinates:
* b (b) <U1 '1'
Data variables:
populated_array (b) int32 1
Fixes seem reasonable, based on how we use xarray over at https://github.com/calliope-project/calliope/. The second one also provides more scope to remove subsets of data (all corresponding dims, coords, vars) if the dimension becomes superfluous for any reason, whether or not the dimension is empty. |
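The loop described in this comment (find the empty dimensions, then drop every variable that uses one of them) could be sketched roughly as below. This is only an illustration: `drop_empty_dims` is a hypothetical helper, not part of xarray, and it assumes a release that provides `Dataset.drop_vars` (the thread itself uses the older `drop` spelling). Coordinates such as `a` are handled in the same pass because they appear in `ds.variables` alongside the data variables.

```python
import xarray as xr


def drop_empty_dims(ds):
    """Return a new Dataset without zero-length dimensions.

    Hypothetical helper for illustration; not an xarray API.
    """
    # Dimensions whose length is zero.
    empty_dims = {dim for dim, size in ds.sizes.items() if size == 0}
    # Every variable (data variable or coordinate) that uses one of them.
    to_drop = [
        name
        for name, var in ds.variables.items()
        if empty_dims.intersection(var.dims)
    ]
    # drop_vars returns a new Dataset; the original is left untouched.
    return ds.drop_vars(to_drop)


# Same dataset as in the session above: 'a' is empty and has a coordinate.
test_dataset = xr.Dataset(dict(
    empty_array=xr.DataArray([], dims='a', coords={'a': []}),
    populated_array=xr.DataArray([1], {'b': ['1']}, 'b'),
))
cleaned = drop_empty_dims(test_dataset)
print(cleaned)
```

Because the helper builds a new Dataset rather than modifying the existing one, it follows the "create new objects" style the maintainers recommend elsewhere in this thread.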
Yes, this was a useful feature that we lost. Note that in general we try to encourage using methods to create new Datasets rather than modifying existing ones inplace. So it might also make sense to add a |
So does this mean that the following line in the docs is now false: "If a dimension name is given as an argument to drop, it also drops all variables that use that dimension" This is at http://xarray.pydata.org/en/stable/data-structures.html#dataarray. It does not seem to work as advertised. Before drop:
Then I call
If I then try dropping Frequency again, it complains that there are no variables named 'Frequency'. So probably this issue should include an update to the documentation. Or maybe that should be a new issue. |
Oops -- yes, that line in the docs / example is broken! |
I was looking for a way to drop dimensions, similar to the OP, and found this issue. I created an implementation of |
I have a dataset that is produced programmatically and can at times end up producing an empty DataArray. This isn't an issue, per se, because I can remove those empty DataArrays later. However, I cannot find any way to remove the unused, empty dimension! I've tried deleting, dropping, resetting indices, etc., and have had no luck in purging this empty dimension. It causes issues down the line, as the existence of entries in the dimensions list triggers certain events.
Is there a way to remove a dimension (and possibly then all data variables which depend on it)?
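For readers arriving at this question later: more recent xarray releases include a `Dataset.drop_dims` method, which returns a new Dataset without the named dimension and without any variable that uses it. A minimal sketch, assuming an xarray version that ships `drop_dims` (it did not exist when this issue was opened):

```python
import xarray as xr

# Same shape of problem as in the question: 'a' is an empty dimension.
test_dataset = xr.Dataset(dict(
    empty_array=xr.DataArray([], dims='a', coords={'a': []}),
    populated_array=xr.DataArray([1], {'b': ['1']}, 'b'),
))

# Drops dimension 'a' together with every variable and coordinate on it.
without_a = test_dataset.drop_dims('a')
print(without_a)
```

The original `test_dataset` is not modified; only the returned Dataset lacks `a`.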