Remove unvetted DataTree methods #9585

Merged: 8 commits, Oct 7, 2024

297 changes: 149 additions & 148 deletions doc/api.rst
@@ -705,16 +705,16 @@ Pathlib-like Interface
DataTree.parents
DataTree.relative_to

Missing:
.. Missing:

..
.. ..

``DataTree.glob``
``DataTree.joinpath``
``DataTree.with_name``
``DataTree.walk``
``DataTree.rename``
``DataTree.replace``
.. ``DataTree.glob``
.. ``DataTree.joinpath``
.. ``DataTree.with_name``
.. ``DataTree.walk``
.. ``DataTree.rename``
.. ``DataTree.replace``

DataTree Contents
-----------------
@@ -725,17 +725,18 @@ Manipulate the contents of all nodes in a ``DataTree`` simultaneously.
:toctree: generated/

DataTree.copy
DataTree.assign_coords
DataTree.merge
DataTree.rename
DataTree.rename_vars
DataTree.rename_dims
DataTree.swap_dims
DataTree.expand_dims
DataTree.drop_vars
DataTree.drop_dims
DataTree.set_coords
DataTree.reset_coords

.. DataTree.assign_coords
.. DataTree.merge
.. DataTree.rename
.. DataTree.rename_vars
.. DataTree.rename_dims
.. DataTree.swap_dims
.. DataTree.expand_dims
.. DataTree.drop_vars
.. DataTree.drop_dims
.. DataTree.set_coords
.. DataTree.reset_coords
Comment on lines +728 to +739
Member

Was the decision here that these methods should act locally rather than mapping over the subtree? And the distinction being between methods which manipulate structure versus methods which manipulate data - the latter map over the whole tree?

Member Author

I had not thought about how these methods should work, I was just removing them from the docs because they no longer have an implementation!

That said, my inclination is that mapping over the subtree is probably appropriate for all of these. But many will probably need some special handling so they don't raise errors about missing variables/dimensions.
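
To illustrate the "special handling" point, here is a minimal, library-free sketch of mapping a drop-variables operation over a subtree while silently skipping nodes that lack the variable. The `Node` class and `drop_vars_over_subtree` function are hypothetical stand-ins invented for this example; they are not xarray's API or its eventual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a tree node: local variables plus named children."""

    variables: dict = field(default_factory=dict)
    children: dict = field(default_factory=dict)


def drop_vars_over_subtree(node, names):
    """Return a new tree with `names` removed from every node that has them.

    Nodes that do not contain a given name are left unchanged instead of
    raising, which is the special handling discussed above.
    """
    kept = {k: v for k, v in node.variables.items() if k not in names}
    return Node(
        variables=kept,
        children={
            child_name: drop_vars_over_subtree(child, names)
            for child_name, child in node.children.items()
        },
    )


tree = Node(
    variables={"time": [0, 1]},
    children={
        "coarse": Node(variables={"temp": [1.0], "x": [0]}),
        "fine": Node(variables={"x": [0, 1]}),  # no "temp" here
    },
)

# "temp" exists only in the "coarse" node; "fine" and the root are skipped.
result = drop_vars_over_subtree(tree, {"temp"})
```

A stricter variant could raise only when no node in the subtree contains the name, which would still catch typos without failing on heterogeneous groups.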


DataTree Node Contents
----------------------
@@ -760,129 +761,129 @@ Compare one ``DataTree`` object to another.
DataTree.equals
DataTree.identical

Indexing
--------

Index into all nodes in the subtree simultaneously.

.. autosummary::
:toctree: generated/

DataTree.isel
DataTree.sel
DataTree.drop_sel
DataTree.drop_isel
DataTree.head
DataTree.tail
DataTree.thin
DataTree.squeeze
DataTree.interp
DataTree.interp_like
DataTree.reindex
DataTree.reindex_like
DataTree.set_index
DataTree.reset_index
DataTree.reorder_levels
DataTree.query

..

Missing:
``DataTree.loc``


Missing Value Handling
----------------------

.. autosummary::
:toctree: generated/

DataTree.isnull
DataTree.notnull
DataTree.combine_first
DataTree.dropna
DataTree.fillna
DataTree.ffill
DataTree.bfill
DataTree.interpolate_na
DataTree.where
DataTree.isin

Computation
-----------

Apply a computation to the data in all nodes in the subtree simultaneously.

.. autosummary::
:toctree: generated/

DataTree.map
DataTree.reduce
DataTree.diff
DataTree.quantile
DataTree.differentiate
DataTree.integrate
DataTree.map_blocks
DataTree.polyfit
DataTree.curvefit

Aggregation
-----------

Aggregate data in all nodes in the subtree simultaneously.

.. autosummary::
:toctree: generated/

DataTree.all
DataTree.any
DataTree.argmax
DataTree.argmin
DataTree.idxmax
DataTree.idxmin
DataTree.max
DataTree.min
DataTree.mean
DataTree.median
DataTree.prod
DataTree.sum
DataTree.std
DataTree.var
DataTree.cumsum
DataTree.cumprod

ndarray methods
---------------

Methods copied from :py:class:`numpy.ndarray` objects, here applying to the data in all nodes in the subtree.

.. autosummary::
:toctree: generated/

DataTree.argsort
DataTree.astype
DataTree.clip
DataTree.conj
DataTree.conjugate
DataTree.round
DataTree.rank

Reshaping and reorganising
--------------------------

Reshape or reorganise the data in all nodes in the subtree.

.. autosummary::
:toctree: generated/

DataTree.transpose
DataTree.stack
DataTree.unstack
DataTree.shift
DataTree.roll
DataTree.pad
DataTree.sortby
DataTree.broadcast_like
.. Indexing
.. --------

.. Index into all nodes in the subtree simultaneously.

.. .. autosummary::
.. :toctree: generated/

.. DataTree.isel
.. DataTree.sel
.. DataTree.drop_sel
.. DataTree.drop_isel
.. DataTree.head
.. DataTree.tail
.. DataTree.thin
.. DataTree.squeeze
.. DataTree.interp
.. DataTree.interp_like
.. DataTree.reindex
.. DataTree.reindex_like
.. DataTree.set_index
.. DataTree.reset_index
Comment on lines +782 to +785
Member

Unclear to me whether these methods would count as "manipulating structure" or "manipulating data"...

.. DataTree.reorder_levels
.. DataTree.query

.. ..

.. Missing:
.. ``DataTree.loc``
shoyer marked this conversation as resolved.


.. Missing Value Handling
.. ----------------------

.. .. autosummary::
.. :toctree: generated/

.. DataTree.isnull
.. DataTree.notnull
.. DataTree.combine_first
.. DataTree.dropna
.. DataTree.fillna
.. DataTree.ffill
.. DataTree.bfill
.. DataTree.interpolate_na
.. DataTree.where
.. DataTree.isin

.. Computation
.. -----------

.. Apply a computation to the data in all nodes in the subtree simultaneously.

.. .. autosummary::
.. :toctree: generated/

.. DataTree.map
.. DataTree.reduce
.. DataTree.diff
.. DataTree.quantile
.. DataTree.differentiate
.. DataTree.integrate
.. DataTree.map_blocks
.. DataTree.polyfit
.. DataTree.curvefit

.. Aggregation
.. -----------

.. Aggregate data in all nodes in the subtree simultaneously.

.. .. autosummary::
.. :toctree: generated/

.. DataTree.all
.. DataTree.any
.. DataTree.argmax
.. DataTree.argmin
.. DataTree.idxmax
.. DataTree.idxmin
.. DataTree.max
.. DataTree.min
.. DataTree.mean
.. DataTree.median
.. DataTree.prod
.. DataTree.sum
.. DataTree.std
.. DataTree.var
.. DataTree.cumsum
.. DataTree.cumprod

.. ndarray methods
.. ---------------

.. Methods copied from :py:class:`numpy.ndarray` objects, here applying to the data in all nodes in the subtree.

.. .. autosummary::
.. :toctree: generated/

.. DataTree.argsort
.. DataTree.astype
.. DataTree.clip
.. DataTree.conj
.. DataTree.conjugate
.. DataTree.round
.. DataTree.rank

.. Reshaping and reorganising
.. --------------------------

.. Reshape or reorganise the data in all nodes in the subtree.

.. .. autosummary::
.. :toctree: generated/

.. DataTree.transpose
.. DataTree.stack
.. DataTree.unstack
.. DataTree.shift
.. DataTree.roll
.. DataTree.pad
.. DataTree.sortby
.. DataTree.broadcast_like

IO / Conversion
===============
@@ -961,10 +962,10 @@ DataTree methods
DataTree.to_netcdf
DataTree.to_zarr

..
.. ..

Missing:
``open_mfdatatree``
.. Missing:
.. ``open_mfdatatree``

Coordinates objects
===================
@@ -1476,10 +1477,10 @@ Advanced API
backends.list_engines
backends.refresh_engines

..
.. ..

Missing:
``DataTree.set_close``
.. Missing:
.. ``DataTree.set_close``

Default, pandas-backed indexes built-in Xarray:

26 changes: 16 additions & 10 deletions doc/getting-started-guide/quick-overview.rst
@@ -314,23 +314,29 @@ And you can get a copy of just the node local values of :py:class:`~xarray.Datas
ds_node_local = dt["simulation/coarse"].to_dataset(inherited=False)
ds_node_local

Operations map over subtrees, so we can take a mean over the ``x`` dimension of both the ``fine`` and ``coarse`` groups just by:
.. note::

.. ipython:: python
We intend to eventually implement most :py:class:`~xarray.Dataset` methods
(indexing, aggregation, arithmetic, etc) on :py:class:`~xarray.DataTree`
objects, but many methods have not been implemented yet.

avg = dt["simulation"].mean(dim="x")
avg
.. Operations map over subtrees, so we can take a mean over the ``x`` dimension of both the ``fine`` and ``coarse`` groups just by:

Here the ``"x"`` dimension used is always the one local to that subgroup.
.. .. ipython:: python

.. avg = dt["simulation"].mean(dim="x")
.. avg

You can do almost everything you can do with :py:class:`~xarray.Dataset` objects with :py:class:`~xarray.DataTree` objects
(including indexing and arithmetic), as operations will be mapped over every subgroup in the tree.
This allows you to work with multiple groups of non-alignable variables at once.
.. Here the ``"x"`` dimension used is always the one local to that subgroup.

.. note::

If all of your variables are mutually alignable (i.e. they live on the same
.. You can do almost everything you can do with :py:class:`~xarray.Dataset` objects with :py:class:`~xarray.DataTree` objects
.. (including indexing and arithmetic), as operations will be mapped over every subgroup in the tree.
.. This allows you to work with multiple groups of non-alignable variables at once.

.. tip::

If all of your variables are mutually alignable (i.e., they live on the same
grid, such that every common dimension name maps to the same length), then
you probably don't need :py:class:`xarray.DataTree`, and should consider
just sticking with :py:class:`xarray.Dataset`.
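
The tip's notion of "mutually alignable" can be made concrete with a small check. This is a sketch that follows only the simplified definition given in the tip (every shared dimension name maps to the same length); the function name `mutually_alignable` and the input layout are assumptions made for illustration, and xarray's real alignment logic considers more than dimension lengths.

```python
def mutually_alignable(variables):
    """Check whether every shared dimension name has a single consistent length.

    `variables` maps a variable name to a mapping of dimension name -> length.
    """
    seen = {}
    for sizes in variables.values():
        for dim, length in sizes.items():
            # setdefault records the first length seen for this dimension
            # and returns it on later encounters for comparison.
            if seen.setdefault(dim, length) != length:
                return False
    return True


# Same grid: "x" has length 4 everywhere, so a single Dataset would suffice.
same_grid = {"temp": {"x": 4, "y": 3}, "pressure": {"x": 4}}

# Different resolutions: "x" is 4 in one group and 8 in the other,
# which is the situation where separate tree nodes become useful.
mixed_grid = {"coarse_temp": {"x": 4}, "fine_temp": {"x": 8}}

same_ok = mutually_alignable(same_grid)
mixed_ok = mutually_alignable(mixed_grid)
```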