Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix inheritance in DataTree.copy() #9457

Merged
merged 8 commits into from
Sep 9, 2024
Merged

Conversation

shoyer
Copy link
Member

@shoyer shoyer commented Sep 8, 2024

Previously, we were copying parent coordinates/dimensions onto all child nodes. This is not obvious in the current repr, but you can see it from looking at the private ._node_coord_variables and ._node_dims.

To make the use of _to_dataset_view() little more obvious, I've added a required boolean inherited argument.

Fixes pydata#9454

Previously, we were copying parent coordinates/dimensions onto all
child nodes. This is not obvious in the current repr, but you can see
it from looking at the private `._node_coord_variables` and
`._node_dims`.

To make the use of `_to_dataset_view()` little more obvious, I've added
a required boolean `inherited` argument.
@TomNicholas TomNicholas added the topic-DataTree Related to the implementation of a DataTree class label Sep 9, 2024
Copy link
Member

@TomNicholas TomNicholas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @shoyer ! That was quite a subtle bug.

My review comments are just about clarifying code comments.

xarray/core/formatting.py Outdated Show resolved Hide resolved
xarray/core/datatree.py Outdated Show resolved Hide resolved
xarray/core/datatree.py Outdated Show resolved Hide resolved
@TomNicholas TomNicholas added the plan to merge Final call for comments label Sep 9, 2024
@shoyer shoyer merged commit 40666b2 into pydata:main Sep 9, 2024
34 checks passed
@shoyer shoyer deleted the copy-inheritance branch September 9, 2024 16:23
@shoyer
Copy link
Member Author

shoyer commented Sep 9, 2024

Thanks Tom for the comment suggestions!

dcherian added a commit to dcherian/xarray that referenced this pull request Sep 17, 2024
* main: (26 commits)
  Forbid modifying names of DataTree objects with parents (pydata#9494)
  DAS-2155 - Merge datatree documentation into main docs. (pydata#9033)
  Make illegal path-like variable names when constructing a DataTree from a Dataset (pydata#9378)
  Ensure TreeNode doesn't copy in-place (pydata#9482)
  `open_groups` for zarr backends (pydata#9469)
  Update pyproject.toml (pydata#9484)
  New whatsnew section (pydata#9483)
  Release notes for v2024.09.0 (pydata#9480)
  Fix `DataTree.coords.__setitem__` by adding `DataTreeCoordinates` class (pydata#9451)
  Rename DataTree's "ds" and "data" to "dataset" (pydata#9476)
  Update DataTree repr to indicate inheritance (pydata#9470)
  Bump pypa/gh-action-pypi-publish in the actions group (pydata#9460)
  Repo checker (pydata#9450)
  Add days_in_year and decimal_year to dt accessor (pydata#9105)
  remove parent argument from DataTree.__init__ (pydata#9465)
  Fix inheritance in DataTree.copy() (pydata#9457)
  Implement `DataTree.__delitem__` (pydata#9453)
  Add ASV for datatree.from_dict (pydata#9459)
  Make the first argument in DataTree.from_dict positional only (pydata#9446)
  Fix typos across the code, doc and comments (pydata#9443)
  ...
dcherian added a commit to dcherian/xarray that referenced this pull request Sep 17, 2024
* main: (29 commits)
  Release notes for v2024.09.0 (pydata#9480)
  Fix `DataTree.coords.__setitem__` by adding `DataTreeCoordinates` class (pydata#9451)
  Rename DataTree's "ds" and "data" to "dataset" (pydata#9476)
  Update DataTree repr to indicate inheritance (pydata#9470)
  Bump pypa/gh-action-pypi-publish in the actions group (pydata#9460)
  Repo checker (pydata#9450)
  Add days_in_year and decimal_year to dt accessor (pydata#9105)
  remove parent argument from DataTree.__init__ (pydata#9465)
  Fix inheritance in DataTree.copy() (pydata#9457)
  Implement `DataTree.__delitem__` (pydata#9453)
  Add ASV for datatree.from_dict (pydata#9459)
  Make the first argument in DataTree.from_dict positional only (pydata#9446)
  Fix typos across the code, doc and comments (pydata#9443)
  DataTree should not be "Generic" (pydata#9445)
  Disallow passing a DataArray as data into the DataTree constructor (pydata#9444)
  Support additional dtypes in `resample` (pydata#9413)
  Shallow copy parent and children in DataTree constructor (pydata#9297)
  Bump minimum versions for dependencies (pydata#9434)
  Always include at least one category in random test data (pydata#9436)
  Avoid deep-copy when constructing groupby codes (pydata#9429)
  ...
hollymandel pushed a commit to hollymandel/xarray that referenced this pull request Sep 23, 2024
* Fix inheritance in DataTree.copy()

Fixes pydata#9454

Previously, we were copying parent coordinates/dimensions onto all
child nodes. This is not obvious in the current repr, but you can see
it from looking at the private `._node_coord_variables` and
`._node_dims`.

To make the use of `_to_dataset_view()` little more obvious, I've added
a required boolean `inherited` argument.

* typing error

* add missing inherited argument

* Apply suggestions from code review

Co-authored-by: Tom Nicholas <tom@cworthy.org>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* tweaks to from_dict

* add issue link

---------

Co-authored-by: Tom Nicholas <tom@cworthy.org>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
plan to merge Final call for comments topic-DataTree Related to the implementation of a DataTree class
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Deep copying a datatree somehow duplicates data onto child nodes
2 participants