max-sixty
Collaborator

@max-sixty max-sixty commented Sep 9, 2025

as discussed in #3891

Done with lots of help from Claude Code:


This change modifies xarray's default behavior to preserve attributes (attrs) across all operations, including computational, binary, and data manipulation functions. Previously, attributes were dropped by default unless keep_attrs=True was explicitly set. This new default aligns xarray with common scientific workflows where metadata preservation is crucial.

The keep_attrs option now defaults to True for most operations. For binary operations, attributes are preserved from the left-hand operand.

This is a breaking change
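
Because the default differs before and after this change, code that depends on one behavior can pin it explicitly. A minimal sketch, assuming a made-up array with an illustrative `units` attribute:

```python
import numpy as np
import xarray as xr

# Illustrative data; the attribute name and value are assumptions for this example.
da = xr.DataArray(np.arange(4.0), dims="x", attrs={"units": "m"})

# An explicit keep_attrs argument gives the same result under either default.
kept = da.mean(keep_attrs=True)
dropped = da.mean(keep_attrs=False)

# Arithmetic operators take no keep_attrs argument, so the global option
# (scoped here with a context manager) is the way to pin their behavior.
with xr.set_options(keep_attrs=True):
    shifted = da + 1

print(kept.attrs, dropped.attrs, shifted.attrs)
```

The unqualified `da.mean()` is deliberately left out, since its result is exactly what this PR changes.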

@github-actions github-actions bot added the topic-DataTree Related to the implementation of a DataTree class label Sep 9, 2025
@keewis
Collaborator

keewis commented Sep 9, 2025

This is indeed a breaking change, which might require a deprecation cycle. Not sure, though, we could also decide that the benefits outweigh the disruption.

> For binary operations, attributes are preserved from the left-hand operand.

Should binary operations support combine_attrs strategy names or a function? apply_ufunc already supports that for keep_attrs (added in #5041), so it wouldn't be anything new for functions using it.

(doesn't have to be this PR, though)
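
For reference, a small sketch of what passing strategy names to `apply_ufunc` looks like today. The two arrays and their `units` and `source` attributes are invented for illustration:

```python
import operator

import numpy as np
import xarray as xr

# Illustrative operands: "units" agrees, "source" conflicts.
a = xr.DataArray(np.ones(3), dims="x", attrs={"units": "m", "source": "A"})
b = xr.DataArray(np.ones(3), dims="x", attrs={"units": "m", "source": "B"})

# keep_attrs accepts a combine_attrs strategy name as well as a bool.
first = xr.apply_ufunc(operator.add, a, b, keep_attrs="override")
agreed = xr.apply_ufunc(operator.add, a, b, keep_attrs="drop_conflicts")
nothing = xr.apply_ufunc(operator.add, a, b, keep_attrs="drop")

print(first.attrs)    # attrs of the first (left-hand) operand
print(agreed.attrs)   # only the attrs both operands agree on
print(nothing.attrs)  # no attrs
```

Here "override" is the strategy that matches the left-hand-operand rule described in the PR, while "drop_conflicts" is one of the alternatives a binary-operation option could expose.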

@max-sixty
Collaborator Author

> This is indeed a breaking change, which might require a deprecation cycle. Not sure, though, we could also decide that the benefits outweigh the disruption.

It's difficult to do a deprecation cycle here: we would need to do something like "give a warning any time we run an operation that drops attrs", which could be quite noisy. There's some discussion of this in #3891. Not impossible, though.

I think it would probably be OK to start propagating more attrs by default as a breaking change. There's no easy way to roll this out incrementally, and I doubt too many users are relying upon metadata disappearing when they do xarray operations, given the somewhat inconsistent state of the current rules.

BREAKING CHANGE: Change keep_attrs default from False to True

This changes the default behavior of xarray operations to preserve
attributes by default, which better aligns with user expectations
and scientific workflows where metadata preservation is critical.

Migration guide:
- To restore previous behavior globally: xr.set_options(keep_attrs=False)
- To restore for specific operations: use keep_attrs=False parameter
- Alternative: use .drop_attrs() method after operations

Closes pydata#3891, pydata#4510, pydata#9920
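
The three migration options above can be sketched as follows. The array and its `units` attribute are made up for illustration, and the global option is scoped with a context manager here rather than set for the whole session:

```python
import numpy as np
import xarray as xr

# Illustrative data; the attribute is an assumption for this example.
da = xr.DataArray(np.arange(4.0), dims="x", attrs={"units": "m"})

# 1. Restore the previous behavior globally (scoped to this block).
with xr.set_options(keep_attrs=False):
    global_off = da.sum()

# 2. Restore it for a single operation.
per_call = da.sum(keep_attrs=False)

# 3. Keep attrs during the operation, then strip them afterwards.
stripped = da.sum(keep_attrs=True).drop_attrs()

print(global_off.attrs, per_call.attrs, stripped.attrs)
```

Calling `xr.set_options(keep_attrs=False)` without `with` applies the setting for the rest of the session, which is the form the migration guide refers to.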
max-sixty added a commit to max-sixty/xarray that referenced this pull request Sep 10, 2025
Integrated coordinate preservation feature from main with
our keep_attrs changes.
…alse

The merge incorrectly preserved coordinate attributes even when
keep_attrs=False. Now coordinates have their attrs cleared when
keep_attrs=False, consistent with data variables.
- When keep_attrs=True: restore attrs from original coords (func may have dropped them)
- When keep_attrs=False: clear all attrs
- More efficient than previous implementation
Group attribute operations by keep_attrs value for cleaner,
more readable code with identical functionality.
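
The coordinate handling described in these commits can be sketched in plain Python. This is not xarray's implementation: `finalize_coord_attrs` is a hypothetical helper, and coordinates are modeled as a mapping from name to attrs dict purely for illustration.

```python
from typing import Any

Attrs = dict[str, Any]


def finalize_coord_attrs(
    result: dict[str, Attrs],
    original: dict[str, Attrs],
    keep_attrs: bool,
) -> dict[str, Attrs]:
    """Return attrs for each result coordinate (hypothetical helper)."""
    if keep_attrs:
        # Restore from the originals, since the wrapped function may have
        # dropped them; fall back to whatever the result already carries.
        return {
            name: dict(original.get(name, attrs))
            for name, attrs in result.items()
        }
    # keep_attrs=False: clear all coordinate attrs, matching data variables.
    return {name: {} for name in result}


original = {"x": {"units": "m"}, "time": {"calendar": "standard"}}
after_func = {"x": {}, "time": {"calendar": "standard"}}

print(finalize_coord_attrs(after_func, original, keep_attrs=True))
print(finalize_coord_attrs(after_func, original, keep_attrs=False))
```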
Member

@shoyer shoyer left a comment

Generally looks on the right track, though Claude seems to be inserting a lot of specific comments about "now" that will get stale very quickly and should be removed.

- keep_attrs = _get_keep_attrs(default=False)
+ keep_attrs = _get_keep_attrs(
+     default=True
+ )  # Default now keeps attrs for reduction operations
Member

This comment feels like it only makes sense in the context of this PR, not for people reading the code later.

return apply_ufunc(operator.add, a, b, keep_attrs=keep_attrs)
else:
return apply_ufunc(operator.add, a, b)
# Always explicitly pass keep_attrs to test the specific behavior
Member

delete

@@ -733,7 +731,7 @@ def add(a, b, keep_attrs):
         pytest.param(
             None,
             [{"a": 1}, {"a": 2}, {"a": 3}],
-            {},
+            {"a": 1},  # apply_ufunc now keeps attrs by default
Member

also delete

Comment on lines 5034 to +5035
  expected0 = indarr[minindex]
+ expected0.attrs = self.attrs  # Default now keeps attrs for reduction operations
Member

Indexing should preserve attrs, right? If so, I'm not sure why this is needed.
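
A quick check of that premise, as a sketch with made-up data (the `indarr` and `minindex` names mirror the test, but the values here are invented):

```python
import numpy as np
import xarray as xr

# Illustrative stand-ins for the test's array and index.
indarr = xr.DataArray(np.arange(5), dims="x", attrs={"units": "K"})
minindex = 2

# Positional indexing and isel both return a new object carrying the attrs.
by_getitem = indarr[minindex]
by_isel = indarr.isel(x=minindex)

print(by_getitem.attrs, by_isel.attrs)
```

If indexing already carries the attrs over, the explicit `expected0.attrs = self.attrs` assignment would only matter when `indarr` itself lacks them.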

Successfully merging this pull request may close these issues.

Keep attrs by default? (keep_attrs)