Add var and std to weighted computations #5870
Co-authored-by: Illviljan <14371165+Illviljan@users.noreply.github.com>
da = DataArray([1, 2])
weights = DataArray(weights)
result = da.weighted(weights).sum_of_squares()
These tests look quite similar to each other. Would the tests still be readable if we parametrize the function as well? Something like:
func = "sum_of_squares"
result = getattr(da.weighted(weights), func)()
I agree that the tests are all very similar, but I didn't see a way to generalize the existing set, apart from extending it by following the same pattern (with NaN, without NaN, bool weights and equal weights). I thought of doing something like you suggest, but it could make the tests harder to understand, and I think tests are a domain where the DRY principle matters less. That said, I am very open to revisiting the option if you feel it would be better in this context.
I agree there's great value in them being simple and easy to read. I'm just feeling tldr-symptoms with these now but if you don't think it'll improve readability then that's fine.
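The parametrization discussed in this thread can be sketched as a table of method names driven through `getattr`. This is only an illustration: `WeightedStandIn` is a hypothetical pure-Python stand-in for the object returned by `da.weighted(weights)`, and the method set and expected values are chosen for the example, not taken from the PR's test suite.

```python
import math


class WeightedStandIn:
    """Hypothetical stand-in for the object returned by da.weighted(weights)."""

    def __init__(self, values, weights):
        self.values = values
        self.weights = weights

    def sum_of_weights(self):
        return sum(self.weights)

    def sum(self):
        return sum(w * x for x, w in zip(self.values, self.weights))

    def mean(self):
        return self.sum() / self.sum_of_weights()


# One table of (method name, expected result) replaces several near-identical tests.
CASES = [("sum_of_weights", 4.0), ("sum", 7.0), ("mean", 1.75)]


def run_cases(values, weights, cases):
    """Call each named method via getattr and compare with its expected value."""
    weighted = WeightedStandIn(values, weights)
    results = {}
    for func, expected in cases:
        result = getattr(weighted, func)()
        assert math.isclose(result, expected), (func, result, expected)
        results[func] = result
    return results
```

With pytest, the same table would typically be passed to `@pytest.mark.parametrize`. The trade-off raised in the thread still applies: the table is shorter, but each individual case is less self-explanatory than a dedicated test.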
Looks good - nice first xarray PR! One thing you could add is "degrees of freedom" for
Is there an appetite to add a weighted
@cjauvin let us know if you want to add anything to this PR, else I am happy to merge this as is before the next release.
btw - I wrote all these not-so-readable tests. They are here to test all kinds of edge cases. I agree it could be good to simplify them, but that could be another PR. I see two ways: (1) don't care about the result, only whether there is a number or NaN at a certain position, or (2) use an external library to check the results (although I am not sure there is a library that handles all these weighted operations with NaN).
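Option (1) above could look roughly like the following sketch, which compares only where the NaNs are and ignores the numeric values. The helper names `nan_pattern` and `same_nan_pattern` are hypothetical and are not part of xarray or of this PR.

```python
import math


def nan_pattern(values):
    """Return a list of booleans marking which positions hold NaN."""
    return [math.isnan(v) for v in values]


def same_nan_pattern(result, expected):
    """True if both sequences have the same length and NaNs in the same positions."""
    return len(result) == len(expected) and nan_pattern(result) == nan_pattern(expected)
```

A check like this is weaker than comparing values, so it would catch NaN-handling regressions but not numerical errors.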
Thanks for the feedback @mathause! If you are ok with postponing the addition of an extra
Thanks @cjauvin I see this is your first PR. It's a great one. Welcome to xarray!
* upstream/main:
  Add var and std to weighted computations (pydata#5870)
* main:
  Add typing_extensions as a required dependency (pydata#5911)
  pydata#5740 follow up: supress xr.ufunc warnings in tests (pydata#5914)
  Avoid accessing slow .data in unstack (pydata#5906)
  Add wradlib to ecosystem in docs (pydata#5915)
  Use .to_numpy() for quantified facetgrids (pydata#5886)
  [test-upstream] fix pd skipna=None (pydata#5899)
  Add var and std to weighted computations (pydata#5870)
  Check for path-like objects rather than Path type, use os.fspath (pydata#5879)
  Handle single `PathLike` objects in `open_mfdataset()` (pydata#5884)
* upstream/main:
  Add typing_extensions as a required dependency (pydata#5911)
  pydata#5740 follow up: supress xr.ufunc warnings in tests (pydata#5914)
  Avoid accessing slow .data in unstack (pydata#5906)
  Add wradlib to ecosystem in docs (pydata#5915)
  Use .to_numpy() for quantified facetgrids (pydata#5886)
  [test-upstream] fix pd skipna=None (pydata#5899)
  Add var and std to weighted computations (pydata#5870)
I would be strongly interested in this capability!
whats-new.rst
api.rst
This follows #2922 to add `var`, `std` and `sum_of_squares` to `DataArray.weighted` and `Dataset.weighted`. I would also like to add weighted quantile, eventually.
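As a rough illustration of what these three quantities mean, here is a pure-Python sketch under a common definition: the sum of squares is the weighted sum of squared deviations from the weighted mean, the variance divides that by the sum of weights, and the standard deviation is its square root. The function name `weighted_stats`, the NaN-skipping behaviour, and the absence of any degrees-of-freedom correction are assumptions made for this example and are not confirmed by the thread.

```python
import math


def weighted_stats(values, weights):
    """Weighted sum of squares, variance and std of a 1-D sequence.

    Assumptions for this sketch: NaN values are dropped together with their
    weights, and no degrees-of-freedom correction is applied.
    """
    pairs = [(x, w) for x, w in zip(values, weights) if not math.isnan(x)]
    sum_of_weights = sum(w for _, w in pairs)
    if sum_of_weights == 0:
        # Nothing to average over: every statistic is undefined.
        nan = float("nan")
        return {"sum_of_squares": nan, "var": nan, "std": nan}
    mean = sum(w * x for x, w in pairs) / sum_of_weights
    sum_of_squares = sum(w * (x - mean) ** 2 for x, w in pairs)
    var = sum_of_squares / sum_of_weights
    return {"sum_of_squares": sum_of_squares, "var": var, "std": math.sqrt(var)}
```

For example, values `[1.0, 2.0]` with equal weights `[1.0, 1.0]` have a weighted mean of 1.5, so the sum of squares is 0.5, the variance 0.25 and the standard deviation 0.5.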