Bug with arithmetic between datasets and datatrees (e.g. ds * dt) #146
Comments
Alrighty, I had 1h spare to look into this. I think I can get a fix going for this, at the Datatree end. Here's a quick precis:
My gut instinct is that by targeting this bit of code, it's possible to trigger a NotImplemented error, which should propagate back through the call stack to the invocation of
Quick class diagram:
classDiagram
DatasetArithmetic <|-- Dataset
SupportsArithmetic <|-- DatasetArithmetic
DatasetOpsMixin <|-- DatasetArithmetic
class Dataset {
_binary_op(Any other, Operator f, bool reflexive=False, Optional[Any] join) -> Dataset:
}
class DatasetOpsMixin {
_binary_op(Any other, Operator f, bool reflexive=False): Result[Operator, [Self, Other]]
__mul__(Any other, Any other): Result[operator.mul]
}
class SupportsArithmetic {
tuple _HANDLED_TYPES
__array_ufunc__(Ufunc ufunc, str method, *inputs, **kwargs): Result[Ufunc, ArrayUfuncInputs[inputs, kwargs]]
}
Datatree .. Dataset
I'll give it a try now, but I think a simple bit of switch-y stuff with
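For context, the deferral idea in this comment relies on ordinary Python operator dispatch: when the left operand's `__mul__` returns `NotImplemented`, Python falls back to the right operand's `__rmul__`. The sketch below is a minimal illustration with toy classes. `Table` and `Tree` are hypothetical stand-ins for `Dataset` and `DataTree`, not the real xarray or datatree implementations, and the real `Dataset._binary_op` is considerably more involved.

```python
class Tree:
    """Hypothetical stand-in for DataTree: maps node paths to Table objects."""

    def __init__(self, nodes):
        self.nodes = dict(nodes)

    def __mul__(self, other):
        # Node-wise: multiply every node by `other`.
        return Tree({path: node * other for path, node in self.nodes.items()})

    def __rmul__(self, other):
        # Reached only when the left operand's __mul__ returned NotImplemented.
        return Tree({path: other * node for path, node in self.nodes.items()})


class Table:
    """Hypothetical stand-in for Dataset: wraps a single number."""

    def __init__(self, value):
        self.value = value

    def __mul__(self, other):
        if isinstance(other, Table):
            return Table(self.value * other.value)
        if isinstance(other, (int, float)):
            return Table(self.value * other)
        # Unknown operand (e.g. Tree): decline, so Python tries other.__rmul__.
        return NotImplemented

    __rmul__ = __mul__  # multiplying by a plain number commutes


table = Table(2)
tree = Tree({"/": Table(3), "/child": Table(5)})

left = table * tree   # Table.__mul__ declines, so Tree.__rmul__ runs
right = tree * table  # Tree.__mul__ runs directly
```

In this toy setup both orderings return a `Tree`, which is the commutative behaviour the issue asks for. Whether the same deferral can be wired into the real `Dataset` without touching xarray internals is exactly what the rest of the thread debates.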
Oh cool, thanks for looking into this @benjaminwoods !
So you're suggesting making
Worth remembering that if this turns out to be too tricky to fix here we can just punt on it until datatree is integrated upstream in xarray, which is the ultimate goal.
No worries!
Yep, basically. Unfortunately, I don't think it's currently doable, after a closer look. Here's a quick example, with the current call stack:
import xarray as xr
from datatree import DataTree
ds = xr.Dataset({"a": 1})
dt = DataTree(data=ds)
ds * dt
Stack when executing
I think this needs fixing at the
I'm also not a fan of that second idea, as this is essentially building a contract with internal code in an internal function. A true nightmare to attempt to maintain stability there, as a developer working on
Closed in favour of pydata/xarray#9365
Arithmetic involving one DataTree and one Dataset is supposed to be node-wise, i.e. the operation involving the single Dataset is automatically applied to every node of the tree individually, returning a new tree, like this:
However there is a bad bug:
This didn't return a tree, so arithmetic between Datasets and DataTrees currently doesn't respect commutativity :(
Interestingly this does work fine for python scalars
and for numpy arrays
I do have tests for this arithmetic behaviour but it looks like I didn't think to check commutativity in any of those tests!
I haven't looked into this deeply yet but my hunch is that it's something to do with __mul__/__rmul__ on Dataset competing for priority with __mul__/__rmul__ on DataTree. If that is the case then it might only be possible to fix it upstream in xarray by changing the behaviour of Dataset to defer to DataTree...
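The hunch about `__mul__`/`__rmul__` priority can be illustrated with plain Python, independent of xarray. For `a * b`, Python calls `type(a).__mul__` first and only tries `type(b).__rmul__` if that call returns `NotImplemented` (or if `type(b)` is a subclass of `type(a)` that overrides the reflected method). The sketch below uses hypothetical toy classes, `GreedyTable` and `Tree`, and assumes (not confirmed by the real code) that the left operand accepts any right operand.

```python
calls = []  # records which dunder method handled each operation


class GreedyTable:
    """Hypothetical stand-in for a Dataset whose __mul__ accepts any operand."""

    def __mul__(self, other):
        calls.append("GreedyTable.__mul__")
        return self  # never returns NotImplemented, so it always wins on the left


class Tree:
    """Hypothetical stand-in for DataTree."""

    def __mul__(self, other):
        calls.append("Tree.__mul__")
        return Tree()

    def __rmul__(self, other):
        calls.append("Tree.__rmul__")
        return Tree()


result_tree_first = Tree() * GreedyTable()   # handled by Tree.__mul__
result_table_first = GreedyTable() * Tree()  # handled by GreedyTable.__mul__
result_scalar_first = 2 * Tree()             # int declines, Tree.__rmul__ runs
```

Here only the ordering with the greedy operand on the left fails to return a `Tree`. This mirrors the report above, where scalars work but `ds * dt` does not: `int.__mul__` returns `NotImplemented` for an unknown type, so the tree's `__rmul__` gets its turn.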