This repository has been archived by the owner on Oct 24, 2024. It is now read-only.
I realised that part of the reason arithmetic (#24) and ufuncs (#25) don't yet work is that the map_over_subtree decorator currently only maps over a single subtree.
This works fine for mapping unary functions such as .isel, because they only accept one tree-like argument (i.e. self for the .isel method). However, for any binary function such as add(dt1, dt2), pairs of corresponding nodes in the two trees need to be operated on together, as result_ds = add(dt1[node].ds, dt2[node].ds), before the output tree is built up from the results.
In the most general case we need to be able to map functions like
def func(*args, **kwargs):
    # do stuff involving multiple Dataset objects
    return output_trees
where any number of the args and kwargs could be DataTrees, and output_trees could be a list of any number of DataTrees.
To implement this, the map_over_subtree decorator has to become a lot more general. It needs to:
1. Identify which of args and kwargs are DataTree objects,
2. Check that all of those trees are isomorphic to one another (EDIT: this was implemented in Check isomorphism #31),
3. Walk along the nodes of all N trees simultaneously,
4. Pass the respective N nodes from that position in each tree to func, as Datasets, without losing their position in *args, **kwargs,
5. Use the M output Datasets from func to rebuild M DataTree objects (which all have the same structure as the input trees), and return them.
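The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the datatree implementation: `Node` is a hypothetical toy class standing in for a DataTree node, its `data` attribute stands in for the node's Dataset, `map_over_trees` is a made-up name, and the structure check uses the strict (same names) rule. It also assumes func signals multiple outputs by returning a tuple, and that it returns the same number of outputs at every node.

```python
from dataclasses import dataclass, field
from functools import wraps
from typing import Any, List


@dataclass
class Node:
    """Toy stand-in for a DataTree node (hypothetical, not the datatree API)."""
    name: str
    data: Any = None
    children: List["Node"] = field(default_factory=list)


def map_over_trees(func):
    """Apply func node by node across every Node found in args and kwargs."""

    @wraps(func)
    def wrapper(*args, **kwargs):
        def visit(cur_args, cur_kwargs):
            # Step 1: find which arguments are trees (here: Node instances).
            nodes = [a for a in cur_args if isinstance(a, Node)]
            nodes += [v for v in cur_kwargs.values() if isinstance(v, Node)]
            first = nodes[0]

            # Step 2: strict check, names and child counts must agree.
            for other in nodes[1:]:
                if (other.name != first.name
                        or len(other.children) != len(first.children)):
                    raise ValueError(f"trees differ at node {first.name!r}")

            # Step 4: swap each node for its data, keeping its position.
            call_args = [a.data if isinstance(a, Node) else a for a in cur_args]
            call_kwargs = {k: (v.data if isinstance(v, Node) else v)
                           for k, v in cur_kwargs.items()}
            out = func(*call_args, **call_kwargs)
            outs = out if isinstance(out, tuple) else (out,)

            # Step 5: one output node per returned value.
            results = [Node(first.name, o) for o in outs]

            # Step 3: descend into the i-th child of every tree together.
            for i in range(len(first.children)):
                child_args = [a.children[i] if isinstance(a, Node) else a
                              for a in cur_args]
                child_kwargs = {k: (v.children[i] if isinstance(v, Node) else v)
                                for k, v in cur_kwargs.items()}
                child_results = visit(child_args, child_kwargs)
                for parent, child in zip(results, child_results):
                    parent.children.append(child)
            return results

        if not any(isinstance(a, Node) for a in (*args, *kwargs.values())):
            raise TypeError("expected at least one tree argument")
        results = visit(args, kwargs)
        return results[0] if len(results) == 1 else tuple(results)

    return wrapper


@map_over_trees
def add(a, b):
    return a + b


@map_over_trees
def quot_rem(a, b=1):
    return a // b, a % b


dt1 = Node("foo", 1, [Node("bar", 2)])
dt2 = Node("foo", 10, [Node("bar", 20)])

total = add(dt1, dt2)           # one output tree
q, r = quot_rem(dt2, b=dt1)     # tree passed as a kwarg, two output trees
```

Non-tree arguments are passed through to func unchanged at every node, which is one way of "not losing their position" in *args and **kwargs.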
We therefore have to decide what we mean by "isomorphic". The strictest definition would be that all node names are the same, so that
dt_1:
DataNode('foo')
| Data A
+---DataNode('bar')
+ Data B
could be mapped alongside
dt_2:
DataNode('foo')
| Data C
+---DataNode('bar')
+ Data D
but not alongside
dt_3:
DataNode('baz')
| Data C
+---DataNode('woz')
+ Data D
A more lenient definition would be that each node must have the same number of children as its counterpart in the other tree, with the children compared in order. (In other words, the tree structure must be the same, but the node names need not be. This requires the children to be ordered to avoid ambiguities.) This definition would allow dt_3 to be mapped over alongside dt_1 or dt_2 (or both simultaneously, for a func that accepts 3 Dataset arguments).
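The two candidate definitions can be compared with a small sketch. It assumes a hypothetical toy `Node` class with an ordered `children` list (not the datatree API), and `is_isomorphic` with its `check_names` flag is a made-up helper, not the check added in #31. For simplicity, the strict variant here also compares children positionally rather than matching them by name.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Toy tree node with ordered children (hypothetical stand-in)."""
    name: str
    children: List["Node"] = field(default_factory=list)


def is_isomorphic(a: Node, b: Node, check_names: bool = True) -> bool:
    """Return True if a and b have the same structure.

    check_names=True is the strict definition (names must also match);
    check_names=False is the lenient one (only child counts must match).
    """
    if check_names and a.name != b.name:
        return False
    if len(a.children) != len(b.children):
        return False
    # Compare children pairwise, in order.
    return all(
        is_isomorphic(child_a, child_b, check_names)
        for child_a, child_b in zip(a.children, b.children)
    )


dt_1 = Node("foo", [Node("bar")])
dt_2 = Node("foo", [Node("bar")])
dt_3 = Node("baz", [Node("woz")])

strict_12 = is_isomorphic(dt_1, dt_2)                       # names match
strict_13 = is_isomorphic(dt_1, dt_3)                       # names differ
lenient_13 = is_isomorphic(dt_1, dt_3, check_names=False)   # shape only
```

Under these assumptions dt_1 and dt_2 pass both checks, while dt_3 is accepted only by the lenient one, which matches the examples above.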