When the implementation of fold, scan, mapAccum and their derivatives settles down, let's think about parallel execution. A GPU backend to actually benchmark the result wouldn't hurt either. Tom provides
https://jax.readthedocs.io/en/latest/_autosummary/jax.lax.associative_scan.html
At the time of writing, everything is a special case of mapAccumL or mapAccumR, but it may very well make sense to re-introduce a standalone fold just to get better parallelism (overriding a default method may be a good start). Then we may want to generalize the type of the fold method to arbitrary heterogeneous vectors of tensors, which by that time may be easier to use than they are currently (via typing, adaptors or otherwise). If it turns out scan is easier to parallelise than mapAccum (unlikely?), we may also recover the derivative expressed in terms of scan and build instead of in terms of mapAccum, at the cost of applying the derivative of the little function twice (details in old commits).
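For reference, the technique behind the linked `jax.lax.associative_scan` is a divide-and-conquer recursion over an associative combining operation, which gives an O(log n)-depth circuit instead of the O(n)-depth chain a sequential scan (or mapAccum) implies. Below is a minimal sequential Python sketch of that recursion, just to model the data flow; the names are mine, and a real backend would evaluate each recursion level in parallel rather than in a loop.

```python
# Sketch of the divide-and-conquer inclusive scan behind
# jax.lax.associative_scan; `op` must be associative.
# Each recursion level halves the sequence, so the depth is
# O(log n); a parallel backend runs each level's `op` calls at once.
def associative_scan(op, xs):
    n = len(xs)
    if n < 2:
        return list(xs)
    # Combine adjacent pairs, then scan the half-length sequence.
    pairs = [op(xs[i], xs[i + 1]) for i in range(0, n - 1, 2)]
    evens = associative_scan(op, pairs)  # results for odd positions 1, 3, 5, ...
    # Interleave: each even position needs one extra combine.
    out = [xs[0]]
    for i in range(1, n):
        if i % 2 == 1:
            out.append(evens[i // 2])
        else:
            out.append(op(evens[i // 2 - 1], xs[i]))
    return out
```

Note that associativity is exactly what mapAccum's arbitrary step function does not guarantee, which is why a standalone fold/scan over an explicitly associative operation could parallelise where the general mapAccum cannot.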
Mikolaj changed the title from "This about parallel execution of our mapAccums and their derivatives (associative operations?)" to "Think about parallel execution of our mapAccums and their derivatives (associative operations?)" on Feb 26, 2024.