Data.Tree provides foldTree :: (a -> [b] -> b) -> Tree a -> b. Unfortunately, foldTree does not optimize as well as it could when b is a function.
As an example, consider calculating the sum of the depths of all nodes in a tree.
We can write a recursive function manually:
depthSum_rec :: Tree a -> Int
depthSum_rec t = go t 0 0
  where
    go (Node _ ts) depth acc = foldl' (\acc' t' -> go t' (depth+1) acc') (acc + depth) ts
-- depthSum_rec: OK (0.18s)
--   5.18 ms ± 508 μs
Now let's use foldTree:
depthSum_foldTree :: Tree a -> Int
depthSum_foldTree t = foldTree f t 0 0
  where
    f _ ks depth acc = foldl' (\acc' k -> k (depth+1) acc') (acc + depth) ks
-- depthSum_foldTree: OK (0.34s)
--   43.6 ms ± 3.2 ms
That's a lot worse! The problem is that the list of partially applied functions, [b], is manifested (actually allocated); see GHC#23319. According to SPJ, this can't be easily improved.
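To make the source of that list concrete, here is a minimal sketch of how foldTree is defined, as far as I know. The name foldTreeSketch is a local stand-in for illustration, not the library's export.

```haskell
import Data.Tree (Tree (..))

-- Local copy for illustration only; believed to match the shape of
-- Data.Tree.foldTree, but check the containers source to be sure.
foldTreeSketch :: (a -> [b] -> b) -> Tree a -> b
foldTreeSketch f = go
  where
    -- `map go ts` hands f a real list of child results. When b is a
    -- function type, every element of that list is a partial application.
    go (Node x ts) = f x (map go ts)
```

With a non-function b, such as summing Int labels, that list is usually unproblematic; the cost discussed here shows up when each element is itself a function.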
Consider a different fold function which also folds over the [b] without creating it:
foldTree2 :: (a -> b -> c) -> (c -> b -> b) -> b -> Tree a -> c
foldTree2 f c z = go
  where
    go (Node x ts) = f x (foldr (c . go) z ts)
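Two small uses may help to get a feel for the three arguments. This is only a sketch: size and preorder are hypothetical helper names chosen for illustration, and foldTree2 is the proposed function, not something containers exports.

```haskell
import Data.Tree (Tree (..))

-- foldTree2 as proposed in this issue (not part of containers).
foldTree2 :: (a -> b -> c) -> (c -> b -> b) -> b -> Tree a -> c
foldTree2 f c z = go
  where
    go (Node x ts) = f x (foldr (c . go) z ts)

-- Hypothetical helper: number of nodes. Here b and c are both Int:
-- (+) combines a child's count with the count of its later siblings.
size :: Tree a -> Int
size = foldTree2 (\_ n -> 1 + n) (+) 0

-- Hypothetical helper: labels in pre-order. Here b and c are both
-- [a] -> [a], so children are combined by function composition.
preorder :: Tree a -> [a]
preorder t = foldTree2 (\x k -> (x :) . k) (.) id t []
```

For the tree Node 'a' [Node 'b' [Node 'd' []], Node 'c' []], size gives 4 and preorder gives "abdc".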
Now we can write:
depthSum_foldTree2 :: Tree a -> Int
depthSum_foldTree2 t = foldTree2 f f' (const id) t 0 0
  where
    f _ k depth acc = k depth (acc + depth)
    f' k1 k2 depth acc = k2 depth (k1 (depth+1) acc)
-- depthSum_foldTree2: OK (0.23s)
--   5.16 ms ± 376 μs
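For anyone who wants to check that the three versions agree, the following module collects them in one place. The sample tree is a made-up example, and this says nothing about performance; it only checks results.

```haskell
import Data.List (foldl')
import Data.Tree (Tree (..), foldTree)

-- foldTree2 as proposed in this issue (not part of containers).
foldTree2 :: (a -> b -> c) -> (c -> b -> b) -> b -> Tree a -> c
foldTree2 f c z = go
  where
    go (Node x ts) = f x (foldr (c . go) z ts)

-- Made-up sample tree; node depths are a=0, b=1, d=2, c=1.
sample :: Tree Char
sample = Node 'a' [Node 'b' [Node 'd' []], Node 'c' []]

depthSum_rec :: Tree a -> Int
depthSum_rec t = go t 0 0
  where
    go (Node _ ts) depth acc =
      foldl' (\acc' t' -> go t' (depth + 1) acc') (acc + depth) ts

depthSum_foldTree :: Tree a -> Int
depthSum_foldTree t = foldTree f t 0 0
  where
    f _ ks depth acc =
      foldl' (\acc' k -> k (depth + 1) acc') (acc + depth) ks

depthSum_foldTree2 :: Tree a -> Int
depthSum_foldTree2 t = foldTree2 f f' (const id) t 0 0
  where
    -- f adds the node's own depth, then runs the fold of its children.
    f _ k depth acc = k depth (acc + depth)
    -- f' runs one child at depth+1, then the remaining siblings.
    f' k1 k2 depth acc = k2 depth (k1 (depth + 1) acc)
```

All three functions return 4 on sample (0 + 1 + 2 + 1).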
As good as depthSum_rec! Could we have foldTree2 (perhaps with a better name) in Data.Tree?
The type of the function gives me no clue what it does. That makes me a bit suspicious. Can you write documentation that makes it easy for people to think about? What benefit does this have over doing the folding by hand?
> The type of the function gives me no clue what it does.
It just replaces each constructor involved in a Tree: f replaces Node, c replaces (:), and z replaces []. So foldTree2 Node (:) [] = id.
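The constructor-replacement reading can be checked directly. In this sketch, rebuild and foldTreeVia2 are hypothetical names used only for illustration.

```haskell
import Data.Tree (Tree (..))

-- foldTree2 as proposed in this issue (not part of containers).
foldTree2 :: (a -> b -> c) -> (c -> b -> b) -> b -> Tree a -> c
foldTree2 f c z = go
  where
    go (Node x ts) = f x (foldr (c . go) z ts)

-- Replacing Node, (:) and [] by themselves rebuilds the same tree.
rebuild :: Tree a -> Tree a
rebuild = foldTree2 Node (:) []

-- Keeping the list constructors and replacing only Node gives back
-- the behaviour of the existing foldTree.
foldTreeVia2 :: (a -> [b] -> b) -> Tree a -> b
foldTreeVia2 f = foldTree2 f (:) []
```

So the existing foldTree is the special case where the children are collected into an ordinary list before f sees them.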
> What benefit does this have over doing the folding by hand?
It lets us avoid writing a recursive function by hand, and the result is often shorter or simpler. Another benefit is that it could participate in fold/build fusion. I have been thinking about this a bit, but perhaps it deserves a separate issue.