-
Notifications
You must be signed in to change notification settings - Fork 146
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix #642 Bad performance/memory leak for long list literals #643
Conversation
A one-character fix that speeds up the formatting of long literals and drastically reduces memory usage. A very simple example is ```elm module Small exposing (x) x = [ 1, 2, 3 , 4] ``` But with a list literal of many thousands of elements. The newline forces all elements to get their own line. This is implemented as a fold, but in practice this would often perform ``` longList ++ shortList ``` By reversing the fold (from `foldl` to `foldr`) that turns into ``` shortList ++ longList ``` Which requires less traversals and memory.
Thanks, and great catch! I'll be happy to merge this. One thing I'm curious about first... since writing the original code I have learned more about haskell, specifically that |
Usage of In this case, a There are 6 occurences of folkertdev@folkertdev ~/e/elm-format> grep "foldl" src/**/*.hs
src/ElmFormat/Render/Box.hs: |> foldl step (ReversedList.empty, ReversedList.empty)
src/ElmFormat/Render/Box.hs: foldl step (ReversedList.empty, Set.empty) input |> fst |> ReversedList.toList
src/ElmFormat/Render/Box.hs: case foldl stepChildren (ReversedList.empty, seen) next |> (\(a,b) -> (ReversedList.toList a, b)) of
src/ElmFormat/Upgrade_0_19.hs: (terms'', multiline'') = foldl step (ReversedList.empty, multiline) terms'
src/ElmFormat/Upgrade_0_19.hs: newBody = foldl (\e (name, value) -> mapExpr (inlineVar name (appMultiline == FASplitFirst) value) e) body mappings
src/Util/List.hs: List.foldl step init list where only one function from |
Just confirming that this change has some clear improvement for memory usage and no significant increase in memory usage or execution time for all of the existing test files: - total alloc = 3,798,304 bytes (excludes profiling overheads)
+ total alloc = 3,797,656 bytes (excludes profiling overheads)
- total alloc = 1,805,976 bytes (excludes profiling overheads)
+ total alloc = 1,803,360 bytes (excludes profiling overheads)
- total alloc = 2,668,800 bytes (excludes profiling overheads)
+ total alloc = 2,665,488 bytes (excludes profiling overheads)
- total alloc = 1,119,040 bytes (excludes profiling overheads)
+ total alloc = 1,119,344 bytes (excludes profiling overheads)
- total alloc = 1,254,960 bytes (excludes profiling overheads)
+ total alloc = 1,255,136 bytes (excludes profiling overheads)
- total alloc = 3,122,560 bytes (excludes profiling overheads)
+ total alloc = 3,118,200 bytes (excludes profiling overheads)
- total alloc = 2,501,000 bytes (excludes profiling overheads)
+ total alloc = 2,501,712 bytes (excludes profiling overheads)
- total alloc = 1,505,760 bytes (excludes profiling overheads)
+ total alloc = 1,506,376 bytes (excludes profiling overheads)
- total alloc = 1,450,184 bytes (excludes profiling overheads)
+ total alloc = 1,450,456 bytes (excludes profiling overheads)
- total alloc = 2,000,256 bytes (excludes profiling overheads)
+ total alloc = 2,000,872 bytes (excludes profiling overheads)
- total alloc = 1,140,888 bytes (excludes profiling overheads)
+ total alloc = 1,141,192 bytes (excludes profiling overheads)
- total alloc = 75,428,736 bytes (excludes profiling overheads)
+ total alloc = 74,520,992 bytes (excludes profiling overheads)
- total alloc = 15,000,728 bytes (excludes profiling overheads)
+ total alloc = 14,948,400 bytes (excludes profiling overheads)
- total alloc = 15,461,496 bytes (excludes profiling overheads)
+ total alloc = 15,380,240 bytes (excludes profiling overheads)
- total alloc = 17,956,168 bytes (excludes profiling overheads)
+ total alloc = 17,838,680 bytes (excludes profiling overheads)
- total alloc = 16,303,536 bytes (excludes profiling overheads)
+ total alloc = 16,213,856 bytes (excludes profiling overheads)
- total alloc = 12,749,080 bytes (excludes profiling overheads)
+ total alloc = 12,691,024 bytes (excludes profiling overheads)
- total alloc = 1,862,776 bytes (excludes profiling overheads)
+ total alloc = 1,863,784 bytes (excludes profiling overheads)
- total alloc = 3,782,888 bytes (excludes profiling overheads)
+ total alloc = 3,782,240 bytes (excludes profiling overheads)
- total alloc = 20,492,984 bytes (excludes profiling overheads)
+ total alloc = 20,390,720 bytes (excludes profiling overheads)
- total alloc = 59,016,856 bytes (excludes profiling overheads)
+ total alloc = 58,403,464 bytes (excludes profiling overheads)
- total alloc = 4,800,496 bytes (excludes profiling overheads)
+ total alloc = 4,778,616 bytes (excludes profiling overheads)
- total alloc = 13,300,336 bytes (excludes profiling overheads)
+ total alloc = 13,199,344 bytes (excludes profiling overheads)
- total alloc = 1,849,696 bytes (excludes profiling overheads)
+ total alloc = 1,850,704 bytes (excludes profiling overheads)
- total alloc = 4,940,128 bytes (excludes profiling overheads)
+ total alloc = 4,938,920 bytes (excludes profiling overheads)
- total alloc = 179,215,360 bytes (excludes profiling overheads)
+ total alloc = 176,197,448 bytes (excludes profiling overheads)
- total alloc = 1,154,984 bytes (excludes profiling overheads)
+ total alloc = 1,155,328 bytes (excludes profiling overheads)
- total alloc = 338,898,296 bytes (excludes profiling overheads)
+ total alloc = 327,299,256 bytes (excludes profiling overheads)
- total alloc = 3,798,608 bytes (excludes profiling overheads)
+ total alloc = 3,797,960 bytes (excludes profiling overheads)
- total alloc = 246,916,120 bytes (excludes profiling overheads)
+ total alloc = 236,894,176 bytes (excludes profiling overheads)
- total alloc = 17,267,944 bytes (excludes profiling overheads)
+ total alloc = 17,108,976 bytes (excludes profiling overheads)
- total alloc = 93,388,144 bytes (excludes profiling overheads)
+ total alloc = 90,377,024 bytes (excludes profiling overheads)
- total alloc = 5,128,864 bytes (excludes profiling overheads)
+ total alloc = 5,093,608 bytes (excludes profiling overheads)
- total alloc = 4,519,512 bytes (excludes profiling overheads)
+ total alloc = 4,515,408 bytes (excludes profiling overheads)
- total alloc = 9,944,424 bytes (excludes profiling overheads)
+ total alloc = 9,851,968 bytes (excludes profiling overheads)
- total alloc = 4,008,400 bytes (excludes profiling overheads)
+ total alloc = 3,996,072 bytes (excludes profiling overheads)
- total alloc = 3,122,864 bytes (excludes profiling overheads)
+ total alloc = 3,118,504 bytes (excludes profiling overheads)
- total alloc = 22,628,720 bytes (excludes profiling overheads)
+ total alloc = 22,482,256 bytes (excludes profiling overheads)
- total alloc = 4,162,992 bytes (excludes profiling overheads)
+ total alloc = 4,153,136 bytes (excludes profiling overheads)
- total alloc = 59,228,384 bytes (excludes profiling overheads)
+ total alloc = 58,636,272 bytes (excludes profiling overheads)
- total alloc = 1,176,456 bytes (excludes profiling overheads)
+ total alloc = 1,175,824 bytes (excludes profiling overheads)
- total alloc = 2,700,760 bytes (excludes profiling overheads)
+ total alloc = 2,701,304 bytes (excludes profiling overheads)
- total alloc = 5,342,768 bytes (excludes profiling overheads)
+ total alloc = 5,337,928 bytes (excludes profiling overheads)
- total alloc = 1,684,408 bytes (excludes profiling overheads)
+ total alloc = 1,683,496 bytes (excludes profiling overheads)
- total alloc = 15,260,640 bytes (excludes profiling overheads)
+ total alloc = 15,107,120 bytes (excludes profiling overheads)
- total alloc = 4,280,992 bytes (excludes profiling overheads)
+ total alloc = 4,263,376 bytes (excludes profiling overheads)
- total alloc = 4,254,520 bytes (excludes profiling overheads)
+ total alloc = 4,246,440 bytes (excludes profiling overheads)
- total alloc = 2,110,680 bytes (excludes profiling overheads)
+ total alloc = 2,112,152 bytes (excludes profiling overheads)
- total alloc = 13,427,632 bytes (excludes profiling overheads)
+ total alloc = 13,342,376 bytes (excludes profiling overheads)
- total alloc = 22,987,752 bytes (excludes profiling overheads)
+ total alloc = 22,450,168 bytes (excludes profiling overheads)
- total alloc = 28,399,568 bytes (excludes profiling overheads)
+ total alloc = 28,363,464 bytes (excludes profiling overheads)
- total alloc = 4,065,144 bytes (excludes profiling overheads)
+ total alloc = 4,051,304 bytes (excludes profiling overheads)
- total alloc = 4,222,616 bytes (excludes profiling overheads)
+ total alloc = 4,221,032 bytes (excludes profiling overheads)
- total alloc = 8,484,824 bytes (excludes profiling overheads)
+ total alloc = 8,433,752 bytes (excludes profiling overheads)
- total alloc = 33,422,344 bytes (excludes profiling overheads)
+ total alloc = 33,406,480 bytes (excludes profiling overheads)
- total alloc = 1,141,192 bytes (excludes profiling overheads)
+ total alloc = 1,141,496 bytes (excludes profiling overheads)
- total alloc = 1,863,080 bytes (excludes profiling overheads)
+ total alloc = 1,864,088 bytes (excludes profiling overheads)
- total alloc = 1,132,984 bytes (excludes profiling overheads)
+ total alloc = 1,133,288 bytes (excludes profiling overheads)
- total alloc = 53,508,584 bytes (excludes profiling overheads)
+ total alloc = 53,283,568 bytes (excludes profiling overheads)
- total alloc = 51,917,440 bytes (excludes profiling overheads)
+ total alloc = 51,728,104 bytes (excludes profiling overheads)
- total alloc = 33,764,280 bytes (excludes profiling overheads)
+ total alloc = 33,714,640 bytes (excludes profiling overheads)
- total alloc = 62,300,400 bytes (excludes profiling overheads)
+ total alloc = 62,032,168 bytes (excludes profiling overheads)
- total alloc = 19,602,248 bytes (excludes profiling overheads)
+ total alloc = 19,558,248 bytes (excludes profiling overheads) - total time = 0.01 secs (5 ticks @ 1000 us, 1 processor)
+ total time = 0.01 secs (6 ticks @ 1000 us, 1 processor)
total time = 0.00 secs (4 ticks @ 1000 us, 1 processor)
- total time = 0.00 secs (3 ticks @ 1000 us, 1 processor)
+ total time = 0.00 secs (4 ticks @ 1000 us, 1 processor)
- total time = 0.00 secs (1 ticks @ 1000 us, 1 processor)
+ total time = 0.00 secs (4 ticks @ 1000 us, 1 processor)
- total time = 0.00 secs (2 ticks @ 1000 us, 1 processor)
+ total time = 0.00 secs (4 ticks @ 1000 us, 1 processor)
- total time = 0.01 secs (6 ticks @ 1000 us, 1 processor)
+ total time = 0.00 secs (3 ticks @ 1000 us, 1 processor)
total time = 0.00 secs (4 ticks @ 1000 us, 1 processor)
- total time = 0.00 secs (3 ticks @ 1000 us, 1 processor)
+ total time = 0.00 secs (2 ticks @ 1000 us, 1 processor)
- total time = 0.00 secs (3 ticks @ 1000 us, 1 processor)
+ total time = 0.00 secs (2 ticks @ 1000 us, 1 processor)
- total time = 0.00 secs (4 ticks @ 1000 us, 1 processor)
+ total time = 0.01 secs (5 ticks @ 1000 us, 1 processor)
total time = 0.00 secs (3 ticks @ 1000 us, 1 processor)
- total time = 0.07 secs (68 ticks @ 1000 us, 1 processor)
+ total time = 0.08 secs (81 ticks @ 1000 us, 1 processor)
- total time = 0.02 secs (21 ticks @ 1000 us, 1 processor)
+ total time = 0.02 secs (18 ticks @ 1000 us, 1 processor)
- total time = 0.02 secs (22 ticks @ 1000 us, 1 processor)
+ total time = 0.02 secs (17 ticks @ 1000 us, 1 processor)
- total time = 0.01 secs (15 ticks @ 1000 us, 1 processor)
+ total time = 0.03 secs (25 ticks @ 1000 us, 1 processor)
- total time = 0.01 secs (14 ticks @ 1000 us, 1 processor)
+ total time = 0.02 secs (20 ticks @ 1000 us, 1 processor)
- total time = 0.01 secs (13 ticks @ 1000 us, 1 processor)
+ total time = 0.02 secs (18 ticks @ 1000 us, 1 processor)
- total time = 0.01 secs (6 ticks @ 1000 us, 1 processor)
+ total time = 0.00 secs (4 ticks @ 1000 us, 1 processor)
- total time = 0.01 secs (8 ticks @ 1000 us, 1 processor)
+ total time = 0.01 secs (6 ticks @ 1000 us, 1 processor)
- total time = 0.01 secs (14 ticks @ 1000 us, 1 processor)
+ total time = 0.02 secs (16 ticks @ 1000 us, 1 processor)
- total time = 0.06 secs (55 ticks @ 1000 us, 1 processor)
+ total time = 0.04 secs (44 ticks @ 1000 us, 1 processor)
total time = 0.01 secs (8 ticks @ 1000 us, 1 processor)
- total time = 0.02 secs (18 ticks @ 1000 us, 1 processor)
+ total time = 0.01 secs (15 ticks @ 1000 us, 1 processor)
- total time = 0.00 secs (4 ticks @ 1000 us, 1 processor)
+ total time = 0.00 secs (3 ticks @ 1000 us, 1 processor)
- total time = 0.01 secs (9 ticks @ 1000 us, 1 processor)
+ total time = 0.01 secs (7 ticks @ 1000 us, 1 processor)
- total time = 0.12 secs (123 ticks @ 1000 us, 1 processor)
+ total time = 0.12 secs (121 ticks @ 1000 us, 1 processor)
- total time = 0.00 secs (4 ticks @ 1000 us, 1 processor)
+ total time = 0.01 secs (5 ticks @ 1000 us, 1 processor)
- total time = 0.29 secs (289 ticks @ 1000 us, 1 processor)
+ total time = 0.28 secs (277 ticks @ 1000 us, 1 processor)
- total time = 0.01 secs (8 ticks @ 1000 us, 1 processor)
+ total time = 0.00 secs (4 ticks @ 1000 us, 1 processor)
- total time = 0.18 secs (182 ticks @ 1000 us, 1 processor)
+ total time = 0.17 secs (169 ticks @ 1000 us, 1 processor)
- total time = 0.02 secs (23 ticks @ 1000 us, 1 processor)
+ total time = 0.02 secs (21 ticks @ 1000 us, 1 processor)
- total time = 0.10 secs (95 ticks @ 1000 us, 1 processor)
+ total time = 0.09 secs (92 ticks @ 1000 us, 1 processor)
- total time = 0.01 secs (5 ticks @ 1000 us, 1 processor)
+ total time = 0.01 secs (7 ticks @ 1000 us, 1 processor)
- total time = 0.01 secs (7 ticks @ 1000 us, 1 processor)
+ total time = 0.01 secs (9 ticks @ 1000 us, 1 processor)
- total time = 0.01 secs (12 ticks @ 1000 us, 1 processor)
+ total time = 0.01 secs (13 ticks @ 1000 us, 1 processor)
- total time = 0.01 secs (9 ticks @ 1000 us, 1 processor)
+ total time = 0.01 secs (6 ticks @ 1000 us, 1 processor)
- total time = 0.01 secs (8 ticks @ 1000 us, 1 processor)
+ total time = 0.01 secs (6 ticks @ 1000 us, 1 processor)
- total time = 0.02 secs (22 ticks @ 1000 us, 1 processor)
+ total time = 0.02 secs (21 ticks @ 1000 us, 1 processor)
- total time = 0.01 secs (6 ticks @ 1000 us, 1 processor)
+ total time = 0.00 secs (3 ticks @ 1000 us, 1 processor)
- total time = 0.05 secs (51 ticks @ 1000 us, 1 processor)
+ total time = 0.06 secs (59 ticks @ 1000 us, 1 processor)
- total time = 0.01 secs (5 ticks @ 1000 us, 1 processor)
+ total time = 0.00 secs (4 ticks @ 1000 us, 1 processor)
- total time = 0.01 secs (5 ticks @ 1000 us, 1 processor)
+ total time = 0.00 secs (4 ticks @ 1000 us, 1 processor)
- total time = 0.01 secs (9 ticks @ 1000 us, 1 processor)
+ total time = 0.01 secs (6 ticks @ 1000 us, 1 processor)
- total time = 0.00 secs (4 ticks @ 1000 us, 1 processor)
+ total time = 0.01 secs (5 ticks @ 1000 us, 1 processor)
- total time = 0.01 secs (13 ticks @ 1000 us, 1 processor)
+ total time = 0.01 secs (15 ticks @ 1000 us, 1 processor)
- total time = 0.01 secs (6 ticks @ 1000 us, 1 processor)
+ total time = 0.01 secs (9 ticks @ 1000 us, 1 processor)
total time = 0.01 secs (7 ticks @ 1000 us, 1 processor)
- total time = 0.00 secs (4 ticks @ 1000 us, 1 processor)
+ total time = 0.01 secs (6 ticks @ 1000 us, 1 processor)
- total time = 0.01 secs (11 ticks @ 1000 us, 1 processor)
+ total time = 0.01 secs (13 ticks @ 1000 us, 1 processor)
- total time = 0.02 secs (21 ticks @ 1000 us, 1 processor)
+ total time = 0.03 secs (25 ticks @ 1000 us, 1 processor)
- total time = 0.02 secs (24 ticks @ 1000 us, 1 processor)
+ total time = 0.02 secs (23 ticks @ 1000 us, 1 processor)
- total time = 0.01 secs (6 ticks @ 1000 us, 1 processor)
+ total time = 0.01 secs (5 ticks @ 1000 us, 1 processor)
- total time = 0.01 secs (8 ticks @ 1000 us, 1 processor)
+ total time = 0.01 secs (7 ticks @ 1000 us, 1 processor)
- total time = 0.01 secs (12 ticks @ 1000 us, 1 processor)
+ total time = 0.01 secs (13 ticks @ 1000 us, 1 processor)
- total time = 0.03 secs (26 ticks @ 1000 us, 1 processor)
+ total time = 0.02 secs (21 ticks @ 1000 us, 1 processor)
- total time = 0.00 secs (2 ticks @ 1000 us, 1 processor)
+ total time = 0.00 secs (4 ticks @ 1000 us, 1 processor)
- total time = 0.00 secs (4 ticks @ 1000 us, 1 processor)
+ total time = 0.01 secs (7 ticks @ 1000 us, 1 processor)
- total time = 0.00 secs (2 ticks @ 1000 us, 1 processor)
+ total time = 0.00 secs (3 ticks @ 1000 us, 1 processor)
- total time = 0.05 secs (48 ticks @ 1000 us, 1 processor)
+ total time = 0.05 secs (46 ticks @ 1000 us, 1 processor)
- total time = 0.05 secs (52 ticks @ 1000 us, 1 processor)
+ total time = 0.04 secs (35 ticks @ 1000 us, 1 processor)
- total time = 0.03 secs (27 ticks @ 1000 us, 1 processor)
+ total time = 0.03 secs (34 ticks @ 1000 us, 1 processor)
- total time = 0.05 secs (52 ticks @ 1000 us, 1 processor)
+ total time = 0.07 secs (69 ticks @ 1000 us, 1 processor)
- total time = 0.02 secs (23 ticks @ 1000 us, 1 processor)
+ total time = 0.02 secs (22 ticks @ 1000 us, 1 processor) |
Thanks! this'll be merged via #646 where I started adding some scripts to compare performance between builds. |
Add profiling script and merge #643
Add profiling script and merge avh4#643
A one-character fix that speeds up the formatting of long literals
and drastically reduces memory usage.
A very simple example is
But with a list literal of many thousands of elements. The newline
forces all elements to get their own line. This is implemented as a
fold, but in practice this would often perform
By reversing the fold (from
foldl
tofoldr
) that turns intoWhich requires less traversals and memory.
There are test files and profiling outputs in the issue #642.