Make :m field of %Deep{} lazy #20

turion · 2021-06-17T09:49:29Z

Might solve #16. To be sure we would need some benchmarking though.

turion · 2021-06-17T14:51:22Z

Benchmark

##### With input medium sequence cons #####
Name             ips        average  deviation         median         99th %
concat    1633512.07      612.18 ns  ±6250.56%         390 ns        1278 ns
cons      1356663.21      737.10 ns  ±6075.50%         398 ns        1513 ns
snoc      1336867.93      748.02 ns  ±5680.85%         393 ns        1517 ns
view_l        211.35  4731583.58 ns    ±25.04%     4657417 ns  8711033.36 ns
view_r        121.69  8217508.18 ns    ±29.79%     7944275 ns 14897952.17 ns

Comparison: 
concat    1633512.07
cons      1356663.21 - 1.20x slower +124.92 ns
snoc      1336867.93 - 1.22x slower +135.84 ns
view_l        211.35 - 7729.10x slower +4730971.41 ns
view_r        121.69 - 13423.40x slower +8216896.00 ns

##### With input medium sequence snoc #####
Name             ips        average  deviation         median         99th %
concat    1658758.68      602.86 ns  ±6138.52%         384 ns        1286 ns
cons      1372052.70      728.83 ns  ±5888.26%         386 ns        1525 ns
snoc      1353326.46      738.92 ns  ±6297.00%         402 ns        1463 ns
view_r        209.92  4763735.19 ns    ±28.59%     4669014 ns 10267990.61 ns
view_l        126.15  7927187.21 ns    ±26.33%     7955136 ns 14288574.92 ns

Comparison: 
concat    1658758.68
cons      1372052.70 - 1.21x slower +125.97 ns
snoc      1353326.46 - 1.23x slower +136.06 ns
view_r        209.92 - 7901.89x slower +4763132.33 ns
view_l        126.15 - 13149.29x slower +7926584.35 ns

##### With input small sequence cons #####
Name             ips        average  deviation         median         99th %
concat    1565586.20      638.74 ns  ±6802.87%         381 ns        1455 ns
snoc      1529164.93      653.95 ns  ±5712.19%         380 ns        1477 ns
cons      1490111.15      671.09 ns  ±5350.95%         389 ns        1484 ns
view_l      26069.53    38358.96 ns   ±183.09%       27402 ns   299380.13 ns
view_r      17796.57    56190.61 ns   ±163.49%       39418 ns   546777.94 ns

Comparison: 
concat    1565586.20
snoc      1529164.93 - 1.02x slower +15.21 ns
cons      1490111.15 - 1.05x slower +32.35 ns
view_l      26069.53 - 60.05x slower +37720.22 ns
view_r      17796.57 - 87.97x slower +55551.87 ns

##### With input small sequence snoc #####
Name             ips        average  deviation         median         99th %
concat    1624972.73      615.39 ns  ±5881.09%         378 ns        1397 ns
cons      1564998.51      638.98 ns  ±5527.08%         377 ns        1459 ns
snoc      1490560.76      670.89 ns  ±5155.61%         397 ns        1478 ns
view_r      25830.92    38713.30 ns   ±193.76%       26927 ns   293055.80 ns
view_l      18655.21    53604.34 ns   ±156.64%       38502 ns   419600.48 ns

Comparison: 
concat    1624972.73
cons      1564998.51 - 1.04x slower +23.58 ns
snoc      1490560.76 - 1.09x slower +55.49 ns
view_r      25830.92 - 62.91x slower +38097.90 ns
view_l      18655.21 - 87.11x slower +52988.94 ns

##### With input tiny sequence cons #####
Name             ips        average  deviation         median         99th %
view_l    4509885.26      221.74 ns ±13771.53%         129 ns         587 ns
view_r    4269991.89      234.19 ns ±12514.81%         131 ns         625 ns
concat     828859.91     1206.48 ns  ±2536.01%         905 ns        3066 ns
cons       756566.86     1321.76 ns  ±2529.06%         880 ns        2626 ns
snoc       681658.40     1467.01 ns  ±2464.29%         912 ns        2881 ns

Comparison: 
view_l    4509885.26
view_r    4269991.89 - 1.06x slower +12.46 ns
concat     828859.91 - 5.44x slower +984.74 ns
cons       756566.86 - 5.96x slower +1100.03 ns
snoc       681658.40 - 6.62x slower +1245.28 ns

##### With input tiny sequence snoc #####
Name             ips        average  deviation         median         99th %
view_r    4516070.30      221.43 ns ±12445.88%         130 ns         583 ns
view_l    4469376.05      223.74 ns ±12223.51%         128 ns         619 ns
concat     802486.74     1246.13 ns  ±2320.99%         901 ns        2864 ns
snoc       712624.24     1403.26 ns  ±2583.45%         900 ns        2772 ns
cons       704427.85     1419.59 ns  ±2458.28%         891 ns        2925 ns

This suggests that I now have O(n log(n)) runtime for the view functions and O(1) for concat! Something is definitely wrong.

turion · 2021-06-17T15:33:29Z

Many functions are too lazy now. I need to add back some strictness in certain places. Just like Haskell performance debugging :D

thalesmg · 2021-06-17T23:11:20Z

lib/hallux/internal/finger_tree.ex

  def view_l(%Empty{}), do: nil
  def view_l(%Single{monoid: mo, x: x}), do: {x, %Empty{monoid: mo}}

  def view_l(%Deep{l: %One{a: x}, m: m, r: sf}),
-    do: {x, rot_l(m, sf)}
+    do: {x, rot_l(m.(), sf)}


Since m is being forced here, this is still as strict as before, right?

I guess that to achieve O(1) head/tail (once they are implemented), view_l would indeed need to return the thunk unevaluated, so that those functions do not force it. A caller would then force the thunk only when it actually needs a tree back.

It looks like it'd have to be something like this for head/tail:

def view_l(%Deep{l: %One{a: x}, m: m, r: sf}), do: {x, fn -> rot_l(m, sf) end}

and force m inside each function as needed. So that head would be something like:

def head(t) do
  with {x, _thunk} <- view_l(t) do
    x
  end
end
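To make the suggestion concrete, here is a minimal, self-contained sketch of a view_l that returns an unevaluated thunk. It is not the Hallux implementation: `LazyViewSketch`, its simplified `Deep` struct (only `l` and `m`, with `m` yielding a plain list), `tail/1`, and `rebuild/1` are hypothetical stand-ins, and `nil` stands in for the empty tree.

```elixir
defmodule LazyViewSketch do
  # Hypothetical, heavily simplified stand-in for %Deep{}: `l` is a non-empty
  # list of prefix elements and `m` is a zero-arity function (thunk) that
  # yields the rest of the sequence as a plain list when called.
  defmodule Deep do
    defstruct [:l, :m]
  end

  # `nil` plays the role of the empty tree in this sketch.
  def view_l(nil), do: nil

  # Last prefix element: the remainder must come from `m`, but `m` is only
  # called inside the returned thunk, not here.
  def view_l(%Deep{l: [x], m: m}), do: {x, fn -> rebuild(m.()) end}

  # More prefix elements left: the remainder does not need `m` at all.
  def view_l(%Deep{l: [x | rest], m: m}), do: {x, fn -> %Deep{l: rest, m: m} end}

  # head/1 never calls the thunk, so it never forces `m`.
  def head(t) do
    with {x, _thunk} <- view_l(t), do: x
  end

  # tail/1 is where the thunk is finally forced to get a tree back.
  def tail(t) do
    with {_x, thunk} <- view_l(t), do: thunk.()
  end

  # Turns the forced middle back into a tree (or nil when nothing is left).
  defp rebuild([]), do: nil
  defp rebuild([x | rest]), do: %Deep{l: [x], m: fn -> rest end}
end
```

In this sketch, calling `LazyViewSketch.head/1` on a tree whose `m` has a visible side effect (for example sending a message to `self()`) leaves that side effect untriggered, while `LazyViewSketch.tail/1` triggers it.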

Makes sense, that simplifies the whole story a lot more.

thalesmg · 2021-06-17T23:29:57Z

Also, looking at the numbers from #22, it seems that view_{l,r} became much slower?

Seq:

| case | avg. before | avg. after |
|---|---|---|
| medium view_l (from snoc) | 6771.66 ns | 7927187.21 ns |
| small view_l (from snoc) | 3746.34 ns | 53604.34 ns |
| tiny view_l (from snoc) | 220.01 ns | 223.74 ns |

Looks like they are doing more work?

turion · 2021-07-08T11:23:14Z

> Also, looking at the numbers from #22, it seems that view_{l,r} became much slower?
>
> Looks like they are doing more work?

Yes, that's probably the overhead of the additional wrapping function.
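Separately from the cost of the extra function call, it may be worth checking whether the same thunk is forced more than once. An Elixir anonymous function is not memoized the way a lazy value in Haskell is, so every call re-runs its body. The sketch below only demonstrates that language behaviour; `ThunkCostSketch`, `counted/1`, and `drain/1` are hypothetical helpers, and nothing here measures the actual code in this PR.

```elixir
defmodule ThunkCostSketch do
  # Wraps a zero-arity function so that every call notifies the calling process.
  def counted(fun) do
    fn ->
      send(self(), :evaluated)
      fun.()
    end
  end

  # Counts the :evaluated messages currently sitting in the mailbox.
  def drain(count \\ 0) do
    receive do
      :evaluated -> drain(count + 1)
    after
      0 -> count
    end
  end
end

thunk = ThunkCostSketch.counted(fn -> Enum.sum(1..1_000) end)

# Forcing the same thunk twice runs the body twice; nothing is cached.
first = thunk.()
second = thunk.()
evaluations = ThunkCostSketch.drain()
```

Here `first` and `second` are equal, and `evaluations` is 2 rather than 1. If a lazy `:m` field were forced repeatedly along some path, the work inside it would be repeated in the same way.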

turion force-pushed the dev_lazy_deep branch from ad459c7 to dff2d28 Compare June 17, 2021 14:40

Manuel Bärenz added 2 commits June 17, 2021 17:05

Add basic benchmark

714efb4

Make :m field of %Deep{} lazy

41db201

turion force-pushed the dev_lazy_deep branch from dff2d28 to 41db201 Compare June 17, 2021 15:07

turion marked this pull request as draft June 17, 2021 15:08

thalesmg reviewed Jun 17, 2021
