Make :m field of %Deep{} lazy #20

Draft
wants to merge 2 commits into
base: master

Conversation

turion (Contributor) commented Jun 17, 2021

Might solve #16. To be sure we would need some benchmarking though.

turion (Contributor, Author) commented Jun 17, 2021

Benchmark

##### With input medium sequence cons #####
Name             ips        average  deviation         median         99th %
concat    1633512.07      612.18 ns  ±6250.56%         390 ns        1278 ns
cons      1356663.21      737.10 ns  ±6075.50%         398 ns        1513 ns
snoc      1336867.93      748.02 ns  ±5680.85%         393 ns        1517 ns
view_l        211.35  4731583.58 ns    ±25.04%     4657417 ns  8711033.36 ns
view_r        121.69  8217508.18 ns    ±29.79%     7944275 ns 14897952.17 ns

Comparison: 
concat    1633512.07
cons      1356663.21 - 1.20x slower +124.92 ns
snoc      1336867.93 - 1.22x slower +135.84 ns
view_l        211.35 - 7729.10x slower +4730971.41 ns
view_r        121.69 - 13423.40x slower +8216896.00 ns

##### With input medium sequence snoc #####
Name             ips        average  deviation         median         99th %
concat    1658758.68      602.86 ns  ±6138.52%         384 ns        1286 ns
cons      1372052.70      728.83 ns  ±5888.26%         386 ns        1525 ns
snoc      1353326.46      738.92 ns  ±6297.00%         402 ns        1463 ns
view_r        209.92  4763735.19 ns    ±28.59%     4669014 ns 10267990.61 ns
view_l        126.15  7927187.21 ns    ±26.33%     7955136 ns 14288574.92 ns

Comparison: 
concat    1658758.68
cons      1372052.70 - 1.21x slower +125.97 ns
snoc      1353326.46 - 1.23x slower +136.06 ns
view_r        209.92 - 7901.89x slower +4763132.33 ns
view_l        126.15 - 13149.29x slower +7926584.35 ns

##### With input small sequence cons #####
Name             ips        average  deviation         median         99th %
concat    1565586.20      638.74 ns  ±6802.87%         381 ns        1455 ns
snoc      1529164.93      653.95 ns  ±5712.19%         380 ns        1477 ns
cons      1490111.15      671.09 ns  ±5350.95%         389 ns        1484 ns
view_l      26069.53    38358.96 ns   ±183.09%       27402 ns   299380.13 ns
view_r      17796.57    56190.61 ns   ±163.49%       39418 ns   546777.94 ns

Comparison: 
concat    1565586.20
snoc      1529164.93 - 1.02x slower +15.21 ns
cons      1490111.15 - 1.05x slower +32.35 ns
view_l      26069.53 - 60.05x slower +37720.22 ns
view_r      17796.57 - 87.97x slower +55551.87 ns

##### With input small sequence snoc #####
Name             ips        average  deviation         median         99th %
concat    1624972.73      615.39 ns  ±5881.09%         378 ns        1397 ns
cons      1564998.51      638.98 ns  ±5527.08%         377 ns        1459 ns
snoc      1490560.76      670.89 ns  ±5155.61%         397 ns        1478 ns
view_r      25830.92    38713.30 ns   ±193.76%       26927 ns   293055.80 ns
view_l      18655.21    53604.34 ns   ±156.64%       38502 ns   419600.48 ns

Comparison: 
concat    1624972.73
cons      1564998.51 - 1.04x slower +23.58 ns
snoc      1490560.76 - 1.09x slower +55.49 ns
view_r      25830.92 - 62.91x slower +38097.90 ns
view_l      18655.21 - 87.11x slower +52988.94 ns

##### With input tiny sequence cons #####
Name             ips        average  deviation         median         99th %
view_l    4509885.26      221.74 ns ±13771.53%         129 ns         587 ns
view_r    4269991.89      234.19 ns ±12514.81%         131 ns         625 ns
concat     828859.91     1206.48 ns  ±2536.01%         905 ns        3066 ns
cons       756566.86     1321.76 ns  ±2529.06%         880 ns        2626 ns
snoc       681658.40     1467.01 ns  ±2464.29%         912 ns        2881 ns

Comparison: 
view_l    4509885.26
view_r    4269991.89 - 1.06x slower +12.46 ns
concat     828859.91 - 5.44x slower +984.74 ns
cons       756566.86 - 5.96x slower +1100.03 ns
snoc       681658.40 - 6.62x slower +1245.28 ns

##### With input tiny sequence snoc #####
Name             ips        average  deviation         median         99th %
view_r    4516070.30      221.43 ns ±12445.88%         130 ns         583 ns
view_l    4469376.05      223.74 ns ±12223.51%         128 ns         619 ns
concat     802486.74     1246.13 ns  ±2320.99%         901 ns        2864 ns
snoc       712624.24     1403.26 ns  ±2583.45%         900 ns        2772 ns
cons       704427.85     1419.59 ns  ±2458.28%         891 ns        2925 ns

This suggests that I now have O(n log(n)) runtime for the view functions and O(1) for concat! Something is definitely wrong.

turion marked this pull request as draft June 17, 2021 15:08

turion (Contributor, Author) commented Jun 17, 2021

Many functions are too lazy now. I need to add back some strictness in certain places. Just like Haskell performance debugging :D
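
To make the lazy-versus-strict trade-off concrete, here is a minimal sketch of a struct whose middle field holds a zero-arity function (a thunk) instead of a built value. `ThunkSketch`, `deep_lazy/3` and `force/1` are hypothetical names for illustration and are not part of this library; the real `%Deep{}` struct carries more fields.

```elixir
defmodule ThunkSketch do
  # Hypothetical stand-in for %Deep{}: `m` holds a zero-arity function
  # (a thunk) instead of an already-built middle tree.
  defstruct [:l, :m, :r]

  # Lazy constructor: stores the computation and does no work yet.
  def deep_lazy(l, build_m, r) when is_function(build_m, 0),
    do: %__MODULE__{l: l, m: build_m, r: r}

  # Forcing is the point where the deferred work actually runs.
  def force(%__MODULE__{m: m}), do: m.()
end

t = ThunkSketch.deep_lazy([1], fn -> Enum.to_list(2..5) end, [6])
is_function(t.m, 0)   # → true, the middle is still unevaluated
ThunkSketch.force(t)  # → [2, 3, 4, 5]
```

Adding strictness back then amounts to choosing which functions call `force/1` (or `m.()`) eagerly and which pass the thunk along untouched.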

  def view_l(%Empty{}), do: nil
  def view_l(%Single{monoid: mo, x: x}), do: {x, %Empty{monoid: mo}}

  def view_l(%Deep{l: %One{a: x}, m: m, r: sf}),
-   do: {x, rot_l(m, sf)}
+   do: {x, rot_l(m.(), sf)}
thalesmg (Owner) commented Jun 17, 2021

Since m is being forced here, this is still as strict as before, right?

I guess that to achieve O(1) in head/tail (once they are implemented), view_l would indeed need to return the thunk unevaluated, so that those functions do not force it. A caller would then force the thunk only when it needs a Tree back.

It looks like it'd have to be something like this for head/tail:

def view_l(%Deep{l: %One{a: x}, m: m, r: sf}),
    do: {x, fn -> rot_l(m, sf) end}

and force m inside each function as needed. So that head would be something like:

def head(t) do
  with {x, _thunk} <- view_l(t) do
    x
  end
end
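
The suggestion above can be sketched end to end on a toy structure. This is an illustration under stated assumptions, not the library's code: a plain list stands in for the finger tree, `LazyViewSketch` and `expensive_rest/1` are hypothetical, and the `:rest_built` message exists only so a caller can observe whether the deferred work ran.

```elixir
defmodule LazyViewSketch do
  # Toy stand-in for the finger tree: a plain list, not the library's types.

  # Returns nil for an empty sequence, otherwise the first element and a
  # thunk that builds the remainder only when it is called.
  def view_l([]), do: nil
  def view_l([x | rest]), do: {x, fn -> expensive_rest(rest) end}

  # head never calls the thunk, so it skips the cost of building the rest.
  def head(t) do
    with {x, _thunk} <- view_l(t), do: x
  end

  # tail forces the thunk in order to get a sequence back.
  def tail(t) do
    with {_x, thunk} <- view_l(t), do: thunk.()
  end

  # Placeholder for rot_l-style work. Sends a message to the calling
  # process so that a caller can check whether it ran.
  defp expensive_rest(rest) do
    send(self(), :rest_built)
    rest
  end
end

LazyViewSketch.head([1, 2, 3])  # → 1, and no :rest_built message is sent
LazyViewSketch.tail([1, 2, 3])  # → [2, 3], and one :rest_built message is sent
```

Both functions return nil on an empty sequence, because `with` passes the non-matching nil from `view_l/1` straight through.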

turion (Contributor, Author) replied:

Makes sense, that simplifies the whole story a lot more.

thalesmg (Owner) commented Jun 17, 2021

Also, looking at the numbers from #22, it seems that view_{l,r} became much slower?

Seq:

case                         avg. before    avg. after
medium view_l (from snoc)    6771.66 ns     7927187.21 ns
small view_l (from snoc)     3746.34 ns     53604.34 ns
tiny view_l (from snoc)      220.01 ns      223.74 ns

Looks like they are doing more work?

turion (Contributor, Author) commented Jul 8, 2021

> Also, looking at the numbers from #22, it seems that view_{l,r} became much slower?
>
> Looks like they are doing more work?

Yes, that's probably the overhead of the additional wrapping function.
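
One thing that might be worth ruling out alongside call overhead: Elixir anonymous functions do not cache their result, so a thunk stored in a field runs its body again every time it is forced. Whether that contributes to the numbers in this thread is an assumption and has not been measured here. The sketch below only demonstrates the language behaviour, and `ForceCountSketch` is a hypothetical helper.

```elixir
defmodule ForceCountSketch do
  # Builds a thunk that records each evaluation by sending a message
  # to the calling process, then returns the given value.
  def counting_thunk(value) do
    fn ->
      send(self(), :forced)
      value
    end
  end

  # Removes all :forced messages from the mailbox and returns how many
  # there were.
  def drain(count \\ 0) do
    receive do
      :forced -> drain(count + 1)
    after
      0 -> count
    end
  end
end

thunk = ForceCountSketch.counting_thunk(:middle_tree)
thunk.()
thunk.()
thunk.()
ForceCountSketch.drain()  # → 3, each force ran the body again
```

If repeated forcing turns out to matter, the forced value has to be stored back into the structure explicitly, since nothing memoizes it automatically.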
