Fix O(n) check making accessing many elements O(n**2) #54

cben · 2017-06-21T15:02:29Z

Primed by seeing @jrafanie's O(N**2) fix #52,
I was byebug-ing through my code into new_ostruct_member and noticed another O(n**2):

self.methods.include?(...)  # .methods returns an array!

I say small n because it's number of direct subelements not whole tree size.
However I'm a bit shocked that it makes even tiny 1-element ROS 2.5 times faster(!) — probably because .methods includes 85 inherited methods even before new_ostruct_member adds any.
(note methods(false) below being fast for small hash but still O(n**2) for large hash).
This is perhaps why #52 did not see such a big improvement...

Microbenchmark (using same data @jrafanie used, large hash has 579 keys, ends up with 1737 methods):
https://github.com/cben/recursive-open-struct/blob/benchmark-methods-include/benchmark.rb
Essentially creates ROS from a hash, then 5 times doing .foo read access on all fields.

RUBY_VERSION: 2.4.1   # Ruby 2.3 ratios were similar

Comparison:
Small hash - methods(false).include?:    41199.2 i/s
Small hash - respond_to?:    38537.2 i/s - same-ish: difference falls within error
Small hash - method_defined?:    38314.1 i/s - same-ish: difference falls within error
Small hash - @sub_elements.include? (BUG?):    37701.9 i/s - same-ish: difference falls within error
Small hash - false (BUG?):    35085.9 i/s - same-ish: difference falls within error
Small hash - RecursiveOpenStruct methods.include?:    17507.2 i/s - 2.35x  slower
Small hash - subclassed methods.include?:    16578.2 i/s - 2.49x  slower

Comparison:
Large hash (strings) - @sub_elements.include? (BUG?):       99.3 i/s
Large hash (strings) - respond_to?:       93.5 i/s - same-ish: difference falls within error
Large hash (strings) - false (BUG?):       80.3 i/s - same-ish: difference falls within error
Large hash (strings) - method_defined?:       79.8 i/s - same-ish: difference falls within error
Large hash (strings) - methods(false).include?:       14.2 i/s - 7.00x  slower
Large hash (strings) - RecursiveOpenStruct methods.include?:       12.5 i/s - 7.97x  slower
Large hash (strings) - subclassed methods.include?:       12.2 i/s - 8.12x  slower

Comparison:
Large hash (symbols) - false (BUG?):      111.4 i/s
Large hash (symbols) - method_defined?:      106.0 i/s - same-ish: difference falls within error
Large hash (symbols) - respond_to?:       98.1 i/s - same-ish: difference falls within error
Large hash (symbols) - @sub_elements.include? (BUG?):       96.0 i/s - same-ish: difference falls within error
Large hash (symbols) - methods(false).include?:       14.8 i/s - 7.54x  slower
Large hash (symbols) - RecursiveOpenStruct methods.include?:       12.1 i/s - 9.22x  slower
Large hash (symbols) - subclassed methods.include?:        9.4 i/s - 11.86x  slower

This includes several ways I've tried to do this check, some buggy approximations.
I'm not sure this check it's needed at all (false is without) — AFAICT new_ostruct_member is only called from method_missing so can assume we don't have the method yet — but method_defined? is negligibly cheap.

I have some more ideas (though not enough time) for RecursiveOpenStruct and/or OpenStruct performance. Are there any benchmarks you use? cc @sferik

I even want to challenge the creation of methods at all vs just going through method_missing every time! For my microbenchmark, not creating methods is 1.5~2 faster than any of the above:
env HERESY=1 SHARE=1 ruby benchmark.rb => https://benchmark.fyi/y, https://benchmark.fyi/z, https://benchmark.fyi/A

Was making even tiny hashes slow because .methods includes 85 inherited methods even before new_ostruct_member adds any.

jrafanie · 2017-06-21T18:41:38Z

@cben interesting, another option is to cache methods defined on the singleton_class but that's probably not worth the headache. I think your change should work. I"m not sure why the test is failing though.

cben · 2017-06-21T20:43:48Z

Not sure about the Travis failure.
I'm geting the same failure

expected #<Enumerator: #<RecursiveOpenStruct foo="foo", bar=:bar>:each_pair> to match #<Enumerator: {:foo=>"foo", :bar=>:bar}:each_pair>

on master locally, but I see master passed on Travis (and bundler used same dependency versions as I get locally). I can't find info on what rspec does when you simply match enumerators.

aetherknight · 2017-06-22T04:51:55Z

I'm pretty sure that the Travis failure is due to RSpec 3.5.x vs 3.6.x changing something. I updated my Gemfile.lock locally, and the test suite now fails on master.

I don't currently have any benchmarks set up, but feel free to add something and/or suggest a good service for tracking how performance benchmarks change (although I wouldn't rely on metrics gathered from Travis' test runs for benchmarks due to potential variance in the underlying VM).

aetherknight · 2017-06-22T05:30:33Z

I just pushed a minor change to the failing spec to ROS's master that seems to address the issue. Will merge your change in a moment, and then I'll release 1.0.5.

Fix O(n) check making accessing many elements O(n**2)

aa009a0

Was making even tiny hashes slow because .methods includes 85 inherited methods even before new_ostruct_member adds any.

aetherknight merged commit 44ee5c1 into aetherknight:master Jun 22, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix O(n) check making accessing many elements O(n**2) #54

Fix O(n) check making accessing many elements O(n**2) #54

cben commented Jun 21, 2017 •

edited

Loading

jrafanie commented Jun 21, 2017

cben commented Jun 21, 2017

aetherknight commented Jun 22, 2017

aetherknight commented Jun 22, 2017

Fix O(n) check making accessing many elements O(n**2) #54

Fix O(n) check making accessing many elements O(n**2) #54

Conversation

cben commented Jun 21, 2017 • edited Loading

jrafanie commented Jun 21, 2017

cben commented Jun 21, 2017

aetherknight commented Jun 22, 2017

aetherknight commented Jun 22, 2017

cben commented Jun 21, 2017 •

edited

Loading