-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve serialization performance by avoiding instance_eval and method_missing #73
Merged
ismasan
merged 2 commits into
ismasan:master
from
Talkdesk:improve-serialization-performance
Jun 6, 2017
Merged
Improve serialization performance by avoiding instance_eval and method_missing #73
ismasan
merged 2 commits into
ismasan:master
from
Talkdesk:improve-serialization-performance
Jun 6, 2017
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
While investigating a performance issue in JRuby when using oat in a specific JRuby configuration, @headius pointed out that the way oat was using `instance_eval` to apply blocks would have an impact on the library performance, both on MRI and on JRuby. In a nutshell, this happens because while every object behaves as it has a singleton class, the object representing that singleton class is generated lazily, only when really needed. And this happens with `instance_eval` --- it causes the singleton class to be created both on MRI and on JRuby. This can be confirmed on MRI by using ObjectSpace: ```ruby >> require 'objspace' >> ObjectSpace.count_objects[:T_CLASS] => 988 >> Object.new.instance_eval { } => nil >> ObjectSpace.count_objects[:T_CLASS] => 989 >> Object.new.instance_eval { } => nil >> ObjectSpace.count_objects[:T_CLASS] => 990 ``` The same happens in JRuby, where a JVM heap visualization tool such as JVisualVM can observe a new `org.jruby.MetaClass` instance being created on every call to `instance_eval`. To avoid this overhead and improve oat's performance, this PR replaces the `instance_eval` we use to evaluate blocks in the context of a serializer class with methods that are created from the block. This means that when a `schema` is registered, we now generate a new private method from it (e.g. `schema_block_N`) and thus when we're serializing a given object we can just call this method (or methods, if several schemas were registered) instead of `instance_eval`. Because this method is defined on the serializer class, it behaves exactly as the `instance_eval`, but performance is considerably increased. Simple benchmark (using `benchmark-ips` gem): ```ruby require 'benchmark/ips' require 'oat' require 'oat/adapters/hal' class Foo attr_reader :foo, :bar, :baz def initialize(**values) values.each do |(key, value)| respond_to?(key) ? instance_variable_set(:"@#{key}", value) : raise end end end class FooSerializer < Oat::Serializer adapter Oat::Adapters::HAL schema do property :foo, item.foo property :bar, item.bar property :baz, item.baz end end puts "Running with #{RUBY_DESCRIPTION}" FOO = Foo.new(foo: 1, bar: 2, baz: 'baz') Benchmark.ips do |benchmark| benchmark.time = 30 benchmark.warmup = 30 benchmark.report('FooSerializer') { FooSerializer.new(FOO).to_hash } benchmark.compare! end ``` And results on my machine: * Before * MRI 2.4.1: `FooSerializer 264.516k (± 7.9%) i/s - 7.890M in 30.032221s` * JRuby 9.1.9.0 + invokedynamic: `FooSerializer 300.490k (±20.6%) i/s - 8.652M in 30.080662s` * After * MRI 2.4.1: `FooSerializer 326.571k (± 5.8%) i/s - 9.771M in 30.035173s` * JRuby 9.1.9.0 + invokedynamic: `FooSerializer 622.808k (± 3.8%) i/s - 18.680M in 30.039202s` E.g. this speeds up Oat by 1.23x on MRI and 2.07x on JRuby. See also jruby/jruby#4596
Following up the previous commit, another thing that @headius noted was that Oat's usage of `method_missing` to delegate from the serializer to the adapter was also impacting performance of the serialization process. This `method_missing` operation is used in the Serializer to dispatch the methods used inside a schema block (e.g. `property`, `link`, etc) to the currently selected adapter, which implements them. To improve performance, of this operation, we can "learn" which methods the adapter implementds on the first `method_missing` calls, and from them on we define a method on the serializer class that directly dispatches to the adapter, avoiding the `method_missing` call. Note that this works even if the adapter is changed, as we expect the changed adapter to still answer to the interface (and if it doesn't, we just get a `NoMethodError`). Using the same benchmark included in the previous commit, here is the impact of this optimization: * Before * MRI 2.4.1: `FooSerializer 326.571k (± 5.8%) i/s - 9.771M in 30.035173s` * JRuby 9.1.9.0 + invokedynamic: `FooSerializer 622.808k (± 3.8%) i/s - 18.680M in 30.039202s` * After * MRI 2.4.1: `FooSerializer 374.500k (± 4.1%) i/s - 11.246M in 30.084775s` * JRuby 9.1.9.0 + invokedynamic: `FooSerializer 1.058M (± 4.7%) i/s - 31.695M in 30.042851s` This speeds up Oat by 1.15x on MRI and 1.70x on JRuby. Note that these numbers are relative to the previous commit; relative to current master both improvements yield 1.41x on MRI and 3.52x on JRuby. See also jruby/jruby#4596
This looks good to me, and tests pass so no reason not to merge it. I'll merge and cut a new minor release shortly. Thank you! |
Version 0.5.1 released! |
Awesome news, thanks! 🎉 |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Hello there!
I'm currently working on improving the performance of @Talkdesk systems using Oat for serialization, and after reporting an issue on JRuby relating to Oat performance in some setups (jruby/jruby#4596), @headius suggested that Oat performance was being limited by its usage of
instance_eval
andmethod_missing
.After other optimization work, time spent for serialization on our system is now quite a big slice of total request serving time, and thus I decided to finally look into improving Oat's performance by avoiding the two pitfalls mentioned above.
This PR includes two optimizations to Oat. Together, they improve Oat performance on the benchmark below by 41% on MRI and 252% on JRuby when compared to the master branch. See the individual commit messages for more specifics on each optimization.
Benchmark used (using
benchmark-ips
):I was unsure if it was useful to also commit the benchmark itself to Oat's repository, so if you think it's useful I can add it to this PR.
Thanks for Oat (it's really useful), and have an awesome day!