Improve serialization performance by avoiding instance_eval and method_missing #73

ivoanjo · 2017-05-17T15:12:03Z

Hello there!

I'm currently working on improving the performance of @Talkdesk systems using Oat for serialization, and after reporting an issue on JRuby relating to Oat performance in some setups (jruby/jruby#4596), @headius suggested that Oat performance was being limited by its usage of instance_eval and method_missing.

After other optimization work, time spent for serialization on our system is now quite a big slice of total request serving time, and thus I decided to finally look into improving Oat's performance by avoiding the two pitfalls mentioned above.

This PR includes two optimizations to Oat. Together, they improve Oat performance on the benchmark below by 41% on MRI and 252% on JRuby when compared to the master branch. See the individual commit messages for more specifics on each optimization.

Benchmark used (using benchmark-ips):

require 'benchmark/ips'
require 'oat'
require 'oat/adapters/hal'

class Foo
  attr_reader :foo, :bar, :baz

  def initialize(**values)
    values.each do |(key, value)|
      respond_to?(key) ? instance_variable_set(:"@#{key}", value) : raise
    end
  end
end

class FooSerializer < Oat::Serializer
  adapter Oat::Adapters::HAL

  schema do
    property :foo, item.foo
    property :bar, item.bar
    property :baz, item.baz
  end
end

puts "Running with #{RUBY_DESCRIPTION}"

FOO = Foo.new(foo: 1, bar: 2, baz: 'baz')

Benchmark.ips do |benchmark|
  benchmark.time = 30
  benchmark.warmup = 30
  benchmark.report('FooSerializer') { FooSerializer.new(FOO).to_hash }
  benchmark.compare!
end

I was unsure if it was useful to also commit the benchmark itself to Oat's repository, so if you think it's useful I can add it to this PR.

Thanks for Oat (it's really useful), and have an awesome day!

@headius

While investigating a performance issue in JRuby when using oat in a specific JRuby configuration, @headius pointed out that the way oat was using `instance_eval` to apply blocks would have an impact on the library performance, both on MRI and on JRuby. In a nutshell, this happens because while every object behaves as it has a singleton class, the object representing that singleton class is generated lazily, only when really needed. And this happens with `instance_eval` --- it causes the singleton class to be created both on MRI and on JRuby. This can be confirmed on MRI by using ObjectSpace: ```ruby >> require 'objspace' >> ObjectSpace.count_objects[:T_CLASS] => 988 >> Object.new.instance_eval { } => nil >> ObjectSpace.count_objects[:T_CLASS] => 989 >> Object.new.instance_eval { } => nil >> ObjectSpace.count_objects[:T_CLASS] => 990 ``` The same happens in JRuby, where a JVM heap visualization tool such as JVisualVM can observe a new `org.jruby.MetaClass` instance being created on every call to `instance_eval`. To avoid this overhead and improve oat's performance, this PR replaces the `instance_eval` we use to evaluate blocks in the context of a serializer class with methods that are created from the block. This means that when a `schema` is registered, we now generate a new private method from it (e.g. `schema_block_N`) and thus when we're serializing a given object we can just call this method (or methods, if several schemas were registered) instead of `instance_eval`. Because this method is defined on the serializer class, it behaves exactly as the `instance_eval`, but performance is considerably increased. Simple benchmark (using `benchmark-ips` gem): ```ruby require 'benchmark/ips' require 'oat' require 'oat/adapters/hal' class Foo attr_reader :foo, :bar, :baz def initialize(**values) values.each do |(key, value)| respond_to?(key) ? instance_variable_set(:"@#{key}", value) : raise end end end class FooSerializer < Oat::Serializer adapter Oat::Adapters::HAL schema do property :foo, item.foo property :bar, item.bar property :baz, item.baz end end puts "Running with #{RUBY_DESCRIPTION}" FOO = Foo.new(foo: 1, bar: 2, baz: 'baz') Benchmark.ips do |benchmark| benchmark.time = 30 benchmark.warmup = 30 benchmark.report('FooSerializer') { FooSerializer.new(FOO).to_hash } benchmark.compare! end ``` And results on my machine: * Before * MRI 2.4.1: `FooSerializer 264.516k (± 7.9%) i/s - 7.890M in 30.032221s` * JRuby 9.1.9.0 + invokedynamic: `FooSerializer 300.490k (±20.6%) i/s - 8.652M in 30.080662s` * After * MRI 2.4.1: `FooSerializer 326.571k (± 5.8%) i/s - 9.771M in 30.035173s` * JRuby 9.1.9.0 + invokedynamic: `FooSerializer 622.808k (± 3.8%) i/s - 18.680M in 30.039202s` E.g. this speeds up Oat by 1.23x on MRI and 2.07x on JRuby. See also jruby/jruby#4596

@headius

Following up the previous commit, another thing that @headius noted was that Oat's usage of `method_missing` to delegate from the serializer to the adapter was also impacting performance of the serialization process. This `method_missing` operation is used in the Serializer to dispatch the methods used inside a schema block (e.g. `property`, `link`, etc) to the currently selected adapter, which implements them. To improve performance, of this operation, we can "learn" which methods the adapter implementds on the first `method_missing` calls, and from them on we define a method on the serializer class that directly dispatches to the adapter, avoiding the `method_missing` call. Note that this works even if the adapter is changed, as we expect the changed adapter to still answer to the interface (and if it doesn't, we just get a `NoMethodError`). Using the same benchmark included in the previous commit, here is the impact of this optimization: * Before * MRI 2.4.1: `FooSerializer 326.571k (± 5.8%) i/s - 9.771M in 30.035173s` * JRuby 9.1.9.0 + invokedynamic: `FooSerializer 622.808k (± 3.8%) i/s - 18.680M in 30.039202s` * After * MRI 2.4.1: `FooSerializer 374.500k (± 4.1%) i/s - 11.246M in 30.084775s` * JRuby 9.1.9.0 + invokedynamic: `FooSerializer 1.058M (± 4.7%) i/s - 31.695M in 30.042851s` This speeds up Oat by 1.15x on MRI and 1.70x on JRuby. Note that these numbers are relative to the previous commit; relative to current master both improvements yield 1.41x on MRI and 3.52x on JRuby. See also jruby/jruby#4596

ismasan · 2017-06-06T13:23:06Z

This looks good to me, and tests pass so no reason not to merge it. I'll merge and cut a new minor release shortly. Thank you!

ismasan · 2017-06-06T13:26:02Z

Version 0.5.1 released!

ivoanjo · 2017-06-06T16:30:07Z

Awesome news, thanks! 🎉

Ivo Anjo added 2 commits May 17, 2017 15:33

ismasan merged commit a30653c into ismasan:master Jun 6, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve serialization performance by avoiding instance_eval and method_missing #73

Improve serialization performance by avoiding instance_eval and method_missing #73

ivoanjo commented May 17, 2017

ismasan commented Jun 6, 2017

ismasan commented Jun 6, 2017

ivoanjo commented Jun 6, 2017

Improve serialization performance by avoiding instance_eval and method_missing #73

Improve serialization performance by avoiding instance_eval and method_missing #73

Conversation

ivoanjo commented May 17, 2017

ismasan commented Jun 6, 2017

ismasan commented Jun 6, 2017

ivoanjo commented Jun 6, 2017