Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve serialization performance by avoiding instance_eval and method_missing #73

Merged
merged 2 commits into from
Jun 6, 2017
Merged

Improve serialization performance by avoiding instance_eval and method_missing #73

merged 2 commits into from
Jun 6, 2017

Conversation

ivoanjo
Copy link
Contributor

@ivoanjo ivoanjo commented May 17, 2017

Hello there!

I'm currently working on improving the performance of @Talkdesk systems using Oat for serialization, and after reporting an issue on JRuby relating to Oat performance in some setups (jruby/jruby#4596), @headius suggested that Oat performance was being limited by its usage of instance_eval and method_missing.

After other optimization work, time spent for serialization on our system is now quite a big slice of total request serving time, and thus I decided to finally look into improving Oat's performance by avoiding the two pitfalls mentioned above.

This PR includes two optimizations to Oat. Together, they improve Oat performance on the benchmark below by 41% on MRI and 252% on JRuby when compared to the master branch. See the individual commit messages for more specifics on each optimization.

Benchmark used (using benchmark-ips):

require 'benchmark/ips'
require 'oat'
require 'oat/adapters/hal'

class Foo
  attr_reader :foo, :bar, :baz

  def initialize(**values)
    values.each do |(key, value)|
      respond_to?(key) ? instance_variable_set(:"@#{key}", value) : raise
    end
  end
end

class FooSerializer < Oat::Serializer
  adapter Oat::Adapters::HAL

  schema do
    property :foo, item.foo
    property :bar, item.bar
    property :baz, item.baz
  end
end

puts "Running with #{RUBY_DESCRIPTION}"

FOO = Foo.new(foo: 1, bar: 2, baz: 'baz')

Benchmark.ips do |benchmark|
  benchmark.time = 30
  benchmark.warmup = 30
  benchmark.report('FooSerializer') { FooSerializer.new(FOO).to_hash }
  benchmark.compare!
end

I was unsure if it was useful to also commit the benchmark itself to Oat's repository, so if you think it's useful I can add it to this PR.

Thanks for Oat (it's really useful), and have an awesome day!

Ivo Anjo added 2 commits May 17, 2017 15:33
While investigating a performance issue in JRuby when using oat in a
specific JRuby configuration, @headius pointed out that the way oat
was using `instance_eval` to apply blocks would have an impact on
the library performance, both on MRI and on JRuby.

In a nutshell, this happens because while every object behaves as
it has a singleton class, the object representing that singleton
class is generated lazily, only when really needed. And this happens
with `instance_eval` --- it causes the singleton class to be created
both on MRI and on JRuby.

This can be confirmed on MRI by using ObjectSpace:

```ruby
>> require 'objspace'
>> ObjectSpace.count_objects[:T_CLASS]
=> 988
>> Object.new.instance_eval { }
=> nil
>> ObjectSpace.count_objects[:T_CLASS]
=> 989
>> Object.new.instance_eval { }
=> nil
>> ObjectSpace.count_objects[:T_CLASS]
=> 990
```

The same happens in JRuby, where a JVM heap visualization tool such
as JVisualVM can observe a new `org.jruby.MetaClass` instance being
created on every call to `instance_eval`.

To avoid this overhead and improve oat's performance, this PR replaces
the `instance_eval` we use to evaluate blocks in the context of a
serializer class with methods that are created from the block.

This means that when a `schema` is registered, we now generate a new
private method from it (e.g. `schema_block_N`) and thus when we're
serializing a given object we can just call this method (or methods,
if several schemas were registered) instead of `instance_eval`.
Because this method is defined on the serializer class, it behaves
exactly as the `instance_eval`, but performance is considerably
increased.

Simple benchmark (using `benchmark-ips` gem):

```ruby
require 'benchmark/ips'
require 'oat'
require 'oat/adapters/hal'

class Foo
  attr_reader :foo, :bar, :baz

  def initialize(**values)
    values.each do |(key, value)|
      respond_to?(key) ? instance_variable_set(:"@#{key}", value) :
raise
    end
  end
end

class FooSerializer < Oat::Serializer
  adapter Oat::Adapters::HAL

  schema do
    property :foo, item.foo
    property :bar, item.bar
    property :baz, item.baz
  end
end

puts "Running with #{RUBY_DESCRIPTION}"

FOO = Foo.new(foo: 1, bar: 2, baz: 'baz')

Benchmark.ips do |benchmark|
  benchmark.time = 30
  benchmark.warmup = 30
  benchmark.report('FooSerializer') { FooSerializer.new(FOO).to_hash }
  benchmark.compare!
end
```

And results on my machine:

* Before
  * MRI 2.4.1:
    `FooSerializer    264.516k (± 7.9%) i/s -      7.890M in  30.032221s`
  * JRuby 9.1.9.0 + invokedynamic:
    `FooSerializer    300.490k (±20.6%) i/s -      8.652M in  30.080662s`

* After
  * MRI 2.4.1:
    `FooSerializer    326.571k (± 5.8%) i/s -      9.771M in  30.035173s`
  * JRuby 9.1.9.0 + invokedynamic:
    `FooSerializer    622.808k (± 3.8%) i/s -     18.680M in  30.039202s`

E.g. this speeds up Oat by 1.23x on MRI and 2.07x on JRuby.

See also jruby/jruby#4596
Following up the previous commit, another thing that @headius noted
was that Oat's usage of `method_missing` to delegate from the
serializer to the adapter was also impacting performance of the
serialization process.

This `method_missing` operation is used in the Serializer to dispatch
the methods used inside a schema block (e.g. `property`, `link`, etc)
to the currently selected adapter, which implements them.

To improve performance, of this operation, we can "learn" which
methods the adapter implementds on the first `method_missing` calls,
and from them on we define a method on the serializer class that
directly dispatches to the adapter, avoiding the `method_missing`
call.

Note that this works even if the adapter is changed, as we expect
the changed adapter to still answer to the interface (and if it
doesn't, we just get a `NoMethodError`).

Using the same benchmark included in the previous commit, here is
the impact of this optimization:

* Before
  * MRI 2.4.1:
    `FooSerializer    326.571k (± 5.8%) i/s -      9.771M in  30.035173s`
  * JRuby 9.1.9.0 + invokedynamic:
    `FooSerializer    622.808k (± 3.8%) i/s -     18.680M in  30.039202s`

* After
  * MRI 2.4.1:
    `FooSerializer    374.500k (± 4.1%) i/s -     11.246M in  30.084775s`
  * JRuby 9.1.9.0 + invokedynamic:
    `FooSerializer      1.058M (± 4.7%) i/s -     31.695M in  30.042851s`

This speeds up Oat by 1.15x on MRI and 1.70x on JRuby. Note that these
numbers are relative to the previous commit; relative to current master
both improvements yield 1.41x on MRI and 3.52x on JRuby.

See also jruby/jruby#4596
@ismasan
Copy link
Owner

ismasan commented Jun 6, 2017

This looks good to me, and tests pass so no reason not to merge it. I'll merge and cut a new minor release shortly. Thank you!

@ismasan ismasan merged commit a30653c into ismasan:master Jun 6, 2017
@ismasan
Copy link
Owner

ismasan commented Jun 6, 2017

Version 0.5.1 released!

@ivoanjo
Copy link
Contributor Author

ivoanjo commented Jun 6, 2017

Awesome news, thanks! 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants