Very high performance overhead on JRuby #640

guizmaii · 2018-11-27T14:09:17Z

Hi Datadog,

I'm migrating a Rails (4.2.10) app from MRI Ruby 2.3.8 and Unicorn to JRuby 9.1.17.0 and Puma.
The dd-trace-rb version used is 0.17.2

While doing some performance tests, I saw a big difference between my benchmarks in dev mode and prod mode with JRuby:

dev mode with JRuby/Puma: ~280req/s
prod mode with JRuby/Puma: ~100req/s .

I found that disabling the Datadog agent (by surrounding the Datadog.configure do |c| ... code in the Rails initializer with an if false) restored the prod mode performances (~320req/s) !

I also tested the differences in prod mode with MRI Ruby and Unicorn between Datadog agent enabled vs disabled.
While there's a difference, it's not as huge as with JRuby/ Puma:

prod mode, Datadog enabled: ~100req/s
prod mode, Datadog disabled: ~150req/s

So, let's summarise these numbers:

Datadog Agent	JRuby / Puma perfs (prod mode)	MRI Ruby / Unicorn (prod mode)
enabled	~100 req/s	~100 req/s
disabled	~320 req/s	~150 req/s

Another interesting thing to notice is that CPU usage with JRuby/Puma is not at 100% when dd-trace-rb is disabled while it's full 100% when enabled. 🤔

I should precise that the Datadog agent is not installed in my OS, so all calls from the dd-trace-rb in my app to this OS agent failed.
Maybe, this could be the reason of the problem. I don't know for now. I still have to test that.

Here's how I configure dd-trace-rb:

Gemfile

gem 'ddtrace'

myapp/config/initializers/datadog-tracer.rb

if ! (Rails.env.test? || Rails.env.development?)
  Datadog.configure do |c|
    c.tracer enabled: Rails.env.production? || Rails.env.staging? || Rails.env.testing?, env: Rails.env

    c.use :rails, service_name: "#{Settings.datadog.service_name}-rails", distributed_tracing: true

    c.use :mysql2, service_name: "#{Settings.datadog.service_name}-mysql2"
    c.use :aws, service_name: "#{Settings.datadog.service_name}-aws"
    c.use :grape, service_name: "#{Settings.datadog.service_name}-grape"
    c.use :redis, service_name: "#{Settings.datadog.service_name}-redis"
    c.use :resque, service_name: "#{Settings.datadog.service_name}-resque", workers: [ ... ]
  end
end

Here's the script I use to benchmark.
It uses this tool https://github.com/giltene/wrk2

#!/usr/bin/env bash

trap "exit" INT # https://stackoverflow.com/a/32146079/2431728

echo "WARM UP"

wrk2 -t2 -c100 -d1m -R2000 -L http://localhost:3000/route/endpoint.json
wrk2 -t2 -c100 -d1m -R2000 -L http://localhost:3000/route/endpoint.json
wrk2 -t2 -c100 -d1m -R2000 -L http://localhost:3000/route/endpoint.json
wrk2 -t2 -c100 -d1m -R2000 -L http://localhost:3000/route/endpoint.json
wrk2 -t2 -c100 -d1m -R2000 -L http://localhost:3000/route/endpoint.json
wrk2 -t2 -c100 -d1m -R2000 -L http://localhost:3000/route/endpoint.json

echo
echo "========="
echo "BENCH 1"
echo "========="
echo

wrk2 -t2 -c100 -d1m -R2000 -L http://localhost:3000/route/endpoint.json

echo
echo "========="
echo "BENCH 2"
echo "========="
echo

wrk2 -t2 -c100 -d1m -R2000 -L http://localhost:3000/route/endpoint.json

echo
echo "========="
echo "BENCH 3"
echo "========="
echo

wrk2 -t2 -c100 -d1m -R2000 -L http://localhost:3000/route/endpoint.json

Raw benchmark results for JRuby/Puma with dd-trace-rb disabled (sorry it's small)

Show data

screenshot 2018-11-27 12 31 28 - jruby_9 1 17 0_prod_datadog_disabled-1

screenshot 2018-11-27 12 31 22 - jruby_9 1 17 0_prod_datadog_disabled-2

screenshot 2018-11-27 12 30 45 - jruby_9 1 17 0_prod_datadog_disabled-3

Raw benchmark results for JRuby/Puma with dd-trace-rb enabled (sorry it's small)

Show data

screenshot 2018-11-27 12 43 19 - jruby_9 1 17 0_prod_datadog_enabled-1

screenshot 2018-11-27 12 43 13 - jruby_9 1 17 0_prod_datadog_enabled-2

screenshot 2018-11-27 12 43 08 - jruby_9 1 17 0_prod_datadog_enabled-3

Raw benchmark results for MRI/Unicorn with dd-trace-rb disabled (sorry it's small)

Show data

screenshot 2018-11-27 14 06 39 - mri2 3 8_prod_datadog_disabled-1

screenshot 2018-11-27 14 06 32 - mri2 3 8_prod_datadog_disabled-2

screenshot 2018-11-27 14 06 26 - mri2 3 8_prod_datadog_disabled-3

Raw benchmark results for MRI/Unicorn with dd-trace-rb enabled (sorry it's small)

Show data

screenshot 2018-11-27 13 50 17 - mri2 3 8_prod_datadog_enabled-1

screenshot 2018-11-27 13 50 12 - mri2 3 8_prod_datadog_enabled-2

screenshot 2018-11-27 13 50 06 - mri2 3 8_prod_datadog_enabled-3

The text was updated successfully, but these errors were encountered:

delner · 2018-11-27T17:28:27Z

Thanks for the report @guizmaii! Right now our JRuby support is experimental, and doesn't receive routine attention. Before we'd move it to full support, we'll want to tackle this issue and confirm it's meeting performance expectations.

That said, we're more than happy to tackle any specific performance issues you might be able to pinpoint in your own tests. Are there any particular integrations or combination of integrations that you've seen affect performance the most? Any issues you can zero in will most likely be investigated & fixed sooner, rather than later.

Feel free to open any GitHub issues for each specific issue you find, and I'll work with you to figure them out!

guizmaii · 2018-11-27T19:56:05Z

Hi @delner,

Thanks for your answer. I'll answer your question later. Now I want to give more results of my different tests.

As indicated in my initial message, one thing I still had to test was:

I should precise that the Datadog agent is not installed in my OS, so all calls from the dd-trace-rb in my app to this OS agent failed.
Maybe, this could be the reason of the problem. I don't know for now. I still have to test that.

So, I've done these tests and here are the results !

First, Here's how I launch a Datadog OS agent on my computer:

#!/usr/bin/env bash

API_KEY=<the_api_key>

docker run -d -v /var/run/docker.sock:/var/run/docker.sock:ro \
              -v /proc/:/host/proc/:ro \
              -v /sys/fs/cgroup/:/host/sys/fs/cgroup:ro \
              -e DD_API_KEY=${API_KEY} \
              -e DD_APM_ENABLED=true \
              -e DD_TAGS=env:jules_jruby_benchs \
              -e DD_DOGSTATSD_NON_LOCAL_TRAFFIC=true \
              -e SD_JMX_ENABLE=true \
              -p 8126:8126/tcp \
              -p 8125:8125/udp \
              --name datadog \
              datadog/agent:latest-jmx

I'm sure that this config is correct because I can see data in my Datadog dashboard 😉

Now the results of these tests with the OS agent launched !

Because I saw that with 16 Puma threads (the config I used in the initial message of this issue) the CPU wasn't used at 100% with dd-trace-rb disabled while the CPU usage is full 100% when enabled, I tested with different thread configurations.

Measurement	dd-trace-rb	JRuby/Puma 16 threads	JRuby/Puma 8 threads	JRuby/Puma 4 threads
throughput	enabled	~115 req/s	~100req/s	~80req/s
latency 50%	enabled	~22 s/req ❗️	~10 s/req ❗️	~42 ms/req ✅
latency 100%	enabled	~28 s/req ❗️	~17 s/req ❗️	~95 ms/req ✅
throughput	disabled	~320 req/s	~160req/s	not tested
latency 50%	disabled	~19 ms/req ✅	~19 ms/req ✅	not tested
latency 100%	disabled	~432 ms/req ✅	~92 ms/req ✅	not tested

We can see that performances here are comparable to the ones reported in the initial message of this issue.

It's important because I talked with the creators and maintainers of JRuby and their first guess was that the dd-trace-rb generated a lot of exceptions because of the missing OS agent.
Because exceptions are not cheap on JRuby, they thought that these exceptions could be the root of these performance problems.
(see here: https://twitter.com/tom_enebo/status/1067436117580951554)
So, what these new numbers say is that: no it's not potential exceptions generated by dd-trace-rb due to the missing OS agent that gives rise to these performance problems.

I'll continue to look for the reasons of these problems. Hopefully, I'll succeed to have an open-source project that reproduces these problems.

Raw benchmark results for JRuby/Puma 16 threads with dd-trace-rb enabled (sorry it's small)

Show data

screenshot 2018-11-27 18 52 53 - jruby9 1 17 0_16threads_prod_datadog_enable_and_os_agent_enabled-1

screenshot 2018-11-27 18 52 47 - jruby9 1 17 0_16threads_prod_datadog_enable_and_os_agent_enabled-2

screenshot 2018-11-27 18 52 41 - jruby9 1 17 0_16threads_prod_datadog_enable_and_os_agent_enabled-3

Raw benchmark results for JRuby/Puma 8 threads with dd-trace-rb enabled (sorry it's small)

Show data

screenshot 2018-11-27 19 06 17 - jruby9 1 17 0_8threads_prod_datadog_enable_and_os_agent_enabled-1

screenshot 2018-11-27 19 06 06 - jruby9 1 17 0_8threads_prod_datadog_enable_and_os_agent_enabled-2

screenshot 2018-11-27 19 06 00 - jruby9 1 17 0_8threads_prod_datadog_enable_and_os_agent_enable
d-3

Raw benchmark results for JRuby/Puma 4 threads with dd-trace-rb enabled (sorry it's small)

Show data

screenshot 2018-11-27 19 19 38 - jruby9 1 17 0_4threads_prod_datadog_enable_and_os_agent_enabled-1

screenshot 2018-11-27 19 19 32 - jruby9 1 17 0_4threads_prod_datadog_enable_and_os_agent_enabled-2

screenshot 2018-11-27 19 19 28 - jruby9 1 17 0_4threads_prod_datadog_enable_and_os_agent_enabled-3

Raw benchmark results for JRuby/Puma 16 threads with dd-trace-rb disabled (sorry it's small)

See first message of this issue: #640 (comment)

Raw benchmark results for JRuby/Puma 8 threads with dd-trace-rb disabled (sorry it's small)

Show data

screenshot 2018-11-27 19 55 36 - jruby9 1 17 0_8threads_prod_datadog_disable-1

screenshot 2018-11-27 19 55 30 - jruby9 1 17 0_8threads_prod_datadog_disable-2

screenshot 2018-11-27 19 55 25 - jruby9 1 17 0_8threads_prod_datadog_disable-3

headius · 2018-11-27T20:17:11Z

It's possible there are still exceptions or Kernel#caller being used. Did the backtrace stuff go away in --sample output?

These numbers are rough...other than exception traces, maybe excessive locking?

delner · 2018-11-27T20:19:36Z

We do use a lot of locks, and there might be some places where it might not be needed. Maybe this doesn't affect MRI as much as it affects JRuby, but it might be worth experimenting with that locking to see if it makes a difference in throughput while maintaining thread safety.

guizmaii · 2018-11-27T20:53:07Z

@headius

It's possible there are still exceptions or Kernel#caller being used. Did the backtrace stuff go away in --sample output?

I didn't have the time today to test that. I'll do it tomorrow ! 🙂

enebo · 2018-11-27T21:37:17Z

@guizmaii Just glancing at datadog and I see in case of a log message with severity of ERROR it will call caller to provide more context. This will have same overhead of generating an exception (cost of exceptions in JRuby are based on having to construct a backtrace -- which is same basic cost for caller). Not saying it is this but replacing c = caller with c=1 in the gem and rerunning may be a super quick test.

"Not saying it is this but replacing c = caller with c=1 in the gem and rerunning may be a super quick test"

guizmaii · 2018-11-28T18:59:08Z

Here are some answers I can give you after my day of work:

@enebo "replacing c = caller with c=1" doesn't change anything

@delner It seems that it's the :rails integration which is problematic.

When I remove the c.use :rails, service_name: "#{Settings.datadog.service_name}-rails", distributed_tracing: true line from my config, performances come back.

I have an open-source project where I try to reproduce the problem here: https://github.com/guizmaii/jruby_rails_datadog

For now, it seems that I'm not able to reproduce it. The example is maybe too simplistic. I still have to work on that.

I'm aslo experimenting https://github.com/jvm-profiling-tools/async-profiler to profile the execution with the following script:

#!/usr/bin/env bash

trap "exit" INT # https://stackoverflow.com/a/32146079/2431728

timestamp() {
  date +%Y-%m-%d_%H-%M-%S
}

echo "WARM UP"

wrk2 -t2 -c100 -d1m -R2000 -L http://localhost:3000/route/endpoint.json
wrk2 -t2 -c100 -d1m -R2000 -L http://localhost:3000/route/endpoint.json
wrk2 -t2 -c100 -d1m -R2000 -L http://localhost:3000/route/endpoint.json
wrk2 -t2 -c100 -d1m -R2000 -L http://localhost:3000/route/endpoint.json
wrk2 -t2 -c100 -d1m -R2000 -L http://localhost:3000/route/endpoint.json
wrk2 -t2 -c100 -d1m -R2000 -L http://localhost:3000/route/endpoint.json

echo
echo "*** Launch the profiler"

~/tools/async-profiler-1.4-macos-x64/profiler.sh start -b 50000000 $(cat /tmp/web_server.pid)

echo
echo "*** Profiler status:"

~/tools/async-profiler-1.4-macos-x64/profiler.sh status $(cat /tmp/web_server.pid)

echo
echo "========="
echo "BENCH 1"
echo "========="
echo

wrk2 -t2 -c100 -d1m -R2000 -L http://localhost:3000/route/endpoint.json

echo
echo "========="
echo "BENCH 2"
echo "========="
echo

wrk2 -t2 -c100 -d1m -R2000 -L http://localhost:3000/route/endpoint.json

echo
echo "========="
echo "BENCH 3"
echo "========="
echo

echo
echo "*** stop the profiler"

~/tools/async-profiler-1.4-macos-x64/profiler.sh stop -o summary,flat=200 -f traces-$(timestamp).txt  $(cat /tmp/web_server.pid)

delner · 2018-11-28T20:54:41Z

Okay good to know. The Rails integration itself consists of a few different distinct parts, which we might want to activate independently from one another, to see if performance issues comes from one particular component or not.

You should be able to activate Rack in one test using use :rack, and ActiveRecord in another using use :active_record. For the components that are internal to Rails, like template rendering and controllers however, you'd have to probably fork ddtrace and modify our Rails patcher so it only activates each of those parts independently from the other.

If we can figure out how each component contributes to the performance problems, it might help us identify some specific area we can improve upon.

guizmaii · 2018-11-29T11:11:46Z

Hey,

Thanks for the info @delner. I'll work on that today !

@enebo In fact, you were right about exceptions.

Here's the result of a profiled execution with dd-trace-rb enabled:

Show data

--- Execution profile ---
Total samples:         37271
Non-Java:              146 (0.39%)
Unknown (native):      31 (0.08%)
Unknown (Java):        24 (0.06%)
Skipped:               1 (0.00%)

Frame buffer usage:    18.6893%

          ns  percent  samples  top
  ----------  -------  -------  ---
104450000000   28.03%    10445  java.net.PlainSocketImpl.socketAvailable
 82630000000   22.17%     8263  java.lang.Throwable.getStackTraceElement
 73370000000   19.69%     7337  com.kenai.jffi.Foreign.invokeN3O1
 26690000000    7.16%     2669  jnr.posix.UnixLibC$jnr$ffi$0.flock
 10060000000    2.70%     1006  sun.misc.Unsafe.park
  9650000000    2.59%      965  sun.misc.Unsafe.unpark
  5830000000    1.56%      583  java.lang.Throwable.fillInStackTrace
  3200000000    0.86%      320  sun.nio.ch.FileDispatcherImpl.write0
  3020000000    0.81%      302  java.io.UnixFileSystem.getBooleanAttributes0
  2580000000    0.69%      258  java.util.concurrent.ConcurrentHashMap.get
  2330000000    0.63%      233  java.util.HashMap.getNode
  1940000000    0.52%      194  java.io.FileInputStream.readBytes
  1910000000    0.51%      191  org.jruby.runtime.callsite.CachingCallSite.call
  1600000000    0.43%      160  org.jruby.runtime.backtrace.BacktraceData.packagedFilenameFromElement
  1530000000    0.41%      153  org.jruby.runtime.callsite.CachingCallSite.call
  1450000000    0.39%      145  java.net.SocketInputStream.socketRead0
  1260000000    0.34%      126  java.lang.invoke.MethodHandle.invokeBasic
  1030000000    0.28%      103  org.jruby.util.JavaNameMangler.demangleMethodNameInternal
  1000000000    0.27%      100  sun.nio.ch.KQueueArrayWrapper.register0
   850000000    0.23%       85  org.jruby.RubyModule.searchWithCache
   710000000    0.19%       71  org.jruby.util.StringSupport.split
   650000000    0.17%       65  java.lang.invoke.LambdaForm$MH.1161667116.invokeExact_MT
   610000000    0.16%       61  org.jruby.runtime.backtrace.BacktraceData.constructBacktrace
   590000000    0.16%       59  org.jruby.RubyHash.internalGetEntry
   560000000    0.15%       56  org.jruby.internal.runtime.methods.CompiledIRMethod.call
   540000000    0.14%       54  java.net.SocketOutputStream.socketWrite0
   510000000    0.14%       51  org.jruby.internal.runtime.methods.MixedModeIRMethod.call
   420000000    0.11%       42  org.jruby.ir.runtime.IRRuntimeHelpers.getVariableWithAccessor
   410000000    0.11%       41  com.kenai.jffi.Foreign.invokeN2O1
   400000000    0.11%       40  org.jruby.internal.runtime.methods.AttrReaderMethod.call
   370000000    0.10%       37  org.jruby.runtime.callsite.CachingCallSite.call
   330000000    0.09%       33  java.lang.invoke.LambdaForm$MH.2011482127.invokeExact_MT
   320000000    0.09%       32  jshort_disjoint_arraycopy
   290000000    0.08%       29  org.jruby.internal.runtime.methods.DynamicMethod.call
   280000000    0.08%       28  org.jruby.internal.runtime.methods.DynamicMethod.call
   260000000    0.07%       26  java.lang.ThreadLocal$ThreadLocalMap.expungeStaleEntry
   240000000    0.06%       24  java.lang.invoke.LambdaForm$MH.673068808.invokeExact_MT
   230000000    0.06%       23  org.jruby.RubyBasicObject.getMetaClass
   230000000    0.06%       23  org.jruby.internal.runtime.methods.MixedModeIRMethod.call
   230000000    0.06%       23  java.lang.invoke.LambdaForm$MH.532445947.invokeExact_MT
   230000000    0.06%       23  java.lang.ThreadLocal$ThreadLocalMap.cleanSomeSlots
   220000000    0.06%       22  org.jruby.runtime.ivars.VariableAccessorVar0.get
   220000000    0.06%       22  java.lang.System.identityHashCode
   210000000    0.06%       21  itable stub
   200000000    0.05%       20  vtable stub
   200000000    0.05%       20  org.jruby.internal.runtime.methods.DynamicMethod.isCallableFrom
   200000000    0.05%       20  org.jruby.IncludedModuleWrapper.isSame
   200000000    0.05%       20  jlong_disjoint_arraycopy
   190000000    0.05%       19  org.jruby.util.ArraySupport.copy
   180000000    0.05%       18  java.lang.String.valueOf
   170000000    0.05%       17  vtable stub
   170000000    0.05%       17  org.jruby.util.ByteList.ensure
   170000000    0.05%       17  org.jruby.runtime.MixedModeIRBlockBody.setEvalType
   160000000    0.04%       16  org.jruby.RubyModule.searchWithCache
   160000000    0.04%       16  org.jruby.RubyHash.internalPutNoResize
   150000000    0.04%       15  org.jruby.parser.StaticScope.acquireConstructor
   150000000    0.04%       15  org.jruby.runtime.ivars.VariableAccessorVar1.get
   150000000    0.04%       15  vtable stub
   140000000    0.04%       14  org.jruby.runtime.backtrace.RubyStackTraceElement.asStackTraceElement
   140000000    0.04%       14  jbyte_disjoint_arraycopy
   140000000    0.04%       14  vtable stub
   140000000    0.04%       14  org.jruby.RubyBasicObject.eql
   130000000    0.03%       13  org.jruby.RubyModule.hasModuleInHierarchy
   130000000    0.03%       13  org.jruby.internal.runtime.ThreadService.getCurrentContext
   130000000    0.03%       13  org.jruby.internal.runtime.methods.CompiledIRMethod.call
   130000000    0.03%       13  java.lang.ThreadLocal.setInitialValue
   130000000    0.03%       13  org.jruby.util.ByteList.equal
   130000000    0.03%       13  itable stub
   130000000    0.03%       13  org.jruby.internal.runtime.methods.CompiledIRMethod.call
   120000000    0.03%       12  org.jruby.runtime.callsite.CachingCallSite.call
   120000000    0.03%       12  org.jruby.internal.runtime.methods.CompiledIRMethod.call
   120000000    0.03%       12  sun.nio.cs.UTF_8$Encoder.encodeLoop
   120000000    0.03%       12  org.jruby.ir.runtime.IRRuntimeHelpers.constructHashFromArray
   120000000    0.03%       12  java.lang.invoke.DirectMethodHandle.allocateInstance
   110000000    0.03%       11  java.lang.Throwable.getOurStackTrace
   110000000    0.03%       11  org.jruby.runtime.backtrace.BacktraceData.getBacktrace
   110000000    0.03%       11  org.jruby.RubyModule.searchMethodInner
   110000000    0.03%       11  org.jruby.internal.runtime.methods.MixedModeIRMethod.call
   110000000    0.03%       11  vtable stub
   100000000    0.03%       10  org.jruby.runtime.backtrace.FrameType.getInterpreterFrame
   100000000    0.03%       10  sun.nio.cs.UTF_8$Decoder.decodeArrayLoop
   100000000    0.03%       10  org.jruby.runtime.DynamicScope.getValueOrNil
   100000000    0.03%       10  java.lang.invoke.LambdaForm$DMH.1262481834.invokeStatic_L8_L
   100000000    0.03%       10  org.jruby.internal.runtime.methods.MixedModeIRMethod.call
   100000000    0.03%       10  org.jruby.runtime.callsite.CachingCallSite.call
   100000000    0.03%       10  org.jruby.runtime.backtrace.BacktraceData.getBoundMethodName
   100000000    0.03%       10  org.jruby.Ruby$3.setValue
   100000000    0.03%       10  org.jruby.RubyHash$INVOKER$i$1$0$op_aref_DBG.call
   100000000    0.03%       10  itable stub
   100000000    0.03%       10  Users.jules.$_dot_rvm.gems.jruby_minus_9_dot_1_dot_17_dot_0.gems.ddtrace_minus_0_dot_17_dot_2.lib.ddtrace.span.RUBY$method$initialize$0
    90000000    0.02%        9  org.jruby.RubyBasicObject.variableTableContains
    90000000    0.02%        9  org.jruby.runtime.callsite.CachingCallSite.call
    90000000    0.02%        9  org.jruby.ext.thread.Mutex.synchronize
    90000000    0.02%        9  java.util.concurrent.ConcurrentHashMap.transfer
    90000000    0.02%        9  org.jruby.IncludedModule.getNonIncludedClass
    80000000    0.02%        8  org.jruby.runtime.ivars.VariableTableManager.syncVariables
    80000000    0.02%        8  org.jruby.util.io.EncodingUtils.encCrStrBufCat
    80000000    0.02%        8  org.jruby.RubyRegexp$SearchMatchTask.run
    80000000    0.02%        8  vtable stub
    80000000    0.02%        8  sun.nio.ch.FileDispatcherImpl.read0
    80000000    0.02%        8  vtable stub
    80000000    0.02%        8  org.jruby.runtime.callsite.CachingCallSite.cacheAndCall
    80000000    0.02%        8  org.joni.ByteCodeMachine.executeSb
    80000000    0.02%        8  org.jruby.runtime.BlockBody.yield
    80000000    0.02%        8  org.jruby.RubyHash.internalPut
    70000000    0.02%        7  org.jruby.runtime.Signature.checkArity
    70000000    0.02%        7  org.jruby.RubyArray.aryDup
    70000000    0.02%        7  org.jruby.util.JavaNameMangler.decodeMethodForBacktrace
    70000000    0.02%        7  org.jruby.RubyHash.internalCopyTable
    70000000    0.02%        7  org.jruby.runtime.CompiledIRBlockBody.canCallDirect
    70000000    0.02%        7  org.jruby.runtime.ivars.VariableAccessor.getVariable
    70000000    0.02%        7  org.jruby.internal.runtime.methods.DynamicMethod.call
    70000000    0.02%        7  sun.nio.cs.UTF_8$Encoder.encodeArrayLoop
    70000000    0.02%        7  org.jruby.RubySymbol$SymbolTable.getSymbol
    70000000    0.02%        7  org.jruby.runtime.DynamicScope.getValueDepthZeroOrNil
    70000000    0.02%        7  org.jruby.RubyObject$11.allocate
    70000000    0.02%        7  java.util.Vector.grow
    70000000    0.02%        7  org.jruby.runtime.ivars.VariableAccessorVar2.get
    70000000    0.02%        7  org.jruby.RubyModule.newMethod
    70000000    0.02%        7  org.jruby.RubyHash.fastASetCheckString
    70000000    0.02%        7  vtable stub
    70000000    0.02%        7  vtable stub
    60000000    0.02%        6  org.jruby.runtime.callsite.CachingCallSite.call
    60000000    0.02%        6  org.jruby.RubyModule$INVOKER$i$1$0$op_eqq_DBG.call
    60000000    0.02%        6  org.jruby.runtime.Block.call
    60000000    0.02%        6  java.lang.ThreadLocal$ThreadLocalMap.getEntryAfterMiss
    60000000    0.02%        6  org.jruby.RubyHash.op_asetSmallForString
    60000000    0.02%        6  org.jruby.RubyArray.eltOk
    60000000    0.02%        6  java.lang.invoke.LambdaForm$DMH.1161323211.invokeStatic_L6_L
    60000000    0.02%        6  org.jruby.specialized.RubyArrayOneObject.eltInternal
    60000000    0.02%        6  org.jruby.RubyClass.allocate
    60000000    0.02%        6  org.jruby.util.Sprintf.rubySprintfToBuffer
    60000000    0.02%        6  org.jruby.runtime.IRBlockBody.setEvalType
    60000000    0.02%        6  Users.jules.$_dot_rvm.gems.jruby_minus_9_dot_1_dot_17_dot_0.gems.ddtrace_minus_0_dot_17_dot_2.lib.ddtrace.contrib.active_record.utils.RUBY$method$connection_by_id$0
    60000000    0.02%        6  org.jruby.ir.runtime.IRRuntimeHelpers.extractKwargsHash
    60000000    0.02%        6  org.jruby.RubyHash.internalGet
    60000000    0.02%        6  org.jruby.RubyArray.eltInternal
    60000000    0.02%        6  org.jruby.runtime.ivars.VariableAccessor.get
    60000000    0.02%        6  java.nio.HeapCharBuffer.toString
    60000000    0.02%        6  java.lang.invoke.LambdaForm$DMH.1563515853.invokeStatic_L8_L
    60000000    0.02%        6  java.lang.invoke.LambdaForm$BMH.1282473384.reinvoke
    60000000    0.02%        6  org.jruby.runtime.Block.setEvalType
    60000000    0.02%        6  org.jruby.runtime.CallBlock.doYield
    60000000    0.02%        6  org.jruby.internal.runtime.methods.DynamicMethod.call
    50000000    0.01%        5  org.jruby.RubyModule$KindOf.isKindOf
    50000000    0.01%        5  org.jruby.runtime.ThreadContext.safeRecurseOutermost
    50000000    0.01%        5  vtable stub
    50000000    0.01%        5  Users.jules.$_dot_rvm.gems.jruby_minus_9_dot_1_dot_17_dot_0.gems.activerecord_minus_4_dot_2_dot_11.lib.active_record.relation.query_methods.RUBY$method$check_cached_relation$0
    50000000    0.01%        5  org.jruby.runtime.ivars.StampedVariableAccessor.createTableUnsafe
    50000000    0.01%        5  java.lang.invoke.LambdaForm$BMH.1459672753.reinvoke
    50000000    0.01%        5  org.jruby.RubyRandom.randLimitedBignum
    50000000    0.01%        5  org.joda.time.tz.CachedDateTimeZone$Info.getOffset
    50000000    0.01%        5  Users.jules.$_dot_rvm.gems.jruby_minus_9_dot_1_dot_17_dot_0.gems.ddtrace_minus_0_dot_17_dot_2.lib.ddtrace.context.RUBY$method$current_span$0
    50000000    0.01%        5  org.jruby.RubyObjectVar0.getVariable0
    50000000    0.01%        5  org.jruby.RubyBasicObject.isNil
    50000000    0.01%        5  org.jruby.util.StringSupport.codeRangeScanAsciiCompatible
    50000000    0.01%        5  org.jruby.runtime.ivars.VariableAccessorVar5.get
    50000000    0.01%        5  org.jruby.runtime.ivars.VariableAccessorVar6.get
    50000000    0.01%        5  org.jruby.ir.runtime.IRRuntimeHelpers.receiveRestArg
    50000000    0.01%        5  org.jruby.runtime.ivars.StampedVariableAccessor.set
    50000000    0.01%        5  vtable stub
    50000000    0.01%        5  org.jruby.internal.runtime.methods.DynamicMethod.call
    50000000    0.01%        5  org.jruby.RubyArray$INVOKER$i$0$0$each_DBG.call
    50000000    0.01%        5  org.jruby.runtime.ivars.VariableAccessorVar3.get
    50000000    0.01%        5  java.lang.invoke.LambdaForm$BMH.485041780.reinvoke
    50000000    0.01%        5  org.jruby.RubyArray.newArray
    50000000    0.01%        5  org.jruby.RubyString.makeShared19
    50000000    0.01%        5  java.lang.invoke.LambdaForm$DMH.2111457497.invokeStatic_L7_L
    50000000    0.01%        5  org.jruby.ir.runtime.IRRuntimeHelpers.getBlockFromObject
    50000000    0.01%        5  org.jruby.runtime.ivars.VariableAccessorVar4.get
    40000000    0.01%        4  org.jruby.ir.runtime.IRRuntimeHelpers.setVariableWithAccessor
    40000000    0.01%        4  org.jruby.RubyObjectVar9.getVariable7
    40000000    0.01%        4  org.jruby.RubyString.asRegexpArg
    40000000    0.01%        4  java.math.BigInteger.stripLeadingZeroBytes
    40000000    0.01%        4  java.lang.Thread.interrupted
    40000000    0.01%        4  org.jruby.runtime.Helpers.aryToAry
    40000000    0.01%        4  org.jruby.runtime.opto.FailoverSwitchPointInvalidator.invalidate
    40000000    0.01%        4  org.jruby.RubyArray.concat
    40000000    0.01%        4  org.jruby.RubyModule.findImplementer
    40000000    0.01%        4  org.jruby.RubyArray$INVOKER$i$1$0$op_plus_DBG.call
    40000000    0.01%        4  org.jruby.runtime.callsite.CachingCallSite.call
    40000000    0.01%        4  org.jruby.util.TypeConverter.checkArrayType
    40000000    0.01%        4  org.jruby.ext.thread_safe.jsr166e.ConcurrentHashMapV8.internalGet
    40000000    0.01%        4  org.jruby.RubySymbol$INVOKER$i$1$0$op_equal_DBG.call
    40000000    0.01%        4  org.jruby.RubyArray.flatten_bang
    40000000    0.01%        4  org.jruby.runtime.callsite.RespondToCallSite.respondsTo
    40000000    0.01%        4  org.jruby.runtime.backtrace.FrameType.getInterpreterFrame
    40000000    0.01%        4  java.lang.invoke.LambdaForm$DMH.1145758761.invokeStatic_L7_L
    40000000    0.01%        4  org.jruby.RubyClass.superclass
    40000000    0.01%        4  com.mysql.jdbc.ConnectionImpl.execSQL
    40000000    0.01%        4  org.jruby.runtime.ThreadContext.pushCallFrame
    40000000    0.01%        4  org.jruby.runtime.ivars.VariableAccessorVar1.set
    40000000    0.01%        4  java.lang.invoke.LambdaForm$DMH.1593300381.invokeStatic_L6_L
    40000000    0.01%        4  java.lang.CharacterDataLatin1.getProperties
    40000000    0.01%        4  org.jruby.specialized.RubyArrayOneObject.copyInto
    40000000    0.01%        4  org.jruby.Ruby$3.getValue
    40000000    0.01%        4  org.jruby.internal.runtime.methods.MixedModeIRMethod.call
    40000000    0.01%        4  org.jruby.ir.runtime.IRRuntimeHelpers.toAry
    40000000    0.01%        4  org.jruby.internal.runtime.methods.DynamicMethod.call
    40000000    0.01%        4  itable stub

and here the result of a profiled execution with dd-trace-rb disabled:

Show data

--- Execution profile ---
Total samples:         15001
Non-Java:              95 (0.63%)
Unknown (native):      62 (0.41%)
Unknown (Java):        7 (0.05%)

Frame buffer usage:    11.0518%

          ns  percent  samples  top
  ----------  -------  -------  ---
 66850000000   44.56%     6685  java.net.PlainSocketImpl.socketAvailable
 20340000000   13.56%     2034  jnr.posix.UnixLibC$jnr$ffi$0.flock
  4580000000    3.05%      458  sun.misc.Unsafe.park
  4530000000    3.02%      453  java.io.UnixFileSystem.getBooleanAttributes0
  2900000000    1.93%      290  org.jruby.runtime.ThreadContext.setErrorInfo
  2160000000    1.44%      216  com.kenai.jffi.Foreign.invokeN3O1
  1930000000    1.29%      193  java.util.concurrent.ConcurrentHashMap.get
  1830000000    1.22%      183  sun.misc.Unsafe.unpark
  1720000000    1.15%      172  sun.nio.ch.FileDispatcherImpl.write0
  1580000000    1.05%      158  java.net.SocketInputStream.socketRead0
  1510000000    1.01%      151  org.jruby.runtime.callsite.CachingCallSite.call
  1490000000    0.99%      149  org.jruby.runtime.callsite.CachingCallSite.call
  1100000000    0.73%      110  java.io.FileInputStream.readBytes
  1080000000    0.72%      108  java.util.HashMap.getNode
  1040000000    0.69%      104  java.lang.invoke.MethodHandle.invokeBasic
   830000000    0.55%       83  java.net.SocketOutputStream.socketWrite0
   700000000    0.47%       70  sun.nio.cs.FastCharsetProvider.canonicalize
   690000000    0.46%       69  org.jruby.RubyModule.searchWithCache
   680000000    0.45%       68  java.lang.invoke.LambdaForm$MH.2011482127.invokeExact_MT
   670000000    0.45%       67  java.lang.invoke.LambdaForm$MH.477289012.invokeExact_MT
   550000000    0.37%       55  org.jruby.ir.runtime.IRRuntimeHelpers.getVariableWithAccessor
   530000000    0.35%       53  org.jruby.RubyHash.internalGetEntry
   430000000    0.29%       43  sun.nio.ch.KQueueArrayWrapper.register0
   400000000    0.27%       40  java.nio.charset.Charset.lookup2
   400000000    0.27%       40  java.security.Provider.getService
   380000000    0.25%       38  org.jruby.internal.runtime.methods.AttrReaderMethod.call
   350000000    0.23%       35  java.lang.invoke.LambdaForm$MH.673068808.invokeExact_MT
   310000000    0.21%       31  org.jruby.RubyModule.hasModuleInHierarchy
   300000000    0.20%       30  java.lang.String.hashCode
   290000000    0.19%       29  org.jruby.internal.runtime.methods.DynamicMethod.call
   290000000    0.19%       29  java.lang.invoke.LambdaForm$MH.532445947.invokeExact_MT
   280000000    0.19%       28  org.jruby.internal.runtime.methods.MixedModeIRMethod.call
   280000000    0.19%       28  itable stub
   270000000    0.18%       27  org.jruby.internal.runtime.methods.MixedModeIRMethod.call
   270000000    0.18%       27  org.jruby.IncludedModuleWrapper.isSame
   270000000    0.18%       27  org.jruby.RubyModule.searchMethodInner
   250000000    0.17%       25  sun.nio.ch.FileDispatcherImpl.read0
   250000000    0.17%       25  jbyte_disjoint_arraycopy
   240000000    0.16%       24  java.lang.System.identityHashCode
   240000000    0.16%       24  itable stub
   240000000    0.16%       24  org.jruby.runtime.callsite.CachingCallSite.call
   220000000    0.15%       22  sun.security.provider.SecureRandom.updateState
   220000000    0.15%       22  org.jruby.internal.runtime.methods.DynamicMethod.call
   200000000    0.13%       20  sun.nio.ch.KQueueArrayWrapper.kevent0
   200000000    0.13%       20  org.jruby.util.ArraySupport.copy
   190000000    0.13%       19  org.jruby.parser.StaticScope.acquireConstructor
   190000000    0.13%       19  java.lang.ThreadLocal$ThreadLocalMap.cleanSomeSlots
   180000000    0.12%       18  org.jruby.runtime.callsite.CachingCallSite.call
   180000000    0.12%       18  java.lang.ThreadLocal$ThreadLocalMap.expungeStaleEntry
   170000000    0.11%       17  sun.nio.cs.FastCharsetProvider.charsetForName
   170000000    0.11%       17  com.kenai.jffi.Foreign.invokeN2O1
   160000000    0.11%       16  vtable stub
   150000000    0.10%       15  org.jruby.runtime.ivars.VariableAccessorVar1.get
   140000000    0.09%       14  org.jruby.util.ByteList.equal
   140000000    0.09%       14  org.jruby.RubyBasicObject.eql
   140000000    0.09%       14  org.jruby.runtime.Helpers.fillNil
   140000000    0.09%       14  org.jruby.RubyModule.searchWithCache
   140000000    0.09%       14  org.jruby.internal.runtime.methods.DynamicMethod.isCallableFrom
   140000000    0.09%       14  org.jruby.internal.runtime.ThreadService.getCurrentContext
   140000000    0.09%       14  sun.security.provider.NativePRNG$RandomIO.implNextBytes
   130000000    0.09%       13  org.jruby.internal.runtime.methods.MixedModeIRMethod.call
   130000000    0.09%       13  vtable stub
   130000000    0.09%       13  org.jruby.internal.runtime.methods.MixedModeIRMethod.call
   130000000    0.09%       13  org.jruby.internal.runtime.methods.CompiledIRMethod.call
   120000000    0.08%       12  org.jruby.RubySymbol$SymbolTable.getSymbol
   120000000    0.08%       12  java.lang.ref.Finalizer.add
   120000000    0.08%       12  org.jruby.util.ByteList.ensure
   120000000    0.08%       12  sun.security.provider.NativePRNG$RandomIO.access$400
   120000000    0.08%       12  java.util.concurrent.ConcurrentHashMap.transfer
   110000000    0.07%       11  org.jruby.RubyBasicObject.variableTableContains
   110000000    0.07%       11  org.jruby.util.io.EncodingUtils.encCrStrBufCat
   110000000    0.07%       11  org.joni.ByteCodeMachine.executeSb
   110000000    0.07%       11  org.jruby.runtime.ivars.VariableAccessorVar0.get
   100000000    0.07%       10  org.jruby.RubyHash.internalPutNoResize
   100000000    0.07%       10  itable stub
   100000000    0.07%       10  vtable stub
   100000000    0.07%       10  vtable stub
    90000000    0.06%        9  org.jruby.ir.runtime.IRRuntimeHelpers.setVariableWithAccessor
    90000000    0.06%        9  java.lang.invoke.LambdaForm$MH.494951073.invokeExact_MT
    90000000    0.06%        9  org.jruby.util.io.EncodingUtils.encCrStrBufCat
    90000000    0.06%        9  org.jruby.util.StringSupport.codeRangeScanAsciiCompatible
    90000000    0.06%        9  org.jruby.runtime.Block.setEvalType
    80000000    0.05%        8  sun.nio.cs.UTF_8$Encoder.encodeLoop
    80000000    0.05%        8  vtable stub
    80000000    0.05%        8  java.lang.invoke.DirectMethodHandle.allocateInstance
    80000000    0.05%        8  org.jruby.IncludedModule.getNonIncludedClass
    80000000    0.05%        8  org.jruby.RubyArray.eltInternal
    80000000    0.05%        8  org.jruby.runtime.Helpers.invoke
    80000000    0.05%        8  vtable stub
    80000000    0.05%        8  org.jruby.RubyHash.hashValue
    80000000    0.05%        8  vtable stub
    80000000    0.05%        8  org.jruby.runtime.ivars.VariableAccessorVar2.get
    80000000    0.05%        8  org.jruby.runtime.ivars.VariableAccessor.get
    80000000    0.05%        8  org.jruby.internal.runtime.methods.DynamicMethod.call
    80000000    0.05%        8  org.jruby.runtime.ivars.VariableAccessorVar0.setVariable
    80000000    0.05%        8  org.jruby.runtime.callsite.CachingCallSite.cacheAndCall
    70000000    0.05%        7  com.mysql.jdbc.StatementImpl.<init>
    70000000    0.05%        7  org.jruby.RubyHash.internalCopyTable
    70000000    0.05%        7  org.jruby.RubyArray$INVOKER$i$0$0$each_DBG.call
    70000000    0.05%        7  org.jruby.internal.runtime.methods.DynamicMethod.call
    70000000    0.05%        7  org.jruby.runtime.BlockBody.yield
    70000000    0.05%        7  sun.nio.cs.UTF_8$Decoder.decodeArrayLoop
    70000000    0.05%        7  org.jruby.RubyHash$INVOKER$i$default_value_get_DBG.call
    70000000    0.05%        7  org.jruby.util.RegexpSupport.preprocess
    70000000    0.05%        7  org.jruby.specialized.RubyArrayOneObject.eltInternal
    70000000    0.05%        7  org.jruby.runtime.opto.FailoverSwitchPointInvalidator.invalidate
    70000000    0.05%        7  vtable stub
    70000000    0.05%        7  org.jruby.RubyModule.newMethod
    70000000    0.05%        7  org.jruby.runtime.CompiledIRBlockBody.canCallDirect
    60000000    0.04%        6  sun.util.PreHashedMap.get
    60000000    0.04%        6  org.jruby.runtime.ivars.VariableTableManager.syncVariables
    60000000    0.04%        6  vtable stub
    60000000    0.04%        6  org.jruby.Ruby$3.setValue
    60000000    0.04%        6  org.jruby.RubyArray.append
    60000000    0.04%        6  java.lang.ThreadLocal.setInitialValue
    60000000    0.04%        6  jlong_disjoint_arraycopy
    60000000    0.04%        6  org.jruby.RubyArray.eltOk
    60000000    0.04%        6  org.jruby.RubyRegexp$SearchMatchTask.run
    60000000    0.04%        6  vtable stub
    60000000    0.04%        6  org.jruby.util.Sprintf.rubySprintfToBuffer
    60000000    0.04%        6  org.jruby.internal.runtime.methods.MixedModeIRMethod.call
    60000000    0.04%        6  org.jruby.runtime.callsite.CachingCallSite.call
    50000000    0.03%        5  org.jruby.runtime.ivars.StampedVariableAccessor.createTableUnsafe
    50000000    0.03%        5  org.jruby.runtime.ivars.VariableAccessorVar5.get
    50000000    0.03%        5  org.jruby.runtime.Helpers.invoke
    50000000    0.03%        5  org.jruby.runtime.callsite.CachingCallSite.call
    50000000    0.03%        5  java.lang.invoke.LambdaForm$DMH.770754640.invokeStatic_L6_L
    50000000    0.03%        5  org.jruby.ir.runtime.IRRuntimeHelpers.splatArray
    50000000    0.03%        5  vtable stub
    50000000    0.03%        5  org.jruby.ir.runtime.IRRuntimeHelpers.getBlockFromObject
    50000000    0.03%        5  org.jruby.internal.runtime.methods.CompiledIRMethod.call
    50000000    0.03%        5  vtable stub
    50000000    0.03%        5  org.jruby.RubyArray.newArrayMayCopy
    50000000    0.03%        5  org.jruby.Ruby.addModule
    50000000    0.03%        5  org.jruby.runtime.Signature.checkArity
    50000000    0.03%        5  org.jruby.RubyModule$KindOf.isKindOf
    50000000    0.03%        5  org.jruby.runtime.ivars.VariableAccessorVar7.get
    50000000    0.03%        5  org.jruby.Ruby$3.getValue
    50000000    0.03%        5  org.jruby.runtime.Block.call
    50000000    0.03%        5  org.jruby.IncludedModuleWrapper.searchMethodCommon
    50000000    0.03%        5  org.jruby.RubyHash.internalDelete
    50000000    0.03%        5  org.jruby.internal.runtime.methods.DynamicMethod.isNotImplemented
    50000000    0.03%        5  org.jruby.runtime.callsite.CachingCallSite.call
    50000000    0.03%        5  java.lang.invoke.LambdaForm$BMH.1282473384.reinvoke
    50000000    0.03%        5  org.jruby.RubyHash.op_asetForString
    50000000    0.03%        5  org.jruby.runtime.callsite.CachingCallSite.call
    50000000    0.03%        5  org.jruby.RubyArray.flatten
    50000000    0.03%        5  java.lang.invoke.Invokers.checkCustomized
    50000000    0.03%        5  Users.jules.$_dot_rvm.gems.jruby_minus_9_dot_1_dot_17_dot_0.gems.thread_safe_minus_0_dot_3_dot_6_minus_java.lib.thread_safe.cache.RUBY$method$\=\{\}$0
    50000000    0.03%        5  org.jruby.runtime.ivars.VariableAccessorVar3.get
    40000000    0.03%        4  java.lang.invoke.LambdaForm$DMH.258952499.invokeStatic_LL_V
    40000000    0.03%        4  java.lang.invoke.LambdaForm$DMH.963971797.invokeStatic_L6_L
    40000000    0.03%        4  org.jruby.internal.runtime.methods.CompiledIRMethod.call
    40000000    0.03%        4  org.jruby.runtime.ThreadContext.pushCallFrame
    40000000    0.03%        4  org.jruby.RubySymbol.symbolBytesFromString
    40000000    0.03%        4  vtable stub
    40000000    0.03%        4  org.jruby.RubyKernel$INVOKER$s$respond_to_missing_p_DBG.call
    40000000    0.03%        4  org.jruby.runtime.callsite.CachingCallSite.cacheAndCall
    40000000    0.03%        4  org.joni.Matcher.searchInterruptible
    40000000    0.03%        4  org.jruby.runtime.ivars.VariableAccessorVar5.setVariable
    40000000    0.03%        4  org.jruby.runtime.BlockBody.canCallDirect
    40000000    0.03%        4  org.jruby.RubyObject.equalInternal
    40000000    0.03%        4  org.jruby.runtime.ivars.VariableAccessorVar2.set
    40000000    0.03%        4  org.jruby.runtime.ivars.VariableAccessorVar1.setVariable
    40000000    0.03%        4  org.jruby.RubyModule.searchMethodCommon
    40000000    0.03%        4  java.lang.invoke.LambdaForm$DMH.777874839.newInvokeSpecial_LL_L
    40000000    0.03%        4  org.jruby.runtime.ivars.VariableAccessorVar4.get
    40000000    0.03%        4  org.jruby.util.TypeConverter.checkArrayType
    40000000    0.03%        4  org.jruby.RubyEncoding$UTF8Coder.encode
    40000000    0.03%        4  org.jruby.util.RubyDateFormatter.formatToByteList
    40000000    0.03%        4  org.jruby.runtime.ivars.VariableAccessorVar2.setVariable
    40000000    0.03%        4  org.jruby.runtime.callsite.NormalCachingCallSite.methodMissing
    40000000    0.03%        4  java.lang.ref.Reference.tryHandlePending
    40000000    0.03%        4  org.jruby.runtime.callsite.CachingCallSite.call
    40000000    0.03%        4  org.jruby.RubyArray.aryDup
    40000000    0.03%        4  org.jruby.RubySymbol.hashCode
    40000000    0.03%        4  jshort_disjoint_arraycopy
    40000000    0.03%        4  org.jruby.runtime.DynamicScope.getValueOrNil
    40000000    0.03%        4  java.lang.String.regionMatches
    40000000    0.03%        4  java.lang.ref.Finalizer.runFinalizer
    40000000    0.03%        4  java.lang.Thread.interrupted
    40000000    0.03%        4  org.jruby.util.StringSupport.areComparableViaCodeRange
    40000000    0.03%        4  sun.nio.cs.UTF_8$Decoder.decodeLoop
    40000000    0.03%        4  org.jruby.runtime.scopes.DynamicScope1.getValueDepthZero
    40000000    0.03%        4  org.jruby.runtime.ivars.VariableAccessorVar0.set
    40000000    0.03%        4  org.jruby.internal.runtime.methods.AliasMethod.call
    40000000    0.03%        4  org.jruby.RubyBasicObject.getInstanceVariables
    40000000    0.03%        4  Users.jules.$_dot_rvm.gems.jruby_minus_9_dot_1_dot_17_dot_0.gems.activesupport_minus_4_dot_2_dot_11.lib.active_support.per_thread_registry.RUBY$method$instance$0
    40000000    0.03%        4  org.jruby.RubyBasicObject.isTrue
    30000000    0.02%        3  java.lang.invoke.LambdaForm$DMH.1625714195.invokeStatic_L7_L
    30000000    0.02%        3  Users.jules.$_dot_rvm.gems.jruby_minus_9_dot_1_dot_17_dot_0.gems.activerecord_minus_jdbc_minus_adapter_minus_1_dot_3_dot_25.lib.arjdbc.mysql.adapter.RUBY$method$quote_table_name$0
    30000000    0.02%        3  java.lang.ThreadLocal$ThreadLocalMap.remove
    30000000    0.02%        3  org.jruby.Ruby.addToObjectSpace
    30000000    0.02%        3  sun.security.provider.NativePRNG$RandomIO.getMixRandom
    30000000    0.02%        3  org.jruby.RubyModule$INVOKER$i$1$0$op_eqq_DBG.call
    30000000    0.02%        3  org.jruby.RubySymbol$SymbolTable.getSymbol
    30000000    0.02%        3  org.jruby.RubyObject$9.allocate
    30000000    0.02%        3  java.lang.ThreadLocal$ThreadLocalMap.getEntry
    30000000    0.02%        3  org.jruby.RubyRegexp.quote19
    30000000    0.02%        3  org.jruby.RubyModule.findImplementer

If we compare the two traces, here's what we immediately observe (the trace where dd-trace-rb is enabled is the one on the right in this image):

guizmaii · 2018-11-29T13:30:02Z

There's a chance that I've found the culprit !

The line of code responsible:

# Attempt to retrieve the connection from an object ID.
def self.connection_by_id(object_id)
  return nil if object_id.nil?
  ObjectSpace._id2ref(object_id)
rescue StandardError
  nil
end

Here: https://github.com/DataDog/dd-trace-rb/blob/master/lib/ddtrace/contrib/active_record/utils.rb#L29

delner · 2018-11-29T15:59:34Z

@guizmaii Mmmm, interesting. Does this raise a lot of exceptions in JRuby? If so, what kind of exception? Maybe this isn't JRuby compatible and we should either short circuit nil or find an alternative means of making this feature work.

Also, does it raise often (or at all) in MRI?

guizmaii · 2018-11-29T16:17:46Z

Maybe this isn't JRuby compatible

ObjectSpace is deactivate by default in JRuby for performance reasons.
More info: https://github.com/jruby/jruby/wiki/PerformanceTuning#dont-enable-objectspace
So we can't use it with JRuby.

we should either short circuit nil or find an alternative means of making this feature work

I'm currently looking how all of this works by debugging my app.

A thing I observe is that at the app boot, this function is called:

def self.connection_config(object_id = nil)
  object_id.nil? ? default_connection_config : connection_config_by_id(object_id)
end

with object_id equal to nil. So it calls default_connection_config:

def self.default_connection_config
  return @default_connection_config if instance_variable_defined?(:@default_connection_config)
  current_connection_name = if ::ActiveRecord::Base.respond_to?(:connection_specification_name)
                              ::ActiveRecord::Base.connection_specification_name
                            else
                              ::ActiveRecord::Base
                            end

  connection_pool = ::ActiveRecord::Base.connection_handler.retrieve_connection_pool(current_connection_name)
  connection_pool.nil? ? {} : (@default_connection_config = connection_pool.spec.config)
rescue StandardError
  {}
end

And this call works fine. The @default_connection_config instance variable is set to:

{:adapter=>"mysql2", :host=>"127.0.0.1", :database=>"mydb", :username=>"root", :password=>nil}

But then, when an HTTP call arrives, the connection_config function is called again but with 2550 as object_id argument value (which comes from payload[:connection_id] in config = Utils.connection_config(payload[:connection_id]) in lib/ddtrace/contrib/active_record/events/sql.rb)

What is this 2550 ? What is this payload[:connection_id] ??
Can't we reuse the @default_connection_config previously set ?

delner · 2018-11-29T16:36:12Z

Yeah, so you've encountered the challenge with ActiveRecord notifications, haha. This ActiveRecord code that we wrote is attempting to accomplish two major things:

Determine a good service name to assign the ActiveRecord service at boot time (e.g. my-app_postgres)
Resolve the connection parameters at trace time so we can tag the trace with details like host, port, etc.

I have to look at it again, but if I remember correctly, this ObjectSpace call that we were using was because ActiveRecord only gave us object IDs for the connection when the event was raised, not the connection object itself. So in order to get the data we wanted, we had to use ObjectSpace to do this.

If this always fails in JRuby because ObjectSpace isn't supported, then we should probably just detect the platform and return nil. If this exception happens in MRI a lot, or if there's some clever, simple way of getting the connection object in a way that works with Rails 3.0 - 6.0 and MRI/JRuby, then implementing a different strategy that doesn't use ObjectSpace at all would be even better.

guizmaii · 2018-11-29T17:45:16Z

FYI, @headius said this:

_id2ref should never be use by any code anywhere. It isn't even safe on MRI.

Here: https://twitter.com/headius/status/1068188351012966402

Another information, thanks to this ugly fix ee61d24 and some additional tests, I can confirm that it's only the previously cited line of code that is problematic.
Nothing else in the :rails integration creates performance problems.

delner · 2018-11-29T19:09:53Z

Great, this has been really helpful. I'm glad we've been able to narrow this down to something very specific.

Can I suggest you try forking ddtrace and commenting/disabling this part of the code? It would be good to see how much it could improve performance compared to your previous benchmarks.

guizmaii · 2018-11-29T19:16:35Z

Can I suggest you try forking ddtrace and commenting/disabling this part of the code? It would be good to see how much it could improve performance compared to your previous benchmarks.

Already done with the ugly fix I talked about. The performances are great ! Just the same as without dd-trace-rb 🙂

delner · 2018-11-29T19:32:40Z

Awesome! Sorry if I misunderstood that. Sounds like this is something we can/should fix right away then.

guizmaii · 2018-11-29T19:40:52Z

I proposed an evolution to Rails here: rails/rails#34568.

Maybe there's another way to solve that but I'm not an expert in Ruby nor in Rails to know this other way. 😕

delner · 2018-11-29T20:05:27Z

At the very least, if the Rails guys can't/won't make a change like this, I could probably try introducing a monkey-patch that injects this behavior into Rails. That would be very sub-optimal though, given it'd probably be very brittle between Rails versions.

guizmaii · 2018-11-29T20:13:58Z

IMO, Rails 3 and 4 can be monkey-patched without risks because they don't evolve anymore.

For Rails >4, we could use the solution the Rails guys will propose us. Let's see what they propose.

headius · 2018-11-30T08:34:26Z

I wanted to clarify my statement about id2ref above...

The problem with it in MRI is that the object_id is basically just the pointer to the object on the heap. Using a pointer to recover the object can fail spectacularly if the object has been collected and that pointer address now contains a new object. In addition, using object_id makes objects "shady" (I believe...@tenderlove or @ko1 can confirm) which interferes with generational garbage collection (but this is AR's fault for using it, not yours).

The right fix is most definitely to move away from object_id as a key for anything ever and provide a proper weak map or registry for these sorts of things.

And I have no idea what that would require in Rails 4 😀

guizmaii · 2018-11-30T09:34:32Z

Hopefully we'll not need to monkey-patch 🙂

rails/rails#34568 (comment)

guizmaii · 2018-11-30T09:41:55Z

I found a solution ! (See rails/rails#34568 (comment))

I'll implement the fix 😉

guizmaii · 2018-11-30T11:16:41Z

The fix is implemented. See #646.

But tests fail for strange reasons:

  1) Failure:
TracerActiveRecordTest#test_instantiation_tracing [/app/test/contrib/sinatra/tracer_activerecord_test.rb:145]:
--- expected
+++ actual
@@ -1,2 +1,2 @@
 # encoding: UTF-8
-"sinatra"
+"defaultdb"



  2) Failure:
TracerActiveRecordTest#test_request [/app/test/contrib/sinatra/tracer_activerecord_test.rb:86]:
--- expected
+++ actual
@@ -1,2 +1,2 @@
 # encoding: UTF-8
-"sqlite"
+"sinatra"

delner · 2018-11-30T16:27:19Z

@guizmaii I don't think that's going to work as its currently implemented. Unfortunately it's not just a matter of grabbing ::ActiveRecord::Base.connection as that will just return the default connection which could be the wrong one. We need to find the actual connection used (or at the very least know its connection config name) so we can get the config for that specific one, because it could be different than other connections, particularly when there are multiple databases present.

The Active record tests fail accordingly:

root@c40f8d684f44:/app# bundle exec appraisal contrib rake spec:active_record
>> BUNDLE_GEMFILE=/app/gemfiles/contrib.gemfile bundle exec rake spec:active_record
/usr/local/bin/ruby -I/usr/local/bundle/gems/rspec-core-3.8.0/lib:/usr/local/bundle/gems/rspec-support-3.8.0/lib /usr/local/bundle/gems/rspec-core-3.8.0/exe/rspec --pattern spec/ddtrace/contrib/active_record/\*\*/\*_spec.rb
I, [2018-11-30T16:21:27.818323 #14096]  INFO -- : Executing database migrations
-- create_table("articles", {:force=>:cascade})
   -> 0.0293s

Randomized with seed 4428
-- create_table("gadgets", {:force=>:cascade})
   -> 0.0259s
FFFFFFF.

Failures:

  1) ActiveRecord multi-database implementation when databases are configured with a Hash that describes a connection should eq "widget-db"
     Failure/Error: expect(widget_span.service).to eq(widget_db_service_name)
     
       expected: "widget-db"
            got: "default-db"
     
       (compared using ==)
     # ./spec/ddtrace/contrib/active_record/multi_db_spec.rb:198:in `block (4 levels) in <top (required)>'

I would recommend re-opening that Rails PR, since we do need access to the connection object or name, and looping through the list of all connections everytime the user does a query could be a sizeable impact on performance (probably more expensive than what we already have for MRI.)

ko1 · 2018-12-01T05:05:52Z

In addition, using object_id makes objects "shady" (I believe...@tenderlove or @ko1 can confirm) which interferes with generational garbage collection (but this is AR's fault for using it, not yours).

On the current MRI, object_id doesn't affect any data structure. But @tenderlove is making new optimizations (mostly compaction/copying GC) and it change the behavior (object_id will affect performance).

headius · 2018-12-03T09:02:56Z

@ko1 Thank you for clarifying! JRuby also sees little impact but only because we do not normally support _id2ref without a command-line flag. With id2ref enabled, however, the impact is very large. In the future, object_id needs to either go away (best) or become and opaque, idempotent value (very complicated).

In any case...nobody should use object_id or id2ref ever.

guizmaii · 2018-12-03T10:25:04Z

@delner I proposed a new, more conservative, fix here: #647

I still have the same errors in CI that I don't understand:

  1) Failure:
TracerActiveRecordTest#test_instantiation_tracing [/app/test/contrib/sinatra/tracer_activerecord_test.rb:145]:
--- expected
+++ actual
@@ -1,2 +1,2 @@
 # encoding: UTF-8
-"sinatra"
+"sqlite"



  2) Failure:
TracerActiveRecordTest#test_request [/app/test/contrib/sinatra/tracer_activerecord_test.rb:86]:
--- expected
+++ actual
@@ -1,2 +1,2 @@
 # encoding: UTF-8
-"sqlite"
+"sinatra"


18 runs, 140 assertions, 2 failures, 0 errors, 0 skips
```

delner · 2018-12-03T16:27:13Z

@guizmaii Left a comment on your PR for this CI issue.

delner · 2018-12-14T18:40:14Z

@guizmaii I merged #648 and #649 which should address this issue. Can you try the latest on 0.18-dev and see if it resolves your performance issue? If so, let's close this issue and your other PRs.

guizmaii · 2018-12-14T19:03:07Z

Your solution is fun. Way too complex for a Scala dev like me to fully understand but I think I see what you're doing in the #649 patch. 🙂

I just ended something else I was working on this week and it's friday evening here. So, I'll test you patch on monday morning !

Thanks for your help and responsiveness 🙂

@headius does this kind of meta programming has a cost with JRuby ?

delner · 2018-12-14T19:26:38Z

Yeah, its slightly more complicated (not that complexity is ever preferable), but its meant to mimic that Rails PR behavior which adds :connection. It does have an additional cost in the sense that it will bubble through an additional block but the overhead is pretty small, at least for MRI. It did run a little bit faster in MRI than the pool searching from what I could tell in my own performance tests (see the PR I linked.)

Still, I'll be interested to see what kind of improvement you find it makes on your end, and for JRuby.

guizmaii · 2018-12-17T10:25:31Z

@delner Good news here !

I benchmarked the 0.18-dev branch and I can't see any performance difference between an execution with dd-trace-rb enable and an execution dd-trace-rb disable ! 🎉

Your patch works very well sir ! 🙂

Yet, I can propose you a little improvement of your code: #656. WDYT ?

delner · 2018-12-18T00:17:05Z

Merged your PR to 0.18-dev. Thanks @guizmaii! I'm going to release this tomorrow with 0.18.0.

guizmaii · 2018-12-18T13:13:00Z

Great news!

Thanks for all @delner. It was great to work with you! :)

And thanks to everyone how helped us here, especially @headius :)

delner · 2018-12-18T16:48:50Z

Likewise @guizmaii, you were extraordinarily helpful!

guizmaii changed the title ~~High performances overhead on JRuby~~ High performance overhead on JRuby Nov 27, 2018

guizmaii changed the title ~~High performance overhead on JRuby~~ Very high performance overhead on JRuby Nov 27, 2018

delner self-assigned this Nov 27, 2018

delner added core Involves Datadog core libraries community Was opened by a community member labels Nov 27, 2018

guizmaii mentioned this issue Nov 28, 2018

Add a Dockerfile for JRuby 9.1 && 9.2 DataDog/docker-library#20

Closed

guizmaii added a commit to guizmaii/dd-trace-rb that referenced this issue Nov 28, 2018

Test @enebo idea from here: DataDog#640 (comment)

728c0a2

"Not saying it is this but replacing c = caller with c=1 in the gem and rerunning may be a super quick test"

guizmaii added a commit to guizmaii/dd-trace-rb that referenced this issue Nov 29, 2018

Try( Fix DataDog#640: Only use default_connection_config )

ee61d24

guizmaii mentioned this issue Nov 29, 2018

WIP: Fix high performance overhoad on JRuby #645

Closed

guizmaii mentioned this issue Nov 29, 2018

Pass the connection to the @instrumenter.instrument method call rails/rails#34568

Closed

guizmaii mentioned this issue Nov 30, 2018

Fix performance issues with JRuby #646

Closed

guizmaii mentioned this issue Dec 3, 2018

Fix performance issues with JRuby #647

Closed

delner mentioned this issue Dec 4, 2018

Add Datadog::Shim for instrumentation hooks #648

Merged

4 tasks

delner added this to the 0.18.0 milestone Dec 14, 2018

guizmaii closed this as completed Dec 18, 2018

Very high performance overhead on JRuby #640

Very high performance overhead on JRuby #640

Comments

guizmaii commented Nov 27, 2018 • edited Loading

delner commented Nov 27, 2018

guizmaii commented Nov 27, 2018 • edited Loading

headius commented Nov 27, 2018

delner commented Nov 27, 2018

guizmaii commented Nov 27, 2018

enebo commented Nov 27, 2018

guizmaii commented Nov 28, 2018 • edited Loading

delner commented Nov 28, 2018 • edited Loading

guizmaii commented Nov 29, 2018 • edited Loading

guizmaii commented Nov 29, 2018 • edited Loading

delner commented Nov 29, 2018

guizmaii commented Nov 29, 2018 • edited Loading

delner commented Nov 29, 2018 • edited Loading

guizmaii commented Nov 29, 2018 • edited Loading

delner commented Nov 29, 2018

guizmaii commented Nov 29, 2018

delner commented Nov 29, 2018

guizmaii commented Nov 29, 2018 • edited Loading

delner commented Nov 29, 2018

guizmaii commented Nov 29, 2018 • edited Loading

headius commented Nov 30, 2018

guizmaii commented Nov 30, 2018

guizmaii commented Nov 30, 2018 • edited Loading

guizmaii commented Nov 30, 2018

delner commented Nov 30, 2018

ko1 commented Dec 1, 2018 • edited Loading

headius commented Dec 3, 2018

guizmaii commented Dec 3, 2018

delner commented Dec 3, 2018

delner commented Dec 14, 2018 • edited Loading

guizmaii commented Dec 14, 2018 • edited Loading

delner commented Dec 14, 2018

guizmaii commented Dec 17, 2018 • edited Loading

delner commented Dec 18, 2018

guizmaii commented Dec 18, 2018

delner commented Dec 18, 2018

guizmaii commented Nov 27, 2018 •

edited

Loading

guizmaii commented Nov 27, 2018 •

edited

Loading

guizmaii commented Nov 28, 2018 •

edited

Loading

delner commented Nov 28, 2018 •

edited

Loading

guizmaii commented Nov 29, 2018 •

edited

Loading

guizmaii commented Nov 29, 2018 •

edited

Loading

guizmaii commented Nov 29, 2018 •

edited

Loading

delner commented Nov 29, 2018 •

edited

Loading

guizmaii commented Nov 29, 2018 •

edited

Loading

guizmaii commented Nov 29, 2018 •

edited

Loading

guizmaii commented Nov 29, 2018 •

edited

Loading

guizmaii commented Nov 30, 2018 •

edited

Loading

ko1 commented Dec 1, 2018 •

edited

Loading

delner commented Dec 14, 2018 •

edited

Loading

guizmaii commented Dec 14, 2018 •

edited

Loading

guizmaii commented Dec 17, 2018 •

edited

Loading