-
Notifications
You must be signed in to change notification settings - Fork 373
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Disable GC profiling on Ruby 3 by default due to Ruby VM Ractor-related bug #2354
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
…argument In Ruby, there's no common superclass for `TrueClass` and `FalseClass`, and thus there's no `T_BOOL` or `T_BOOLEAN` we can use for `ENFORCE_TYPE`. Thus, I've added a specific `ENFORCE_BOOLEAN` helper to support this case. (Also removed the `_type` argument which never ended up being used in `raise_unexpected_type`.)
…ed bug **What does this PR do?**: This PR: * Adds a toggle that can be used to enable/disable the GC profiling feature added in #2336 * Enables the GC profiling feature by default on Ruby 2.x and disables the feature by default on Ruby 3.x * Exposes a new `DD_PROFILING_FORCE_ENABLE_GC` / `settings.profiling.advanced.force_enable_gc_profiling` setting to allow customers that don't use Ractors to manually re-enable the GC profiling feature. **Motivation**: While validating that the new profiler correctly works with Ractors, I discovered that adding specs that invoked Ractors caused the previously-built GC profiling feature (and attached specs) to start failing. I was able to track this down to an actual VM bug, which I've reported upstream as <https://bugs.ruby-lang.org/issues/19112>. To avoid impacting any customers (specifically, they would potentially observe profiles with incomplete data), I decided to disable GC profiling for Ruby 3.x customers for now. I am hoping that the Ruby core team can be convinced to fix this issue and backport it to existing Ruby 3.x stable releases, so that we can again re-enable this feature for these users. As an alternative, if this issue sticks around for a long time and we really want this feature to be on by default for Ruby 3.x customers, there are probably are ways for us to automatically detect if Ractors are in use or not. I chose not to pursue such a mechanism for now. **Additional Notes**: (None) **How to test the change?**: The change includes test coverage. You can also observe this feature by profiling a Ruby application: * On Ruby 2.x you will see garbage collection frames * On Ruby 3.x you will not see garbage collection frames unless you force enable the feature.
github-actions
bot
added
core
Involves Datadog core libraries
profiling
Involves Datadog profiling
labels
Nov 8, 2022
TonyCTHsu
approved these changes
Nov 8, 2022
ivoanjo
commented
Nov 8, 2022
Comment on lines
-131
to
+136
rb_define_singleton_method(collectors_cpu_and_wall_time_worker_class, "_native_initialize", _native_initialize, 2); | ||
rb_define_singleton_method(collectors_cpu_and_wall_time_worker_class, "_native_initialize", _native_initialize, 3); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For context, the 2
to 3
change here is the number of positional arguments to the method. This is documented here.
Effectively there’s three ways to “wire up” a C function to be called as a Ruby method:
- if you pass in a number >= 0 then that’s the number of positional arguments to the method. E.g.
def foo(a, b, c)
=>rb_define_method(..., foo_implementation, 3)
andfoo_implementation
is expected to have a signature that receives self + 3 other arguments, e.g.VALUE foo_implementation(VALUE self, VALUE a, VALUE b, VALUE c)
. - if you pass in
-1
then Ruby will pass the arguments as a C array - if you pas in
-2
then Ruby will pass in the arguments as a Ruby array (e.g. it’s the C equivalent ofdef foo(*args)
)
I moved it from 2
to 3
since I was passing the new gc_profiling_enabled
value
ivoanjo
added a commit
that referenced
this pull request
Dec 8, 2022
…to overhead **What does this PR do?**: We previously only enabled GC profiling by default on Ruby 2.x (we disabled it on 3.x due to a bug, see #2354). This PR changes the profiling configuration to never enable GC profiling by default (thus bringing the Ruby 2.x and 3.x defaults into sync). **Motivation**: I've had one new Ruby profiler alpha user report that GC profiling had a quite big overhead, and I could replicate the issue on a GC-heavy benchmark. Since we'd like to get the new Ruby profiler out into customer's hands in beta, I decided to disable this feature by default, and come back to it after beta is out. **Additional Notes**: GC profiling can still be enabled via configuration. **How to test the change?**: Change includes test coverage. Furthermore, profiling any application will no longer show the `Garbage Collection` frames in the flamegraph by default.
ivoanjo
added a commit
that referenced
this pull request
Jan 11, 2024
**What does this PR do?** This PR re-enables a garbage collection profiling spec in the `CpuAndWallTimeWorker` that was disabled on Ruby 3. This test was disabled in #2354 as this caused flaky tests due to a Ruby VM bug related to Ractor GC. Recently, this "flaky when combined with Ractors" behavior started showing up in other places and @AlexJF came up with a better solution in #3320 of separating out the Ractor tests. Thus, we can now run this spec on Ruby 3. **Motivation**: Recently, while working on #3356, I noticed that if I commented out the calls to `rb_postponed_job_[trigger|register_one]` in `on_gc_event`, no specs failed. This was pretty surprising to me -- as it was weird that we didn't have test coverage for it. After looking into it, I realized we did have specs for it -- but we weren't running them on Ruby 3, which I was using for local development. The re-enabled spec correctly covers that situation. **Additional Notes:** N/A **How to test the change?** Validate that test now runs on Ruby 3 and test suite is still green.
2 tasks
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
What does this PR do?:
This PR:
DD_PROFILING_FORCE_ENABLE_GC
/settings.profiling.advanced.force_enable_gc_profiling
setting to allow customers that don't use Ractors to manually re-enable the GC profiling feature.Motivation:
While validating that the new profiler correctly works with Ractors, I discovered that adding specs that invoked Ractors caused the previously-built GC profiling feature (and attached specs) to start failing.
I was able to track this down to an actual VM bug, which I've reported upstream as https://bugs.ruby-lang.org/issues/19112.
To avoid impacting any customers (specifically, they would potentially observe profiles with incomplete data), I decided to disable GC profiling for Ruby 3.x customers for now.
I am hoping that the Ruby core team can be convinced to fix this issue and backport it to existing Ruby 3.x stable releases, so that we can again re-enable this feature for these users.
As an alternative, if this issue sticks around for a long time and we really want this feature to be on by default for Ruby 3.x customers, there are probably are ways for us to automatically detect if Ractors are in use or not. I chose not to pursue such a mechanism for now.
Additional Notes:
(None)
How to test the change?:
The change includes test coverage. You can also observe this feature by profiling a Ruby application: