Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add profiler helper for detecting if we're on the main Ractor #2350

Merged
merged 2 commits into from
Nov 9, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions ext/ddtrace_profiling_native_extension/extconf.rb
Original file line number Diff line number Diff line change
Expand Up @@ -125,6 +125,9 @@ def add_compiler_flag(flag)
# On older Rubies, we need to use a backported version of this function. See private_vm_api_access.h for details.
$defs << '-DUSE_BACKPORTED_RB_PROFILE_FRAME_METHOD_NAME' if RUBY_VERSION < '3'

# On older Rubies, there are no Ractors
$defs << '-DNO_RACTORS' if RUBY_VERSION < '3'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no built in define around this? Interesting...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a quick search and there isn't any -- the closest I found has HAVE_RUBY_RACTOR_H which can maybe thought of as a way of doing that but... I decided to keep this one for now ;) (Unless you think it's a really bad idea)

I think the reason why there is no define is because you can't disable Ractors (that I know of), and thus you can always probe them by looking at the Ruby version, so that's probably why they don't have a more specific #define.

On our side, as you can kinda tell by this file, I've been somewhat avoiding doing something like $defs << "-DRUBY_#{RUBY_VERSION}" and then doing #ifdef RUBY_3.1 in the code, in favor of being a bit more semantic -- e.g. rather than saying "this is the codepath for Ruby 3.1", say "this is the codepath for when this structure has a member called foo".


# On older Rubies, we need to use rb_thread_t instead of rb_execution_context_t
$defs << '-DUSE_THREAD_INSTEAD_OF_EXECUTION_CONTEXT' if RUBY_VERSION < '2.5'

Expand Down
25 changes: 25 additions & 0 deletions ext/ddtrace_profiling_native_extension/private_vm_api_access.c
Original file line number Diff line number Diff line change
Expand Up @@ -690,3 +690,28 @@ int ruby_thread_has_gvl_p(void) {
return 0;
}
#endif // NO_THREAD_HAS_GVL

#ifndef NO_RACTORS
// This API and definition are exported as a public symbol by the VM BUT the function header is not defined in any public header, so we
// repeat it here to be able to use in our code.
Comment on lines +695 to +696
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to request/perform this change upstream?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this "public-ish" is actually on purpose. That is, the missing header is not a "oops we forgot", it's a "this isn't really a public API, but some things outside of Ruby make good use of it, and we don't mind exposing it for now but you're on your own".

For instance in ruby/ruby@15476c6 you see a similar API that was published so that fiddle could use it:

https://github.com/ruby/ruby/blob/c3de7a3c58bf9a138ff8720ed56c0045d2b8e01d/internal/thread.h#L48

(Out of curiosity, there was some openness during a few chats with core team members recently to perhaps promote a few more APIs that would be really useful for profilers to this "public-ish" state).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But... maybe I'm just assuming too much. I've been hesitating a bit on asking for these kinds of things from upstream because I wanted to first come up with our most important asks and then go down the list from there, rather than asking for all of them.

bool rb_ractor_main_p_(void);
extern struct rb_ractor_struct *ruby_single_main_ractor;

// Taken from upstream ractor_core.h at commit d9cf0388599a3234b9f3c06ddd006cd59a58ab8b (November 2022, Ruby 3.2 trunk)
// to allow us to ensure that we're always operating on the main ractor (if Ruby has ractors)
// Modifications:
// * None
bool ddtrace_rb_ractor_main_p(void)
{
if (ruby_single_main_ractor) {
return true;
}
else {
return rb_ractor_main_p_();
}
}
#else
// Simplify callers on older Rubies, instead of having them probe if the VM supports Ractors we just tell them that yes
// they're always on the main Ractor
bool ddtrace_rb_ractor_main_p(void) { return true; }
#endif // NO_RACTORS
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,8 @@ bool is_thread_alive(VALUE thread);
VALUE thread_name_for(VALUE thread);

int ddtrace_rb_profile_frames(VALUE thread, int start, int limit, VALUE *buff, int *lines, bool* is_ruby_frame);
// Returns true if the current thread belongs to the main Ractor or if Ruby has no Ractor support
bool ddtrace_rb_ractor_main_p(void);

// Ruby 3.0 finally added support for showing CFUNC frames (frames for methods written using native code)
// in stack traces gathered via `rb_profile_frames` (https://github.com/ruby/ruby/pull/3299).
Expand Down
10 changes: 10 additions & 0 deletions ext/ddtrace_profiling_native_extension/profiling.c
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@

#include "clock_id.h"
#include "helpers.h"
#include "private_vm_api_access.h"

// Each class/module here is implemented in their separate file
void collectors_cpu_and_wall_time_init(VALUE profiling_module);
Expand All @@ -11,6 +12,7 @@ void http_transport_init(VALUE profiling_module);
void stack_recorder_init(VALUE profiling_module);

static VALUE native_working_p(VALUE self);
static VALUE _native_ddtrace_rb_ractor_main_p(DDTRACE_UNUSED VALUE _self);

void DDTRACE_EXPORT Init_ddtrace_profiling_native_extension(void) {
VALUE datadog_module = rb_define_module("Datadog");
Expand All @@ -27,10 +29,18 @@ void DDTRACE_EXPORT Init_ddtrace_profiling_native_extension(void) {
collectors_stack_init(profiling_module);
http_transport_init(profiling_module);
stack_recorder_init(profiling_module);

// Hosts methods used for testing the native code using RSpec
VALUE testing_module = rb_define_module_under(native_extension_module, "Testing");
rb_define_singleton_method(testing_module, "_native_ddtrace_rb_ractor_main_p", _native_ddtrace_rb_ractor_main_p, 0);
}

static VALUE native_working_p(DDTRACE_UNUSED VALUE _self) {
self_test_clock_id();

return Qtrue;
}

static VALUE _native_ddtrace_rb_ractor_main_p(DDTRACE_UNUSED VALUE _self) {
return ddtrace_rb_ractor_main_p() ? Qtrue : Qfalse;
}
39 changes: 38 additions & 1 deletion spec/datadog/profiling/native_extension_spec.rb
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
# typed: false
# typed: ignore

require 'datadog/profiling/spec_helper'

Expand Down Expand Up @@ -160,4 +160,41 @@ def wait_for_thread_to_die
end
end
end

describe 'ddtrace_rb_ractor_main_p' do
subject(:ddtrace_rb_ractor_main_p) { Datadog::Profiling::NativeExtension::Testing._native_ddtrace_rb_ractor_main_p }

context 'when Ruby has no support for Ractors' do
before { skip 'Behavior does not apply to current Ruby version' if RUBY_VERSION >= '3' }

it { is_expected.to be true }
end

context 'when Ruby has support for Ractors' do
before { skip 'Behavior does not apply to current Ruby version' if RUBY_VERSION < '3' }

context 'on the main Ractor' do
it { is_expected.to be true }
end

context 'on a background Ractor' do
# @ivoanjo: When we initially added this test, our test suite kept deadlocking in CI in a later test (not on
# this one).
#
# It turns out that Ruby 3.0 Ractors seem to have some bug that even running `Ractor.new { 'hello' }.take` will
# cause a later spec to fail, usually with a (native C) stack with `gc_finalize_deferred`.
#
# I was able to see this even on both Linux with 3.0.3 and macOS with 3.0.4. Thus, I decided to skip this
# spec on Ruby 3.0. We can always run it manually if we change something around this helper; and we have
# coverage on 3.1+ anyway.
before { skip 'Ruby 3.0 Ractors are too buggy to run this spec' if RUBY_VERSION.start_with?('3.0.') }

subject(:ddtrace_rb_ractor_main_p) do
Ractor.new { Datadog::Profiling::NativeExtension::Testing._native_ddtrace_rb_ractor_main_p }.take
end

it { is_expected.to be false }
end
end
end
end