-
Notifications
You must be signed in to change notification settings - Fork 373
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[prof-heap] Misc improvements/fixes #3360
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #3360 +/- ##
=======================================
Coverage 98.23% 98.24%
=======================================
Files 1254 1254
Lines 73217 73287 +70
Branches 3430 3432 +2
=======================================
+ Hits 71928 72000 +72
+ Misses 1289 1287 -2 ☔ View full report in Codecov by Sentry. |
9806fcd
to
bd49938
Compare
a272aa5
to
1da1640
Compare
# would be invisible to free tracepoints, finalizers and without cleaning | ||
# obj_to_id_tbl mappings. This leads to potential invisible object re-use | ||
# to our heap profiler. | ||
$defs << '-DHAVE_WORKING_RB_GC_FORCE_RECYCLE' if RUBY_VERSION < '3.1' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So re: yesterday's discussion of how we may be able to workaround the rb_gc_force_recycle
, I think our intuition was correct (and re ruby/ruby#4363 ).
Here's a testcase (thanks ChatGPT for pairing with me on this):
--- a/ext/ddtrace_profiling_native_extension/profiling.c
+++ b/ext/ddtrace_profiling_native_extension/profiling.c
@@ -32,6 +32,7 @@ static void holding_the_gvl_signal_handler(DDTRACE_UNUSED int _signal, DDTRACE_U
static VALUE _native_trigger_holding_the_gvl_signal_handler_on(DDTRACE_UNUSED VALUE _self, VALUE background_thread);
static VALUE _native_enforce_success(DDTRACE_UNUSED VALUE _self, VALUE syserr_errno, VALUE with_gvl);
static void *trigger_enforce_success(void *trigger_args);
+static VALUE _native_force_recycle_test(DDTRACE_UNUSED VALUE _self);
void DDTRACE_EXPORT Init_ddtrace_profiling_native_extension(void) {
VALUE datadog_module = rb_define_module("Datadog");
@@ -65,6 +66,8 @@ void DDTRACE_EXPORT Init_ddtrace_profiling_native_extension(void) {
rb_define_singleton_method(testing_module, "_native_install_holding_the_gvl_signal_handler", _native_install_holding_the_gvl_signal_handler, 0);
rb_define_singleton_method(testing_module, "_native_trigger_holding_the_gvl_signal_handler_on", _native_trigger_holding_the_gvl_signal_handler_on, 1);
rb_define_singleton_method(testing_module, "_native_enforce_success", _native_enforce_success, 2);
+
+ rb_define_singleton_method(testing_module, "_native_force_recycle_test", _native_force_recycle_test, 0);
}
static VALUE _native_force_recycle_test(DDTRACE_UNUSED VALUE _self) {
int number_of_objs = 6000;
VALUE ruby_array = rb_ary_new_capa(number_of_objs);
VALUE object_class = rb_const_get(rb_cObject, rb_intern("Object"));
VALUE object_ids[number_of_objs];
VALUE object_ids_after_gc[number_of_objs];
for (int i = 0; i < number_of_objs; i++) {
VALUE ruby_object = rb_class_new_instance(0, NULL, object_class);
rb_ary_store(ruby_array, i, ruby_object);
object_ids[i] = rb_obj_id(ruby_object);
}
for (int i = 0; i < RARRAY_LEN(ruby_array); i++) {
VALUE ruby_object = rb_ary_entry(ruby_array, i);
rb_ary_store(ruby_array, i, Qnil);
rb_gc_force_recycle(ruby_object);
}
rb_gc_start();
for (int i = 0; i < RARRAY_LEN(ruby_array); i++) {
VALUE new_ruby_object = rb_class_new_instance(0, NULL, object_class);
if (FL_TEST(new_ruby_object, FL_SEEN_OBJ_ID)) {
printf("New object at index %d has FL_SEEN_OBJ_ID flag set after creation\n", i);
}
rb_ary_store(ruby_array, i, new_ruby_object);
object_ids_after_gc[i] = rb_obj_id(new_ruby_object);
if (!FL_TEST(new_ruby_object, FL_SEEN_OBJ_ID)) {
printf("New object at index %d is missing FL_SEEN_OBJ_ID after rb_obj_id call\n", i);
}
}
for (int i = 0; i < number_of_objs; i++) {
for (int j = 0; j < number_of_objs; j++) {
if (object_ids[i] == object_ids_after_gc[j]) {
printf("Object ID %ld is repeated after garbage collection (Indexes: %d, %d)\n", object_ids[i], i, j);
}
}
}
return Qtrue;
}
and here's the result I get on Ruby 2.7:
New object at index 5352 is missing FL_SEEN_OBJ_ID after rb_obj_id call
New object at index 5353 is missing FL_SEEN_OBJ_ID after rb_obj_id call
New object at index 5354 is missing FL_SEEN_OBJ_ID after rb_obj_id call
New object at index 5355 is missing FL_SEEN_OBJ_ID after rb_obj_id call
New object at index 5356 is missing FL_SEEN_OBJ_ID after rb_obj_id call
# ...
New object at index 5995 is missing FL_SEEN_OBJ_ID after rb_obj_id call
New object at index 5996 is missing FL_SEEN_OBJ_ID after rb_obj_id call
New object at index 5997 is missing FL_SEEN_OBJ_ID after rb_obj_id call
New object at index 5998 is missing FL_SEEN_OBJ_ID after rb_obj_id call
New object at index 5999 is missing FL_SEEN_OBJ_ID after rb_obj_id call
Object ID 219601 is repeated after garbage collection (Indexes: 5380, 5999)
Object ID 219641 is repeated after garbage collection (Indexes: 5381, 5998)
Object ID 219681 is repeated after garbage collection (Indexes: 5382, 5997)
Object ID 219721 is repeated after garbage collection (Indexes: 5383, 5996)
Object ID 219761 is repeated after garbage collection (Indexes: 5384, 5995)
# ...
Object ID 244201 is repeated after garbage collection (Indexes: 5995, 5356)
Object ID 244241 is repeated after garbage collection (Indexes: 5996, 5355)
Object ID 244281 is repeated after garbage collection (Indexes: 5997, 5354)
Object ID 244321 is repeated after garbage collection (Indexes: 5998, 5353)
Object ID 244361 is repeated after garbage collection (Indexes: 5999, 5352)
...which seems to match with our expectations:
- Object ids get repeated for objects that get GC'd with
rb_gc_force_recycle
- New objects born in their place do not have the
FL_SEEN_OBJ_ID
set - Buggy objects never get the
FL_SEEN_OBJ_ID
set after a call torb_obj_id
1da1640
to
d62e88b
Compare
d62e88b
to
a4c19cf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
I think we're still missing a few details stemming from the discussion around rb_gc_force_recycle
:
- Detecting when a slot got reused by recycle via
st_object_record_update
and stop tracking it - Maybe raising when we see a duplicate object id that doesn't seem related to the
rb_gc_force_recycle
bug - Maybe documenting the potential
FL_SEEN_OBJ_ID
workaround of setting it back when we find it missing?
...but my suggestion would be to close up this PR and do those changes in a follow-up PR.
static ID _inspect_id = Qnil; | ||
static ID _to_s_id = Qnil; | ||
|
||
void ruby_helpers_init(void) { | ||
rb_global_variable(&module_object_space); | ||
|
||
module_object_space = rb_const_get(rb_cObject, rb_intern("ObjectSpace")); | ||
_id2ref_id = rb_intern("_id2ref"); | ||
_inspect_id = rb_intern("inspect"); | ||
_to_s_id = rb_intern("to_s"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: Wait, was the _
above a convention? I thought it was because the symbol actually started with _
👀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🙈
[prof-heap] Fix types
1a4ca3b
to
d6184e2
Compare
Things related to force_recycle moved to #3366 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 LGTM
Co-authored-by: Ivo Anjo <ivo.anjo@datadoghq.com>
Co-authored-by: Ivo Anjo <ivo.anjo@datadoghq.com>
Co-authored-by: Ivo Anjo <ivo.anjo@datadoghq.com>
What does this PR do?
DD_PROFILING_EXPERIMENTAL_HEAP_SIZE_ENABLED
) and heap sample rate (DD_PROFILING_EXPERIMENTAL_HEAP_SAMPLE_RATE
).Motivation:
Additional Notes:
How to test the change?
For Datadog employees:
credentials of any kind, I've requested a review from
@DataDog/security-design-and-guidance
.Unsure? Have a question? Request a review!