[PROF-8543] Ruby heap profiling PoC #3246

AlexJF · 2023-11-07T17:52:39Z

What does this PR do?
A PoC for having live heap object counting in the Ruby profiler.

Motivation:
We want to add heap profiling capabilities to the Ruby profiler.

Additional Notes:

Lacks tests.
Written by someone that hasn't touched C in a long time.
Collected heap profiles are not aligned or synced with any GC data so they contain a lot of "noise" from objects that will go away in the next GC. This is something we plan to improve upon.

How to test the change?

Grab/write any Ruby app
Configure it to use dd-trace-rb from this PR
1. Checkout the branch locally
2. Add gem "ddtrace", :path => "<path to checkedout dd-trace-rb> to the Gemfile of the chosen app

Run it with ddtracerb and the right envs like:

export DD_PROFILING_ENABLED="true"
export DD_PROFILING_EXPERIMENTAL_ALLOCATION_ENABLED="true"
export DD_PROFILING_EXPERIMENTAL_HEAP_ENABLED="true"
bundle exec ddtracerb exec ruby <path to main.rb>

Tested this locally along with backend and frontend PRs to support this end-to-end:

For Datadog employees:

If this PR touches code that signs or publishes builds or packages, or handles
credentials of any kind, I've requested a review from @DataDog/security-design-and-guidance.
This PR doesn't touch any of that.

Unsure? Have a question? Request a review!

ivoanjo

So... I'm still wrapping my head around the heap recorder, but here's a bunch of comments before I head out to lunch ;) (I fully reviewed other files, other than the heap_recorder)

benchmarks/profiler_sample_loop_v2.rb

ivoanjo · 2023-11-08T10:15:00Z

lib/datadog/core/configuration/settings.rb

            option :allocation_counting_enabled do |o|
+              o.type :bool
+              o.env 'DD_PROFILING_EXPERIMENTAL_ALLOCATION_ENABLED'
              o.default do
                RUBY_VERSION.start_with?('2.') ||
                  (RUBY_VERSION.start_with?('3.1.') && RUBY_VERSION >= '3.1.4') ||


I was actually thinking of separating allocation counting (this current setting) from allocation profiling (and introducing an experimental_allocation_enabled). In particular, since counting is pretty simple and even enabled by default in a bunch of situations nowadays.

One advantage of separating them would be to avoid overriding the version checks above accidentally. Alternatively, we could move the version checks to profiling/component.rb and emit warnings about them -- e.g. "you're enabling this feature on a Ruby known to have bugs, be careful" kinda thing.

Done in latest commit 👍

lib/datadog/core/configuration/settings.rb

lib/datadog/profiling/component.rb

ext/ddtrace_profiling_native_extension/heap_recorder.c

ivoanjo · 2023-11-08T12:26:39Z

ext/ddtrace_profiling_native_extension/heap_recorder.c

+  // not end up with a still active recording. new_obj still holds the object for this recording
+  active_recording->obj = 0;
+
+  heap_stack *heap_stack = heap_stack_init(locations);


I guess not for V1, but I'm thinking we can probably dedupe the locations at this step, even without acquiring the lock?

That is, if the lock in practice is protecting the reader (which is flushing the profile) from the writer (which is taking a sample), then it's ok if the writer accesses the hashtables, as long as it doesn't write to it.

(Of course we'd probably need to document that very clearly -- that on the flush path we would not be allowed to mutate the hashtables)

Yeah I want to rewrite things slightly to handle

dd-trace-rb/ext/ddtrace_profiling_native_extension/heap_recorder.c

Lines 187 to 190 in 34ed3f0

} else {

// FIXME: Figure out a way to not have to instantiate a new stack only to free it if it's

// already sampled. Something like supporting indexing the heap_records table with

// ddog_prof_Slice_Location objects directly for instance.

which would allow us to more easily arrive at something that accomplishes what you describe.

ext/ddtrace_profiling_native_extension/heap_recorder.c

ivoanjo · 2023-11-08T12:40:13Z

ext/ddtrace_profiling_native_extension/heap_recorder.c

+    frame->name = ruby_strdup(location->function.name.ptr);
+    frame->filename = ruby_strdup(location->function.filename.ptr);


There's a couple of assumptions here that I'm not 100% sure about:

That ptr is always a null-terminated string: I think so... but... technically... we're getting this from a Ruby string, and I think it's possible for Ruby strings to not be null-terminated? Or at least I'm not entirely confident.

That there's always a value in the ptr

When we feed the locations to libdatadog, we give it a pointer and a length (which I'm not sure is always non-zero) so neither of those problems would occur, but here since we're just using the pointers without the length I'm a bit more unsure.

(Note that it doesn't mean we need to change stuff here -- maybe we just test these things when building the locations so we don't have to care here)

👍

Adding a ruby_strndup helper to deal with this for now.

ivoanjo

Left a few more notes! :)

ivoanjo · 2023-11-08T17:00:51Z

ext/ddtrace_profiling_native_extension/heap_recorder.c

+typedef struct {
+  VALUE obj;
+  unsigned int weight;
+  ddog_CharSlice *class_name;


We're not using the class name yet, but do note this pointer is pointing at something that's stack-allocated, so it's only valid while taking the sample (aka it'll need to be copied if we want to keep it for later).

Removed it from here for now. There was an implicit assumption here that between start_heap_allocation_recording and end_heap_allocation_recording any passed parameters would remain alive but I'll tackle this differently later.

ivoanjo · 2023-11-08T17:07:16Z

ext/ddtrace_profiling_native_extension/heap_recorder.c

+  if (!st_lookup(heap_recorder->heap_records, (st_data_t) heap_stack, (st_data_t*) &heap_record)) {
+    heap_record = heap_record_init(heap_stack);
+    if (st_insert(heap_recorder->heap_records, (st_data_t) heap_stack, (st_data_t) heap_record)) {
+      rb_raise(rb_eRuntimeError, "Duplicate heap stack tracking: %p", heap_stack);
+      return;


I'm curious, when would this happen?

I think only if hash+cmp functions had a bug. But my time programming in Go left me with an innate fear of ignoring return codes therefore the explicit handling of this apparent impossibility

Ah, this is one of those throw "Should never happen" that in a few months actually shows up in prod 🤣

ivoanjo · 2023-11-08T17:08:29Z

ext/ddtrace_profiling_native_extension/heap_recorder.c

+int heap_stack_cmp(heap_stack *st1, heap_stack *st2) {
+  if (st1->frames_len != st2->frames_len) {
+    return (int) (st1->frames_len - st2->frames_len);
+  }
+  for (uint64_t i = 0; i < st1->frames_len; i++) {
+    int cmp = heap_frame_cmp(&st1->frames[i], &st2->frames[i]);
+    if (cmp != 0) {
+      return cmp;
+    }
+  }
+  return 0;
+}
+
+int heap_stack_cmp_st(st_data_t key1, st_data_t key2) {
+  return heap_stack_cmp((heap_stack *) key1, (heap_stack *) key2);
+}
+
+st_index_t heap_stack_hash(heap_stack *stack, st_index_t seed) {
+  st_index_t hash = seed;
+  for (uint64_t i = 0; i < stack->frames_len; i++) {
+    hash = heap_frame_hash(&stack->frames[i], hash);
+  }
+  return hash;
+}
+
+st_index_t heap_stack_hash_st(st_data_t key) {
+  return heap_stack_hash((heap_stack *) key, FNV1_32A_INIT);
+}


Minor: I wonder if we should keep the hash somewhere, to avoid recomputing over and over (as it may get a bit expensive)

The ruby hash table implementation already keeps this around to avoid recomputations on existing items: https://github.com/ruby/ruby/blob/903b0931a116691386e71fa30fb1698bbd785853/st.c#L133-L137

So this is only called on an external key (which is used on external to st.c lookup/insertion/update/deletion) and these keys would almost always be new objects and thus not have hashes precomputed anyway so I think this is fine?

Seems reasonable (and neat that this is done for us!) -- TIL!

ivoanjo · 2023-11-08T17:11:35Z

ext/ddtrace_profiling_native_extension/heap_recorder.c

+  if (st_insert(heap_recorder->object_records, (st_data_t) obj, (st_data_t) object_record) != 0) {
+    // Object already tracked?
+    object_record_free(object_record);
+    rb_raise(rb_eRuntimeError, "Duplicate heap object tracking: %lu", obj);
+    return;
+  }


If my suspicion that the free tracepoint may be skipped sometimes, we may see this one happening. 🤔

ext/ddtrace_profiling_native_extension/heap_recorder.c

ivoanjo · 2023-11-08T17:17:08Z

ext/ddtrace_profiling_native_extension/heap_recorder.c

+  fprintf(stderr, "Enqueuing sample for %lu (weight=%u free=%i)\n", new_sample.obj, new_sample.weight, new_sample.free);
+  if (heap_recorder->queued_samples_len >= MAX_QUEUE_LIMIT) {
+    fprintf(stderr, "Dropping sample on the floor.\n");
+    return;
+  }


This may be problematic if we're dropping a free. We could prioritize frees here (e.g. clobber allocations in the buffer), but we'd always hit a limit still.

ext/ddtrace_profiling_native_extension/heap_recorder.c

ivoanjo · 2023-11-08T17:23:26Z

ext/ddtrace_profiling_native_extension/heap_recorder.c

+int heap_frame_cmp(heap_frame *f1, heap_frame *f2) {
+  int cmp = strcmp(f1->name, f2->name);
+  if (cmp != 0) {
+    return cmp;
+  }
+  cmp = strcmp(f1->filename, f2->filename);
+  if (cmp != 0) {
+    return cmp;
+  }
+  return (int) (f1->line - f2->line);
+}


Minor: May be worth starting with the line first, since it's the cheapest check.

AlexJF · 2023-12-13T13:04:31Z

Replaced with more production-ready versions starting from #3281

Setup clangd-based native dev env via rake native_dev:setup

b18d2f8

github-actions bot added core Involves Datadog core libraries profiling Involves Datadog profiling labels Nov 7, 2023

[PROF-8543] Ruby heap profiling PoC

6f36290

AlexJF force-pushed the alexjf/prof-8543-ruby-heap-profiling-poc branch from d442e2b to 6f36290 Compare November 7, 2023 18:00

[PROF-8543] Sprinkle some fprintfs for CI test debugging

646dd5d

ivoanjo reviewed Nov 8, 2023

View reviewed changes

AlexJF mentioned this pull request Nov 17, 2023

[PROF-8632] Ruby heap size profiling PoC #3261

Closed

2 tasks

Base automatically changed from alexjf/ruby-native-dev-stuff to master November 20, 2023 10:51

Merge branch 'master' into alexjf/prof-8543-ruby-heap-profiling-poc

cd20baf

AlexJF force-pushed the alexjf/prof-8543-ruby-heap-profiling-poc branch from 34ed3f0 to 2a11d34 Compare November 21, 2023 16:19

[PROF-8543] Address comments

564543c

AlexJF force-pushed the alexjf/prof-8543-ruby-heap-profiling-poc branch from 2a11d34 to 564543c Compare November 21, 2023 16:29

AlexJF closed this Dec 13, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[PROF-8543] Ruby heap profiling PoC #3246

[PROF-8543] Ruby heap profiling PoC #3246

AlexJF commented Nov 7, 2023 •

edited

Loading

ivoanjo left a comment

ivoanjo Nov 8, 2023

AlexJF Nov 21, 2023

ivoanjo Nov 8, 2023

AlexJF Nov 21, 2023

ivoanjo Nov 8, 2023

AlexJF Nov 21, 2023

ivoanjo left a comment

ivoanjo Nov 8, 2023

AlexJF Nov 21, 2023

ivoanjo Nov 8, 2023

AlexJF Nov 21, 2023

ivoanjo Nov 21, 2023

ivoanjo Nov 8, 2023

AlexJF Nov 21, 2023

ivoanjo Nov 21, 2023 •

edited

Loading

ivoanjo Nov 8, 2023

ivoanjo Nov 8, 2023

ivoanjo Nov 8, 2023

AlexJF commented Dec 13, 2023 •

edited

Loading

	} else {
	// FIXME: Figure out a way to not have to instantiate a new stack only to free it if it's
	// already sampled. Something like supporting indexing the heap_records table with
	// ddog_prof_Slice_Location objects directly for instance.

		frame->name = ruby_strdup(location->function.name.ptr);
		frame->filename = ruby_strdup(location->function.filename.ptr);

[PROF-8543] Ruby heap profiling PoC #3246

[PROF-8543] Ruby heap profiling PoC #3246

Conversation

AlexJF commented Nov 7, 2023 • edited Loading

ivoanjo left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ivoanjo left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ivoanjo Nov 21, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

AlexJF commented Dec 13, 2023 • edited Loading

AlexJF commented Nov 7, 2023 •

edited

Loading

ivoanjo Nov 21, 2023 •

edited

Loading

AlexJF commented Dec 13, 2023 •

edited

Loading