Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PROF-8543] Ruby heap profiling PoC #3246

Closed
wants to merge 5 commits into from

Conversation

AlexJF
Copy link
Contributor

@AlexJF AlexJF commented Nov 7, 2023

What does this PR do?
A PoC for having live heap object counting in the Ruby profiler.

Motivation:
We want to add heap profiling capabilities to the Ruby profiler.

Additional Notes:

  • Lacks tests.
  • Written by someone that hasn't touched C in a long time.
  • Collected heap profiles are not aligned or synced with any GC data so they contain a lot of "noise" from objects that will go away in the next GC. This is something we plan to improve upon.

How to test the change?

  1. Grab/write any Ruby app
  2. Configure it to use dd-trace-rb from this PR
    1. Checkout the branch locally
    2. Add gem "ddtrace", :path => "<path to checkedout dd-trace-rb> to the Gemfile of the chosen app
  3. Run it with ddtracerb and the right envs like:
    export DD_PROFILING_ENABLED="true"
    export DD_PROFILING_EXPERIMENTAL_ALLOCATION_ENABLED="true"
    export DD_PROFILING_EXPERIMENTAL_HEAP_ENABLED="true"
    bundle exec ddtracerb exec ruby <path to main.rb>
    

Tested this locally along with backend and frontend PRs to support this end-to-end:
image

For Datadog employees:

  • If this PR touches code that signs or publishes builds or packages, or handles
    credentials of any kind, I've requested a review from @DataDog/security-design-and-guidance.
  • This PR doesn't touch any of that.

Unsure? Have a question? Request a review!

@github-actions github-actions bot added core Involves Datadog core libraries profiling Involves Datadog profiling labels Nov 7, 2023
@AlexJF AlexJF force-pushed the alexjf/prof-8543-ruby-heap-profiling-poc branch from d442e2b to 6f36290 Compare November 7, 2023 18:00
Copy link
Member

@ivoanjo ivoanjo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So... I'm still wrapping my head around the heap recorder, but here's a bunch of comments before I head out to lunch ;) (I fully reviewed other files, other than the heap_recorder)

benchmarks/profiler_sample_loop_v2.rb Outdated Show resolved Hide resolved
Comment on lines 336 to 341
option :allocation_counting_enabled do |o|
o.type :bool
o.env 'DD_PROFILING_EXPERIMENTAL_ALLOCATION_ENABLED'
o.default do
RUBY_VERSION.start_with?('2.') ||
(RUBY_VERSION.start_with?('3.1.') && RUBY_VERSION >= '3.1.4') ||
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was actually thinking of separating allocation counting (this current setting) from allocation profiling (and introducing an experimental_allocation_enabled). In particular, since counting is pretty simple and even enabled by default in a bunch of situations nowadays.

One advantage of separating them would be to avoid overriding the version checks above accidentally. Alternatively, we could move the version checks to profiling/component.rb and emit warnings about them -- e.g. "you're enabling this feature on a Ruby known to have bugs, be careful" kinda thing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in latest commit 👍

lib/datadog/core/configuration/settings.rb Outdated Show resolved Hide resolved
lib/datadog/profiling/component.rb Outdated Show resolved Hide resolved
lib/datadog/profiling/component.rb Outdated Show resolved Hide resolved
ext/ddtrace_profiling_native_extension/heap_recorder.c Outdated Show resolved Hide resolved
// not end up with a still active recording. new_obj still holds the object for this recording
active_recording->obj = 0;

heap_stack *heap_stack = heap_stack_init(locations);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess not for V1, but I'm thinking we can probably dedupe the locations at this step, even without acquiring the lock?

That is, if the lock in practice is protecting the reader (which is flushing the profile) from the writer (which is taking a sample), then it's ok if the writer accesses the hashtables, as long as it doesn't write to it.

(Of course we'd probably need to document that very clearly -- that on the flush path we would not be allowed to mutate the hashtables)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I want to rewrite things slightly to handle

} else {
// FIXME: Figure out a way to not have to instantiate a new stack only to free it if it's
// already sampled. Something like supporting indexing the heap_records table with
// ddog_prof_Slice_Location objects directly for instance.
which would allow us to more easily arrive at something that accomplishes what you describe.

ext/ddtrace_profiling_native_extension/heap_recorder.c Outdated Show resolved Hide resolved
ext/ddtrace_profiling_native_extension/heap_recorder.c Outdated Show resolved Hide resolved
Comment on lines 432 to 433
frame->name = ruby_strdup(location->function.name.ptr);
frame->filename = ruby_strdup(location->function.filename.ptr);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a couple of assumptions here that I'm not 100% sure about:

  • That ptr is always a null-terminated string: I think so... but... technically... we're getting this from a Ruby string, and I think it's possible for Ruby strings to not be null-terminated? Or at least I'm not entirely confident.
  • That there's always a value in the ptr

When we feed the locations to libdatadog, we give it a pointer and a length (which I'm not sure is always non-zero) so neither of those problems would occur, but here since we're just using the pointers without the length I'm a bit more unsure.

(Note that it doesn't mean we need to change stuff here -- maybe we just test these things when building the locations so we don't have to care here)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Adding a ruby_strndup helper to deal with this for now.

Copy link
Member

@ivoanjo ivoanjo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few more notes! :)

typedef struct {
VALUE obj;
unsigned int weight;
ddog_CharSlice *class_name;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're not using the class name yet, but do note this pointer is pointing at something that's stack-allocated, so it's only valid while taking the sample (aka it'll need to be copied if we want to keep it for later).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed it from here for now. There was an implicit assumption here that between start_heap_allocation_recording and end_heap_allocation_recording any passed parameters would remain alive but I'll tackle this differently later.

Comment on lines 181 to 185
if (!st_lookup(heap_recorder->heap_records, (st_data_t) heap_stack, (st_data_t*) &heap_record)) {
heap_record = heap_record_init(heap_stack);
if (st_insert(heap_recorder->heap_records, (st_data_t) heap_stack, (st_data_t) heap_record)) {
rb_raise(rb_eRuntimeError, "Duplicate heap stack tracking: %p", heap_stack);
return;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious, when would this happen?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think only if hash+cmp functions had a bug. But my time programming in Go left me with an innate fear of ignoring return codes therefore the explicit handling of this apparent impossibility

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, this is one of those throw "Should never happen" that in a few months actually shows up in prod 🤣

Comment on lines 462 to 489
int heap_stack_cmp(heap_stack *st1, heap_stack *st2) {
if (st1->frames_len != st2->frames_len) {
return (int) (st1->frames_len - st2->frames_len);
}
for (uint64_t i = 0; i < st1->frames_len; i++) {
int cmp = heap_frame_cmp(&st1->frames[i], &st2->frames[i]);
if (cmp != 0) {
return cmp;
}
}
return 0;
}

int heap_stack_cmp_st(st_data_t key1, st_data_t key2) {
return heap_stack_cmp((heap_stack *) key1, (heap_stack *) key2);
}

st_index_t heap_stack_hash(heap_stack *stack, st_index_t seed) {
st_index_t hash = seed;
for (uint64_t i = 0; i < stack->frames_len; i++) {
hash = heap_frame_hash(&stack->frames[i], hash);
}
return hash;
}

st_index_t heap_stack_hash_st(st_data_t key) {
return heap_stack_hash((heap_stack *) key, FNV1_32A_INIT);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: I wonder if we should keep the hash somewhere, to avoid recomputing over and over (as it may get a bit expensive)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ruby hash table implementation already keeps this around to avoid recomputations on existing items: https://github.com/ruby/ruby/blob/903b0931a116691386e71fa30fb1698bbd785853/st.c#L133-L137

So this is only called on an external key (which is used on external to st.c lookup/insertion/update/deletion) and these keys would almost always be new objects and thus not have hashes precomputed anyway so I think this is fine?

Copy link
Member

@ivoanjo ivoanjo Nov 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable (and neat that this is done for us!) -- TIL!

Comment on lines 195 to 200
if (st_insert(heap_recorder->object_records, (st_data_t) obj, (st_data_t) object_record) != 0) {
// Object already tracked?
object_record_free(object_record);
rb_raise(rb_eRuntimeError, "Duplicate heap object tracking: %lu", obj);
return;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If my suspicion that the free tracepoint may be skipped sometimes, we may see this one happening. 🤔

ext/ddtrace_profiling_native_extension/heap_recorder.c Outdated Show resolved Hide resolved
Comment on lines 245 to 249
fprintf(stderr, "Enqueuing sample for %lu (weight=%u free=%i)\n", new_sample.obj, new_sample.weight, new_sample.free);
if (heap_recorder->queued_samples_len >= MAX_QUEUE_LIMIT) {
fprintf(stderr, "Dropping sample on the floor.\n");
return;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may be problematic if we're dropping a free. We could prioritize frees here (e.g. clobber allocations in the buffer), but we'd always hit a limit still.

ext/ddtrace_profiling_native_extension/heap_recorder.c Outdated Show resolved Hide resolved
Comment on lines 394 to 404
int heap_frame_cmp(heap_frame *f1, heap_frame *f2) {
int cmp = strcmp(f1->name, f2->name);
if (cmp != 0) {
return cmp;
}
cmp = strcmp(f1->filename, f2->filename);
if (cmp != 0) {
return cmp;
}
return (int) (f1->line - f2->line);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: May be worth starting with the line first, since it's the cheapest check.

Base automatically changed from alexjf/ruby-native-dev-stuff to master November 20, 2023 10:51
@AlexJF AlexJF force-pushed the alexjf/prof-8543-ruby-heap-profiling-poc branch from 34ed3f0 to 2a11d34 Compare November 21, 2023 16:19
@AlexJF AlexJF force-pushed the alexjf/prof-8543-ruby-heap-profiling-poc branch from 2a11d34 to 564543c Compare November 21, 2023 16:29
@AlexJF
Copy link
Contributor Author

AlexJF commented Dec 13, 2023

Replaced with more production-ready versions starting from #3281

@AlexJF AlexJF closed this Dec 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Involves Datadog core libraries profiling Involves Datadog profiling
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants