Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PROF-8299] Remove legacy profiler codepath #3172

Merged
merged 1 commit into from
Sep 29, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 0 additions & 1 deletion .rubocop.yml
Original file line number Diff line number Diff line change
Expand Up @@ -22,7 +22,6 @@ AllCops:
- 'integration/apps/*/bin/*'
- 'integration/apps/*/Gemfile'
- 'integration/apps/hanami/Rakefile'
- 'lib/datadog/profiling/pprof/pprof_pb.rb'
- 'spec/**/**/interesting_backtrace_helper.rb' # This file needs quite a few bizarre code patterns by design
- 'vendor/bundle/**/*'
- 'spec/datadog/tracing/contrib/grpc/support/gen/**/*.rb' # Skip protoc autogenerated code
Expand Down
29 changes: 4 additions & 25 deletions .rubocop_todo.yml
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
# This configuration was generated by
# `rubocop --auto-gen-config --exclude-limit 99999`
# on 2023-09-25 19:52:30 UTC using RuboCop version 1.44.1.
# on 2023-09-29 10:01:10 UTC using RuboCop version 1.50.2.
# The point is for the user to remove these configuration records
# one by one as the offenses are removed from the code base.
# Note that changes in the inspected code, or installation of new
Expand All @@ -20,26 +20,17 @@ Lint/MissingCopEnableDirective:
- 'spec/datadog/tracing/contrib/rails/support/middleware.rb'
- 'spec/support/container_helpers.rb'

# Offense count: 12
# Offense count: 7
Lint/MissingSuper:
Exclude:
- 'lib/datadog/core/workers/runtime_metrics.rb'
- 'lib/datadog/opentracer/scope.rb'
- 'lib/datadog/opentracer/span.rb'
- 'lib/datadog/opentracer/span_context.rb'
- 'lib/datadog/profiling/collectors/old_stack.rb'
- 'lib/datadog/profiling/old_recorder.rb'
- 'lib/datadog/profiling/pprof/converter.rb'
- 'lib/datadog/profiling/pprof/template.rb'
- 'lib/datadog/profiling/scheduler.rb'
- 'spec/support/faux_transport.rb'
- 'spec/support/spy_transport.rb'

# Offense count: 1
Lint/StructNewOverride:
Exclude:
- 'lib/datadog/profiling/pprof/converter.rb'

# Offense count: 1
# Configuration parameters: CountBlocks.
Metrics/BlockNesting:
Expand Down Expand Up @@ -105,12 +96,6 @@ Naming/FileName:
- YJIT
- IO

# Offense count: 2
Performance/MethodObjectAsBlock:
Exclude:
- 'lib/datadog/profiling/pprof/stack_sample.rb'
- 'lib/datadog/profiling/pprof/template.rb'

# Offense count: 3
# This cop supports unsafe autocorrection (--autocorrect-all).
RSpec/EmptyExampleGroup:
Expand Down Expand Up @@ -157,6 +142,7 @@ RSpec/RepeatedExampleGroupDescription:
- 'spec/datadog/tracing/contrib/redis/quantize_spec.rb'

# Offense count: 40
# This cop supports safe autocorrection (--autocorrect).
RSpec/ScatteredSetup:
Exclude:
- 'spec/datadog/opentracer/span_spec.rb'
Expand Down Expand Up @@ -189,7 +175,7 @@ Style/ClassVars:
- 'spec/datadog/tracing/contrib/rails/support/rails5.rb'
- 'spec/datadog/tracing/contrib/rails/support/rails6.rb'

# Offense count: 230
# Offense count: 223
# This cop supports unsafe autocorrection (--autocorrect-all).
# Configuration parameters: EnforcedStyle.
# SupportedStyles: always, always_true, never
Expand Down Expand Up @@ -236,7 +222,6 @@ Style/FrozenStringLiteralComment:
- 'lib/datadog/core/utils.rb'
- 'lib/datadog/core/utils/forking.rb'
- 'lib/datadog/core/utils/safe_dup.rb'
- 'lib/datadog/core/utils/string_table.rb'
- 'lib/datadog/core/workers/async.rb'
- 'lib/datadog/core/workers/polling.rb'
- 'lib/datadog/opentracer/distributed_headers.rb'
Expand All @@ -248,16 +233,10 @@ Style/FrozenStringLiteralComment:
- 'lib/datadog/profiling.rb'
- 'lib/datadog/profiling/collectors/cpu_and_wall_time_worker.rb'
- 'lib/datadog/profiling/collectors/idle_sampling_helper.rb'
- 'lib/datadog/profiling/collectors/old_stack.rb'
- 'lib/datadog/profiling/encoding/profile.rb'
- 'lib/datadog/profiling/exporter.rb'
- 'lib/datadog/profiling/ext/forking.rb'
- 'lib/datadog/profiling/http_transport.rb'
- 'lib/datadog/profiling/load_native_extension.rb'
- 'lib/datadog/profiling/old_recorder.rb'
- 'lib/datadog/profiling/pprof/converter.rb'
- 'lib/datadog/profiling/pprof/stack_sample.rb'
- 'lib/datadog/profiling/pprof/template.rb'
- 'lib/datadog/profiling/profiler.rb'
- 'lib/datadog/profiling/scheduler.rb'
- 'lib/datadog/profiling/stack_recorder.rb'
Expand Down
2 changes: 1 addition & 1 deletion Gemfile
Original file line number Diff line number Diff line change
Expand Up @@ -96,7 +96,7 @@ end
# @see https://github.com/DataDog/dogstatsd-ruby/issues/182
gem 'dogstatsd-ruby', '>= 3.3.0', '!= 5.0.0', '!= 5.0.1', '!= 5.1.0'

# Profiler optional dependencies
# Profiler testing dependencies
# NOTE: We're excluding versions 3.7.0 and 3.7.1 for the reasons documented in #1424.
# Since most of our customers won't have BUNDLE_FORCE_RUBY_PLATFORM=true, it's not something we want to add
# to our CI, so we just shortcut and exclude specific versions that were affecting our CI.
Expand Down
22 changes: 0 additions & 22 deletions Steepfile
Original file line number Diff line number Diff line change
Expand Up @@ -126,11 +126,9 @@ target :ddtrace do
ignore 'lib/datadog/core/utils/forking.rb'
ignore 'lib/datadog/core/utils/hash.rb' # Refinement module
ignore 'lib/datadog/core/utils/network.rb'
ignore 'lib/datadog/core/utils/object_set.rb'
ignore 'lib/datadog/core/utils/only_once.rb'
ignore 'lib/datadog/core/utils/safe_dup.rb'
ignore 'lib/datadog/core/utils/sequence.rb'
ignore 'lib/datadog/core/utils/string_table.rb'
ignore 'lib/datadog/core/utils/time.rb'
ignore 'lib/datadog/core/vendor/ipaddr.rb'
ignore 'lib/datadog/core/vendor/multipart-post/multipart.rb'
Expand Down Expand Up @@ -171,31 +169,11 @@ target :ddtrace do
ignore 'lib/datadog/opentracer/thread_local_scope.rb'
ignore 'lib/datadog/opentracer/thread_local_scope_manager.rb'
ignore 'lib/datadog/opentracer/tracer.rb'
ignore 'lib/datadog/profiling/backtrace_location.rb'
ignore 'lib/datadog/profiling/buffer.rb'
ignore 'lib/datadog/profiling/collectors/code_provenance.rb'
ignore 'lib/datadog/profiling/collectors/old_stack.rb'
ignore 'lib/datadog/profiling/collectors/stack.rb'
ignore 'lib/datadog/profiling/diagnostics/environment_logger.rb'
ignore 'lib/datadog/profiling/encoding/profile.rb'
ignore 'lib/datadog/profiling/event.rb'
ignore 'lib/datadog/profiling/events/stack.rb'
ignore 'lib/datadog/profiling/ext/forking.rb'
ignore 'lib/datadog/profiling/old_recorder.rb'
ignore 'lib/datadog/profiling/pprof/builder.rb'
ignore 'lib/datadog/profiling/pprof/converter.rb'
ignore 'lib/datadog/profiling/pprof/message_set.rb'
ignore 'lib/datadog/profiling/pprof/payload.rb'
ignore 'lib/datadog/profiling/pprof/pprof_pb.rb'
ignore 'lib/datadog/profiling/pprof/stack_sample.rb'
ignore 'lib/datadog/profiling/pprof/string_table.rb'
ignore 'lib/datadog/profiling/pprof/template.rb'
ignore 'lib/datadog/profiling/scheduler.rb'
ignore 'lib/datadog/profiling/tag_builder.rb'
ignore 'lib/datadog/profiling/tasks/exec.rb'
ignore 'lib/datadog/profiling/tasks/help.rb'
ignore 'lib/datadog/profiling/tasks/setup.rb'
ignore 'lib/datadog/profiling/trace_identifiers/ddtrace.rb'
ignore 'lib/datadog/tracing.rb'
ignore 'lib/datadog/tracing/analytics.rb'
ignore 'lib/datadog/tracing/buffer.rb'
Expand Down
83 changes: 0 additions & 83 deletions benchmarks/profiler_sample_loop.rb

This file was deleted.

104 changes: 31 additions & 73 deletions docs/ProfilingDevelopment.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
# Profiling Development

This file contains development notes specific to the profiling feature.
This file contains development notes specific to the continuous profiler.

For a more practical view of getting started with development of `ddtrace`, see <DevelopmentGuide.md>.

Expand All @@ -11,32 +11,20 @@ organized in Ruby classes, and the Ruby classes are still created in their corre

Components below live inside <../lib/datadog/profiling>:

* (Deprecated) `Collectors::OldStack`: Collects stack trace samples from Ruby threads for both CPU-time (if available) and wall-clock.
Runs on its own background thread.
* `Collectors::CodeProvenance`: Collects library metadata to power grouping and categorization of stack traces (e.g. to help distinguish user code,
from libraries, from the standard library, etc).
* `Collectors::ThreadContext`: Collects samples of living Ruby threads, based on external events (periodic timer, GC happening, allocations, ...),
recording a metric for them (such as elapsed cpu-time or wall-time, in a few cases) and labeling them with thread id and thread name, as well as
with ongoing tracing information, if any. Relies on the `Collectors::Stack` for the actual stack sampling.
* `Collectors::CpuAndWallTimeWorker`: Triggers the periodic execution of `Collectors::ThreadContext`.
* `Collectors::Stack`: Used to gather a stack trace from a given Ruby thread. Stores its output on a `StackRecorder`.

* (Deprecated) `Encoding::Profile::Protobuf`: Encodes gathered data into the pprof format.
* (Deprecated) `Events::Stack`, `Events::StackSample`: Entity classes used to represent stacks.
* `Ext::Forking`: Monkey patches `Kernel#fork`, adding a `Kernel#at_fork` callback mechanism which is used to restore
profiling abilities after the VM forks (such as re-instrumenting the main thread, and restarting profiler threads).
* (Deprecated) `Pprof::*` (in <../lib/datadog/profiling/pprof>): Used by `Encoding::Profile::Protobuf` to convert samples captured in
the `OldRecorder` into the pprof format.
* `Tasks::Setup`: Takes care of loading and applying `Ext::Forking``.
* `HttpTransport`: Implements transmission of profiling payloads to the Datadog agent or backend.
* (Deprecated) `TraceIdentifiers::*`: Used to retrieve trace id and span id from tracers, to be used to connect traces to profiles.
* (Deprecated) `BacktraceLocation`: Entity class used to represent an entry in a stack trace.
* (Deprecated) `Buffer`: Bounded buffer used to store profiling events.
* (Deprecated) `Event`
* `Flush`: Entity class used to represent the payload to be reported for a given profile.
* `Profiler`: Profiling entry point, which coordinates collectors and a scheduler.
* (Deprecated) `OldRecorder`: Stores profiling events gathered by the `Collector::OldStack`. (To be removed after migration to libddprof aggregation)
* `Exporter`: Gathers data from `OldRecorder` and `Collectors::CodeProvenance` to be reported as a profile.
* `Exporter`: Gathers data from `StackRecorder` and `Collectors::CodeProvenance` to be reported as a profile.
* `Scheduler`: Periodically (every 1 minute) takes data from the `Exporter` and pushes them to the configured transport.
Runs on its own background thread.
* `StackRecorder`: Stores stack samples in a native libdatadog data structure and exposes Ruby-level serialization APIs.
Expand All @@ -50,69 +38,39 @@ flow:
1. <../lib/datadog/profiling/preload.rb> triggers the creation of the profiler instance by calling the method `Datadog::Profiling.start_if_enabled`
2. Creation of the profiler instance is handled by `Datadog::Configuration`, which triggers the configuration of all
`ddtrace` components in `#build_components`
3. Inside `Datadog::Components`, the `build_profiler` method triggers the execution of the `Tasks::Setup` task
3. Inside `Datadog::Components`, the `build_profiler_component` method gets called
4. The `Setup` task activates our extensions (`Datadog::Profiling::Ext::Forking`)
5. Still inside `Datadog::Components`, the `build_profiler` method then creates and wires up the Profiler as such:
5. The `build_profiler_component` method then creates and wires up the Profiler as such:
```asciiflow
+------------+
| Profiler |
+-+-------+--+
| |
v v
+---------+--+ +-+---------+
| Collectors | | Scheduler |
+---------+--+ +-+-------+-+
| | |
v | v
+--------+-+ | +----+----------+
| OldStack | | | HttpTransport |
+--------+-+ | +---------------+
| |
v v
+---------+---+ ++---------+
| OldRecorder |<-| Exporter |
+-------------+ +-+--------+
|
v
+--------------+--+
| Code Provenance |
+-----------------+
+----------------------------------+
| Profiler |
+-+------------------------------+-+
| |
v v
+---------+------------------------+ +-+---------+
| Collectors::CpuAndWallTimeWorker | | Scheduler |
+---------+------------------------+ +-+-------+-+
| | |
| | v
| | +----+----------+
(... see "How sampling happens" ...) | | HttpTransport |
| | +---------------+
| |
v v
+-------+-------+ +--+-------+
| StackRecorder |<------------------| Exporter |
+---------------+ +--+-------+
|
v
+--------------+-------------+
| Collectors::CodeProvenance |
+----------------------------+
```
6. The profiler gets started when `startup!` is called by `Datadog::Configuration` after component creation.

### Work in progress

The profiler is undergoing a lot of refactoring. After this work is done, this is how we expect it will be wired up:

```asciiflow
+----------------------------------+
| Profiler |
+-+------------------------------+-+
| |
v v
+---------+------------------------+ +-+---------+
| Collectors::CpuAndWallTimeWorker | | Scheduler |
+---------+------------------------+ +-+-------+-+
| | |
| | v
| | +----+----------+
(... see "How sampling happens" ...) | | HttpTransport |
| | +---------------+
| |
v v
+-------+-------+ +--+-------+
| StackRecorder |<------------------| Exporter |
+---------------+ +--+-------+
|
v
+--------------+-------------+
| Collectors::CodeProvenance |
+----------------------------+
```

## Run-time execution

During run-time, the `Scheduler` and the `Collectors::CpuAndWallTimeWorker` (`Collectors::OldStack` for the legacy profiler) each execute on their own background thread.
During run-time, the `Scheduler` and the `Collectors::CpuAndWallTimeWorker` each execute on their own background thread.

The `Scheduler` wakes up every 1 minute to flush the results of the `Exporter` into the `HttpTransport` (see above).

Expand Down Expand Up @@ -156,7 +114,7 @@ It then kicks off a pipeline of components, each of which is responsible for gat
└────────────┘
```

All of these components are executed synchronously, and at the end of recording the sample in libdatadog, control is returned back "up"
All of these components are executed synchronously, and at the end of recording the sample in libdatadog, control is returned back "up" the stack
through each one, until the `CpuAndWallTimeWorker` finishes its work and control returns to the Ruby VM.

## How linking of traces to profiles works
Expand All @@ -178,8 +136,8 @@ whenever they share the same `runtime-id`.
To further enable filtering of a profile to show only samples related to a given trace/span, each sample taken by the
profiler is tagged with the `local root span id` and `span id` for the given trace/span.

This is done using the `Datadog::Profiling::TraceIdentifiers::Helper` that retrieves a `root_span_id` and `span_id`, if
available, from the supported tracers. This helper is called by the `Collectors::OldStack` during sampling.
This is done inside the `Collectors::ThreadContext` that retrieves a `root_span_id` and `span_id`, if
available, from the tracer.

Note that if a given trace executes too fast, it's possible that the profiler will not contain any samples for that
specific trace. Nevertheless, the linking still works and is useful, as it allows users to explore what was going on their
Expand Down
Loading
Loading