Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Profiler] Make timer_create-based CPU profiler default #6606

Draft
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

gleocadie
Copy link
Collaborator

Summary of changes

This is another try at making the timer_create-based CPU profiler the default one. (previous PR #6315)

Reason for change

We had to revert the previous PR because
pthread_exit is explicitly called by the CLR, which starts cleaning up the thread thread_locals. But at the same time, the CPU timer_create-base kicks in and tries using, indirectly, those thread_locals.

This situation lead to crashes.
In this PR, we wrap pthread_create to remove the cpu timer on the thread before continuing with pthread_exit

Implementation details

  • Wrap pthread_exit to call __dd_on_thread_routine_finished before calling the real pthread_exit
  • Make timer_create as default cpu profiler

Test coverage

Updated the tests

Other details

@gleocadie gleocadie force-pushed the gleocadie/another-try-at-gaing-timer_create branch from ace076b to c47cbdd Compare January 30, 2025 10:36
@github-actions github-actions bot added the area:profiler Issues related to the continous-profiler label Jan 30, 2025
@datadog-ddstaging
Copy link

datadog-ddstaging bot commented Jan 30, 2025

Datadog Report

Branch report: gleocadie/another-try-at-gaing-timer_create
Commit report: 916eaf5
Test service: dd-trace-dotnet

❌ 734 Failed (2 Known Flaky), 243094 Passed, 2558 Skipped, 18h 47m 5.61s Total Time

❌ Failed Tests (734)

This report shows up to 5 failed tests.

  • SubmitsTracesV0 - Datadog.Trace.ClrProfiler.IntegrationTests.AdoNet.DapperTests - Details

    Expand for error
     Expected exit code: 0, actual exit code: 139.
    
  • SubmitsTracesV0 - Datadog.Trace.ClrProfiler.IntegrationTests.AdoNet.DapperTests - Details

    Expand for error
     Expected exit code: 0, actual exit code: 139.
    
  • SubmitsTracesV0 - Datadog.Trace.ClrProfiler.IntegrationTests.AdoNet.DapperTests - Details

    Expand for error
     Expected exit code: 0, actual exit code: 139.
    
  • SubmitsTracesV1 - Datadog.Trace.ClrProfiler.IntegrationTests.AdoNet.DapperTests - Details

    Expand for error
     Expected exit code: 0, actual exit code: 139.
    
  • SubmitsTracesV1 - Datadog.Trace.ClrProfiler.IntegrationTests.AdoNet.DapperTests - Details

    Expand for error
     Expected exit code: 0, actual exit code: 139.
    

@andrewlock
Copy link
Member

andrewlock commented Jan 30, 2025

Execution-Time Benchmarks Report ⏱️

Execution-time results for samples comparing the following branches/commits:

Execution-time benchmarks measure the whole time it takes to execute a program. And are intended to measure the one-off costs. Cases where the execution time results for the PR are worse than latest master results are shown in red. The following thresholds were used for comparing the execution times:

  • Welch test with statistical test for significance of 5%
  • Only results indicating a difference greater than 5% and 5 ms are considered.

Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard.

Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph).

gantt
    title Execution time (ms) FakeDbCommand (.NET Framework 4.6.2) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6606) - mean (69ms)  : 64, 75
     .   : milestone, 69,
    master - mean (69ms)  : 66, 71
     .   : milestone, 69,

    section CallTarget+Inlining+NGEN
    This PR (6606) - mean (996ms)  : 976, 1016
     .   : milestone, 996,
    master - mean (996ms)  : 972, 1019
     .   : milestone, 996,

Loading
gantt
    title Execution time (ms) FakeDbCommand (.NET Core 3.1) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6606) - mean (102ms)  : 99, 104
     .   : milestone, 102,
    master - mean (101ms)  : 99, 103
     .   : milestone, 101,

    section CallTarget+Inlining+NGEN
    This PR (6606) - mean (675ms)  : 653, 696
     .   : milestone, 675,
    master - mean (675ms)  : 660, 691
     .   : milestone, 675,

Loading
gantt
    title Execution time (ms) FakeDbCommand (.NET 6) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6606) - mean (89ms)  : 87, 91
     .   : milestone, 89,
    master - mean (89ms)  : 87, 91
     .   : milestone, 89,

    section CallTarget+Inlining+NGEN
    This PR (6606) - mean (634ms)  : 614, 654
     .   : milestone, 634,
    master - mean (634ms)  : 614, 654
     .   : milestone, 634,

Loading
gantt
    title Execution time (ms) HttpMessageHandler (.NET Framework 4.6.2) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6606) - mean (192ms)  : 188, 195
     .   : milestone, 192,
    master - mean (191ms)  : 187, 195
     .   : milestone, 191,

    section CallTarget+Inlining+NGEN
    This PR (6606) - mean (1,112ms)  : 1083, 1141
     .   : milestone, 1112,
    master - mean (1,113ms)  : 1085, 1140
     .   : milestone, 1113,

Loading
gantt
    title Execution time (ms) HttpMessageHandler (.NET Core 3.1) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6606) - mean (272ms)  : 268, 276
     .   : milestone, 272,
    master - mean (271ms)  : 266, 276
     .   : milestone, 271,

    section CallTarget+Inlining+NGEN
    This PR (6606) - mean (868ms)  : 834, 902
     .   : milestone, 868,
    master - mean (868ms)  : 837, 898
     .   : milestone, 868,

Loading
gantt
    title Execution time (ms) HttpMessageHandler (.NET 6) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6606) - mean (264ms)  : 260, 268
     .   : milestone, 264,
    master - mean (262ms)  : 258, 267
     .   : milestone, 262,

    section CallTarget+Inlining+NGEN
    This PR (6606) - mean (850ms)  : 814, 885
     .   : milestone, 850,
    master - mean (848ms)  : 818, 878
     .   : milestone, 848,

Loading

@andrewlock
Copy link
Member

andrewlock commented Jan 30, 2025

Benchmarks Report for tracer 🐌

Benchmarks for #6606 compared to master:

  • All benchmarks have the same speed
  • All benchmarks have the same allocations

The following thresholds were used for comparing the benchmark speeds:

  • Mann–Whitney U test with statistical test for significance of 5%
  • Only results indicating a difference greater than 10% and 0.3 ns are considered.

Allocation changes below 0.5% are ignored.

Benchmark details

Benchmarks.Trace.ActivityBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master StartStopWithChild net6.0 7.89μs 45ns 321ns 0.0148 0.00738 0 5.61 KB
master StartStopWithChild netcoreapp3.1 10.2μs 56ns 331ns 0.0244 0.00977 0 5.8 KB
master StartStopWithChild net472 16μs 61ns 236ns 1.04 0.299 0.102 6.22 KB
#6606 StartStopWithChild net6.0 7.85μs 44.8ns 320ns 0.0151 0.00757 0 5.61 KB
#6606 StartStopWithChild netcoreapp3.1 10.1μs 50.4ns 220ns 0.0208 0.0104 0 5.8 KB
#6606 StartStopWithChild net472 16.2μs 59ns 228ns 1.05 0.32 0.104 6.2 KB
Benchmarks.Trace.AgentWriterBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master WriteAndFlushEnrichedTraces net6.0 491μs 526ns 2.04μs 0 0 0 2.7 KB
master WriteAndFlushEnrichedTraces netcoreapp3.1 648μs 459ns 1.78μs 0 0 0 2.7 KB
master WriteAndFlushEnrichedTraces net472 870μs 491ns 1.77μs 0.431 0 0 3.3 KB
#6606 WriteAndFlushEnrichedTraces net6.0 489μs 357ns 1.38μs 0 0 0 2.7 KB
#6606 WriteAndFlushEnrichedTraces netcoreapp3.1 697μs 522ns 1.95μs 0 0 0 2.7 KB
#6606 WriteAndFlushEnrichedTraces net472 846μs 979ns 3.79μs 0.419 0 0 3.3 KB
Benchmarks.Trace.AspNetCoreBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master SendRequest net6.0 130μs 525ns 2.03μs 0.198 0 0 14.47 KB
master SendRequest netcoreapp3.1 143μs 500ns 1.94μs 0.218 0 0 17.27 KB
master SendRequest net472 0.00978ns 0.00216ns 0.00835ns 0 0 0 0 b
#6606 SendRequest net6.0 132μs 481ns 1.86μs 0.133 0 0 14.47 KB
#6606 SendRequest netcoreapp3.1 147μs 185ns 718ns 0.221 0 0 17.27 KB
#6606 SendRequest net472 0.0138ns 0.00157ns 0.0061ns 0 0 0 0 b
Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master WriteAndFlushEnrichedTraces net6.0 583μs 3.25μs 20.3μs 0.598 0 0 41.63 KB
master WriteAndFlushEnrichedTraces netcoreapp3.1 651μs 2.86μs 11.1μs 0.316 0 0 41.74 KB
master WriteAndFlushEnrichedTraces net472 836μs 3.85μs 14.4μs 8.19 2.59 0.431 53.27 KB
#6606 WriteAndFlushEnrichedTraces net6.0 588μs 3.33μs 22.3μs 0.279 0 0 41.61 KB
#6606 WriteAndFlushEnrichedTraces netcoreapp3.1 665μs 3.62μs 26.3μs 0.336 0 0 41.62 KB
#6606 WriteAndFlushEnrichedTraces net472 837μs 3.86μs 14.9μs 8.28 2.48 0.414 53.33 KB
Benchmarks.Trace.DbCommandBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master ExecuteNonQuery net6.0 1.35μs 1.2ns 4.65ns 0.0143 0 0 1.02 KB
master ExecuteNonQuery netcoreapp3.1 1.71μs 1.13ns 4.36ns 0.0138 0 0 1.02 KB
master ExecuteNonQuery net472 2.11μs 2.75ns 10.7ns 0.156 0.00106 0 987 B
#6606 ExecuteNonQuery net6.0 1.41μs 1.96ns 7.6ns 0.0141 0 0 1.02 KB
#6606 ExecuteNonQuery netcoreapp3.1 1.81μs 1.15ns 4.47ns 0.0136 0 0 1.02 KB
#6606 ExecuteNonQuery net472 2.03μs 2.53ns 9.8ns 0.156 0.00102 0 987 B
Benchmarks.Trace.ElasticsearchBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master CallElasticsearch net6.0 1.31μs 0.593ns 2.22ns 0.0139 0 0 976 B
master CallElasticsearch netcoreapp3.1 1.54μs 0.821ns 3.18ns 0.0128 0 0 976 B
master CallElasticsearch net472 2.51μs 1.61ns 6.23ns 0.157 0 0 995 B
master CallElasticsearchAsync net6.0 1.29μs 0.767ns 2.97ns 0.013 0 0 952 B
master CallElasticsearchAsync netcoreapp3.1 1.65μs 0.879ns 3.17ns 0.014 0 0 1.02 KB
master CallElasticsearchAsync net472 2.51μs 0.746ns 2.69ns 0.167 0 0 1.05 KB
#6606 CallElasticsearch net6.0 1.24μs 0.399ns 1.49ns 0.0137 0 0 976 B
#6606 CallElasticsearch netcoreapp3.1 1.53μs 1.07ns 4.15ns 0.0129 0 0 976 B
#6606 CallElasticsearch net472 2.6μs 1.21ns 4.67ns 0.158 0 0 995 B
#6606 CallElasticsearchAsync net6.0 1.3μs 0.918ns 3.43ns 0.013 0 0 952 B
#6606 CallElasticsearchAsync netcoreapp3.1 1.66μs 1.52ns 5.68ns 0.0134 0 0 1.02 KB
#6606 CallElasticsearchAsync net472 2.67μs 0.796ns 3.08ns 0.167 0 0 1.05 KB
Benchmarks.Trace.GraphQLBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master ExecuteAsync net6.0 1.19μs 0.383ns 1.48ns 0.0137 0 0 952 B
master ExecuteAsync netcoreapp3.1 1.62μs 1.5ns 5.79ns 0.0122 0 0 952 B
master ExecuteAsync net472 1.84μs 0.6ns 2.16ns 0.145 0 0 915 B
#6606 ExecuteAsync net6.0 1.3μs 0.773ns 2.89ns 0.013 0 0 952 B
#6606 ExecuteAsync netcoreapp3.1 1.58μs 1.52ns 5.27ns 0.0128 0 0 952 B
#6606 ExecuteAsync net472 1.89μs 2.17ns 8.42ns 0.145 0 0 915 B
Benchmarks.Trace.HttpClientBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master SendAsync net6.0 4.42μs 1.17ns 4.37ns 0.0309 0 0 2.31 KB
master SendAsync netcoreapp3.1 5.35μs 5.4ns 19.5ns 0.0376 0 0 2.85 KB
master SendAsync net472 7.42μs 1.17ns 4.37ns 0.494 0 0 3.12 KB
#6606 SendAsync net6.0 4.43μs 1.61ns 6.22ns 0.0332 0 0 2.31 KB
#6606 SendAsync netcoreapp3.1 5.44μs 1.86ns 6.7ns 0.0383 0 0 2.85 KB
#6606 SendAsync net472 7.44μs 1.1ns 4.1ns 0.494 0 0 3.12 KB
Benchmarks.Trace.ILoggerBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master EnrichedLog net6.0 1.47μs 0.602ns 2.33ns 0.0227 0 0 1.64 KB
master EnrichedLog netcoreapp3.1 2.18μs 0.913ns 3.41ns 0.0216 0 0 1.64 KB
master EnrichedLog net472 2.54μs 0.954ns 3.57ns 0.249 0 0 1.57 KB
#6606 EnrichedLog net6.0 1.62μs 0.988ns 3.83ns 0.0228 0 0 1.64 KB
#6606 EnrichedLog netcoreapp3.1 2.36μs 1.53ns 5.93ns 0.0222 0 0 1.64 KB
#6606 EnrichedLog net472 2.61μs 0.615ns 2.22ns 0.25 0 0 1.57 KB
Benchmarks.Trace.Log4netBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master EnrichedLog net6.0 113μs 140ns 525ns 0.0562 0 0 4.28 KB
master EnrichedLog netcoreapp3.1 119μs 152ns 590ns 0.0595 0 0 4.28 KB
master EnrichedLog net472 149μs 119ns 460ns 0.669 0.223 0 4.46 KB
#6606 EnrichedLog net6.0 115μs 221ns 857ns 0.0563 0 0 4.28 KB
#6606 EnrichedLog netcoreapp3.1 116μs 122ns 455ns 0.0581 0 0 4.28 KB
#6606 EnrichedLog net472 150μs 154ns 597ns 0.677 0.226 0 4.46 KB
Benchmarks.Trace.NLogBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master EnrichedLog net6.0 3.09μs 0.966ns 3.74ns 0.0307 0 0 2.2 KB
master EnrichedLog netcoreapp3.1 4.1μs 1.99ns 7.69ns 0.0287 0 0 2.2 KB
master EnrichedLog net472 4.82μs 1.91ns 7.15ns 0.319 0 0 2.02 KB
#6606 EnrichedLog net6.0 3.1μs 0.814ns 2.94ns 0.0305 0 0 2.2 KB
#6606 EnrichedLog netcoreapp3.1 4.04μs 1.4ns 5.42ns 0.0302 0 0 2.2 KB
#6606 EnrichedLog net472 4.93μs 1.49ns 5.59ns 0.32 0 0 2.02 KB
Benchmarks.Trace.RedisBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master SendReceive net6.0 1.33μs 0.776ns 3.01ns 0.0159 0 0 1.14 KB
master SendReceive netcoreapp3.1 1.88μs 2.78ns 10.8ns 0.015 0 0 1.14 KB
master SendReceive net472 2.05μs 0.792ns 2.96ns 0.183 0 0 1.16 KB
#6606 SendReceive net6.0 1.26μs 0.658ns 2.46ns 0.0158 0 0 1.14 KB
#6606 SendReceive netcoreapp3.1 1.7μs 0.739ns 2.86ns 0.0154 0 0 1.14 KB
#6606 SendReceive net472 2.08μs 0.292ns 1.13ns 0.183 0 0 1.16 KB
Benchmarks.Trace.SerilogBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master EnrichedLog net6.0 2.88μs 0.712ns 2.76ns 0.0217 0 0 1.6 KB
master EnrichedLog netcoreapp3.1 3.82μs 1.98ns 7.67ns 0.021 0 0 1.65 KB
master EnrichedLog net472 4.29μs 4.29ns 16ns 0.322 0 0 2.04 KB
#6606 EnrichedLog net6.0 2.8μs 1.45ns 5.61ns 0.0223 0 0 1.6 KB
#6606 EnrichedLog netcoreapp3.1 3.95μs 2.19ns 8.5ns 0.0217 0 0 1.65 KB
#6606 EnrichedLog net472 4.29μs 2.08ns 8.05ns 0.324 0 0 2.04 KB
Benchmarks.Trace.SpanBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master StartFinishSpan net6.0 397ns 0.466ns 1.8ns 0.00815 0 0 576 B
master StartFinishSpan netcoreapp3.1 607ns 0.892ns 3.45ns 0.0076 0 0 576 B
master StartFinishSpan net472 589ns 0.939ns 3.51ns 0.0918 0 0 578 B
master StartFinishScope net6.0 485ns 0.717ns 2.78ns 0.00983 0 0 696 B
master StartFinishScope netcoreapp3.1 742ns 2.23ns 8.62ns 0.00941 0 0 696 B
master StartFinishScope net472 828ns 1.73ns 6.72ns 0.104 0 0 658 B
#6606 StartFinishSpan net6.0 427ns 2.4ns 15.8ns 0.00812 0 0 576 B
#6606 StartFinishSpan netcoreapp3.1 600ns 1.81ns 7.01ns 0.00795 0 0 576 B
#6606 StartFinishSpan net472 591ns 0.489ns 1.89ns 0.0915 0 0 578 B
#6606 StartFinishScope net6.0 500ns 0.525ns 2.03ns 0.0097 0 0 696 B
#6606 StartFinishScope netcoreapp3.1 794ns 0.316ns 1.18ns 0.00952 0 0 696 B
#6606 StartFinishScope net472 829ns 1.29ns 4.98ns 0.104 0 0 658 B
Benchmarks.Trace.TraceAnnotationsBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master RunOnMethodBegin net6.0 643ns 0.873ns 3.38ns 0.0096 0 0 696 B
master RunOnMethodBegin netcoreapp3.1 947ns 1.87ns 7.23ns 0.00917 0 0 696 B
master RunOnMethodBegin net472 1.09μs 2.22ns 8.29ns 0.105 0 0 658 B
#6606 RunOnMethodBegin net6.0 676ns 0.422ns 1.63ns 0.00978 0 0 696 B
#6606 RunOnMethodBegin netcoreapp3.1 937ns 0.855ns 3.31ns 0.00947 0 0 696 B
#6606 RunOnMethodBegin net472 1.13μs 1.3ns 5.04ns 0.104 0 0 658 B

@gleocadie gleocadie force-pushed the gleocadie/another-try-at-gaing-timer_create branch 5 times, most recently from 0bd7fdb to 524002e Compare February 4, 2025 13:04
@gleocadie gleocadie force-pushed the gleocadie/another-try-at-gaing-timer_create branch from 524002e to 44c8009 Compare February 6, 2025 21:12
Copy link
Contributor

github-actions bot commented Feb 6, 2025

Snapshots difference summary

The following differences have been observed in committed snapshots. It is meant to help the reviewer.
The diff is simplistic, so please check some files anyway while we improve it.

@gleocadie gleocadie force-pushed the gleocadie/another-try-at-gaing-timer_create branch 2 times, most recently from 8d48fad to 916eaf5 Compare February 6, 2025 22:32
@gleocadie gleocadie force-pushed the gleocadie/another-try-at-gaing-timer_create branch from 916eaf5 to c6e5636 Compare February 17, 2025 12:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:profiler Issues related to the continous-profiler
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants