Filter out dead threads in runtime metrics #6298
LGTM 🙈
Datadog Report
Branch report: ✅ 0 Failed, 455778 Passed, 2751 Skipped, 19h 56m 28.36s Total Time
Execution-Time Benchmarks Report ⏱️
Execution-time results for samples comparing the following branches/commits:
Execution-time benchmarks measure the whole time it takes to execute a program and are intended to measure the one-off costs. Cases where the execution time results for the PR are worse than latest master results are shown in red. The following thresholds were used for comparing the execution times:
Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard. The graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run; their values are tabulated here.
Execution time (ms)
Sample | Runtime | Scenario | Branch | Mean (ms) | p99 interval (ms) |
---|---|---|---|---|---|
FakeDbCommand | .NET Framework 4.6.2 | Baseline | This PR (6298) | 72 | 63 to 81 |
FakeDbCommand | .NET Framework 4.6.2 | Baseline | master | 72 | 64 to 80 |
FakeDbCommand | .NET Framework 4.6.2 | CallTarget+Inlining+NGEN | This PR (6298) | 1,112 | 1091 to 1133 |
FakeDbCommand | .NET Framework 4.6.2 | CallTarget+Inlining+NGEN | master | 1,107 | 1085 to 1128 |
FakeDbCommand | .NET Core 3.1 | Baseline | This PR (6298) | 109 | 107 to 111 |
FakeDbCommand | .NET Core 3.1 | Baseline | master | 108 | 106 to 110 |
FakeDbCommand | .NET Core 3.1 | CallTarget+Inlining+NGEN | This PR (6298) | 770 | 757 to 784 |
FakeDbCommand | .NET Core 3.1 | CallTarget+Inlining+NGEN | master | 771 | 757 to 785 |
FakeDbCommand | .NET 6 | Baseline | This PR (6298) | 93 | 90 to 97 |
FakeDbCommand | .NET 6 | Baseline | master | 92 | 90 to 93 |
FakeDbCommand | .NET 6 | CallTarget+Inlining+NGEN | This PR (6298) | 731 | 711 to 750 |
FakeDbCommand | .NET 6 | CallTarget+Inlining+NGEN | master | 726 | 707 to 745 |
HttpMessageHandler | .NET Framework 4.6.2 | Baseline | This PR (6298) | 191 | 185 to 197 |
HttpMessageHandler | .NET Framework 4.6.2 | Baseline | master | 191 | 186 to 195 |
HttpMessageHandler | .NET Framework 4.6.2 | CallTarget+Inlining+NGEN | This PR (6298) | 1,218 | 1189 to 1247 |
HttpMessageHandler | .NET Framework 4.6.2 | CallTarget+Inlining+NGEN | master | 1,213 | 1190 to 1236 |
HttpMessageHandler | .NET Core 3.1 | Baseline | This PR (6298) | 277 | 272 to 282 |
HttpMessageHandler | .NET Core 3.1 | Baseline | master | 276 | 271 to 281 |
HttpMessageHandler | .NET Core 3.1 | CallTarget+Inlining+NGEN | This PR (6298) | 949 | 934 to 965 |
HttpMessageHandler | .NET Core 3.1 | CallTarget+Inlining+NGEN | master | 948 | 929 to 967 |
HttpMessageHandler | .NET 6 | Baseline | This PR (6298) | 267 | 263 to 270 |
HttpMessageHandler | .NET 6 | Baseline | master | 265 | 261 to 270 |
HttpMessageHandler | .NET 6 | CallTarget+Inlining+NGEN | This PR (6298) | 932 | 914 to 951 |
HttpMessageHandler | .NET 6 | CallTarget+Inlining+NGEN | master | 931 | 913 to 950 |
Benchmarks Report for tracer 🐌
Benchmarks for #6298 compared to master:
The following thresholds were used for comparing the benchmark speeds:
Allocation changes below 0.5% are ignored.
Benchmark details
Benchmarks.Trace.ActivityBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.AgentWriterBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.AspNetCoreBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.DbCommandBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.ElasticsearchBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.GraphQLBenchmark - Faster 🎉 Same allocations ✔️
Benchmark | base/diff | Base Median (ns) | Diff Median (ns) | Modality |
---|---|---|---|---|
Benchmarks.Trace.GraphQLBenchmark.ExecuteAsync‑net6.0 | 1.197 | 1,508.61 | 1,260.67 |
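The ratio column can be checked by hand. Assuming base/diff is simply the base median divided by the diff median (a reading of the column name, not something the report states), a value above 1 means the PR is faster. A minimal sketch of that check, with `median_ratio` as a hypothetical helper name:

```python
def median_ratio(base_ns, diff_ns):
    """Ratio of the base (master) median to the diff (PR) median, both in ns."""
    return base_ns / diff_ns


# Medians copied from the GraphQLBenchmark.ExecuteAsync net6.0 row.
ratio = median_ratio(1508.61, 1260.67)
print(f"{ratio:.3f}")  # prints 1.197, the value in the base/diff column
```

The diff/base columns used for the slower benchmarks would be the inverse under the same assumption.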
Raw results
Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|---|---|
master | ExecuteAsync | net6.0 | 1.51μs | 0.534ns | 2.07ns | 0.0134 | 0 | 0 | 952 B |
master | ExecuteAsync | netcoreapp3.1 | 1.59μs | 0.857ns | 3.2ns | 0.0127 | 0 | 0 | 952 B |
master | ExecuteAsync | net472 | 1.8μs | 0.547ns | 2.12ns | 0.145 | 0 | 0 | 915 B |
#6298 | ExecuteAsync | net6.0 | 1.26μs | 0.763ns | 2.86ns | 0.0135 | 0 | 0 | 952 B |
#6298 | ExecuteAsync | netcoreapp3.1 | 1.61μs | 1.21ns | 4.68ns | 0.0128 | 0 | 0 | 952 B |
#6298 | ExecuteAsync | net472 | 1.76μs | 0.457ns | 1.77ns | 0.145 | 0 | 0 | 915 B |
Benchmarks.Trace.HttpClientBenchmark - Same speed ✔️ Same allocations ✔️
Raw results
Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|---|---|
master | SendAsync | net6.0 | 4.3μs | 3.14ns | 12.2ns | 0.0325 | 0 | 0 | 2.31 KB |
master | SendAsync | netcoreapp3.1 | 5.23μs | 5.65ns | 21.9ns | 0.0368 | 0 | 0 | 2.85 KB |
master | SendAsync | net472 | 7.23μs | 8.75ns | 33.9ns | 0.494 | 0 | 0 | 3.12 KB |
#6298 | SendAsync | net6.0 | 4.45μs | 1.91ns | 7.4ns | 0.0312 | 0 | 0 | 2.31 KB |
#6298 | SendAsync | netcoreapp3.1 | 5.25μs | 1.69ns | 6.31ns | 0.0391 | 0 | 0 | 2.85 KB |
#6298 | SendAsync | net472 | 7.4μs | 1.97ns | 7.64ns | 0.492 | 0 | 0 | 3.12 KB |
Benchmarks.Trace.ILoggerBenchmark - Same speed ✔️ Same allocations ✔️
Raw results
Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|---|---|
master | EnrichedLog | net6.0 | 1.58μs | 0.613ns | 2.3ns | 0.0232 | 0 | 0 | 1.64 KB |
master | EnrichedLog | netcoreapp3.1 | 2.19μs | 1.32ns | 4.76ns | 0.0221 | 0 | 0 | 1.64 KB |
master | EnrichedLog | net472 | 2.7μs | 5.51ns | 21.3ns | 0.25 | 0 | 0 | 1.57 KB |
#6298 | EnrichedLog | net6.0 | 1.49μs | 1.34ns | 5.03ns | 0.0228 | 0 | 0 | 1.64 KB |
#6298 | EnrichedLog | netcoreapp3.1 | 2.28μs | 2.46ns | 9.22ns | 0.0227 | 0 | 0 | 1.64 KB |
#6298 | EnrichedLog | net472 | 2.56μs | 1.57ns | 5.88ns | 0.249 | 0 | 0 | 1.57 KB |
Benchmarks.Trace.Log4netBenchmark - Same speed ✔️ Same allocations ✔️
Raw results
Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|---|---|
master | EnrichedLog | net6.0 | 119μs | 237ns | 918ns | 0.06 | 0 | 0 | 4.28 KB |
master | EnrichedLog | netcoreapp3.1 | 124μs | 114ns | 443ns | 0 | 0 | 0 | 4.28 KB |
master | EnrichedLog | net472 | 152μs | 187ns | 725ns | 0.687 | 0.229 | 0 | 4.46 KB |
#6298 | EnrichedLog | net6.0 | 118μs | 153ns | 594ns | 0 | 0 | 0 | 4.28 KB |
#6298 | EnrichedLog | netcoreapp3.1 | 124μs | 117ns | 453ns | 0 | 0 | 0 | 4.28 KB |
#6298 | EnrichedLog | net472 | 153μs | 123ns | 478ns | 0.692 | 0.231 | 0 | 4.46 KB |
Benchmarks.Trace.NLogBenchmark - Same speed ✔️ Same allocations ✔️
Raw results
Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|---|---|
master | EnrichedLog | net6.0 | 3.07μs | 2.25ns | 8.73ns | 0.0306 | 0 | 0 | 2.2 KB |
master | EnrichedLog | netcoreapp3.1 | 4.01μs | 1.44ns | 5.57ns | 0.03 | 0 | 0 | 2.2 KB |
master | EnrichedLog | net472 | 4.88μs | 5.97ns | 23.1ns | 0.32 | 0 | 0 | 2.02 KB |
#6298 | EnrichedLog | net6.0 | 3.05μs | 0.932ns | 3.61ns | 0.0304 | 0 | 0 | 2.2 KB |
#6298 | EnrichedLog | netcoreapp3.1 | 4.22μs | 1.97ns | 7.64ns | 0.0297 | 0 | 0 | 2.2 KB |
#6298 | EnrichedLog | net472 | 4.86μs | 1.61ns | 6.03ns | 0.318 | 0 | 0 | 2.02 KB |
Benchmarks.Trace.RedisBenchmark - Slower ⚠️ Same allocations ✔️
Slower ⚠️ in #6298
Benchmark | diff/base | Base Median (ns) | Diff Median (ns) | Modality |
---|---|---|---|---|
Benchmarks.Trace.RedisBenchmark.SendReceive‑net6.0 | 1.113 | 1,316.63 | 1,464.89 |
Raw results
Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|---|---|
master | SendReceive | net6.0 | 1.32μs | 0.646ns | 2.5ns | 0.0158 | 0 | 0 | 1.14 KB |
master | SendReceive | netcoreapp3.1 | 1.73μs | 1.12ns | 4.34ns | 0.0148 | 0 | 0 | 1.14 KB |
master | SendReceive | net472 | 1.99μs | 1.25ns | 4.84ns | 0.183 | 0 | 0 | 1.16 KB |
#6298 | SendReceive | net6.0 | 1.46μs | 0.554ns | 2.14ns | 0.0161 | 0 | 0 | 1.14 KB |
#6298 | SendReceive | netcoreapp3.1 | 1.82μs | 1.49ns | 5.77ns | 0.0154 | 0 | 0 | 1.14 KB |
#6298 | SendReceive | net472 | 2.14μs | 0.847ns | 3.28ns | 0.183 | 0 | 0 | 1.16 KB |
Benchmarks.Trace.SerilogBenchmark - Same speed ✔️ Same allocations ✔️
Raw results
Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|---|---|
master | EnrichedLog | net6.0 | 2.95μs | 0.924ns | 3.46ns | 0.022 | 0 | 0 | 1.6 KB |
master | EnrichedLog | netcoreapp3.1 | 3.85μs | 2.7ns | 10.5ns | 0.021 | 0 | 0 | 1.65 KB |
master | EnrichedLog | net472 | 4.3μs | 1.44ns | 5.4ns | 0.323 | 0 | 0 | 2.04 KB |
#6298 | EnrichedLog | net6.0 | 2.78μs | 2.15ns | 8.32ns | 0.0222 | 0 | 0 | 1.6 KB |
#6298 | EnrichedLog | netcoreapp3.1 | 3.96μs | 1.04ns | 3.88ns | 0.0216 | 0 | 0 | 1.65 KB |
#6298 | EnrichedLog | net472 | 4.51μs | 4.51ns | 17.5ns | 0.322 | 0 | 0 | 2.04 KB |
Benchmarks.Trace.SpanBenchmark - Slower ⚠️ Same allocations ✔️
Slower ⚠️ in #6298
Benchmark | diff/base | Base Median (ns) | Diff Median (ns) | Modality |
---|---|---|---|---|
Benchmarks.Trace.SpanBenchmark.StartFinishSpan‑netcoreapp3.1 | 1.125 | 551.54 | 620.45 |
Faster 🎉 in #6298
Benchmark | base/diff | Base Median (ns) | Diff Median (ns) | Modality |
---|---|---|---|---|
Benchmarks.Trace.SpanBenchmark.StartFinishSpan‑net6.0 | 1.215 | 493.56 | 406.34 |
Raw results
Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|---|---|
master | StartFinishSpan | net6.0 | 493ns | 0.527ns | 2.04ns | 0.00801 | 0 | 0 | 576 B |
master | StartFinishSpan | netcoreapp3.1 | 552ns | 0.94ns | 3.64ns | 0.00779 | 0 | 0 | 576 B |
master | StartFinishSpan | net472 | 636ns | 1.38ns | 5.33ns | 0.0917 | 0 | 0 | 578 B |
master | StartFinishScope | net6.0 | 488ns | 0.365ns | 1.42ns | 0.00966 | 0 | 0 | 696 B |
master | StartFinishScope | netcoreapp3.1 | 671ns | 0.68ns | 2.63ns | 0.00929 | 0 | 0 | 696 B |
master | StartFinishScope | net472 | 971ns | 3.45ns | 13.4ns | 0.104 | 0 | 0 | 658 B |
#6298 | StartFinishSpan | net6.0 | 406ns | 0.489ns | 1.89ns | 0.00807 | 0 | 0 | 576 B |
#6298 | StartFinishSpan | netcoreapp3.1 | 620ns | 1.06ns | 4.11ns | 0.00781 | 0 | 0 | 576 B |
#6298 | StartFinishSpan | net472 | 680ns | 0.569ns | 2.2ns | 0.0916 | 0 | 0 | 578 B |
#6298 | StartFinishScope | net6.0 | 491ns | 0.864ns | 3.35ns | 0.00973 | 0 | 0 | 696 B |
#6298 | StartFinishScope | netcoreapp3.1 | 668ns | 0.845ns | 3.27ns | 0.00945 | 0 | 0 | 696 B |
#6298 | StartFinishScope | net472 | 938ns | 1.24ns | 4.81ns | 0.104 | 0 | 0 | 658 B |
Benchmarks.Trace.TraceAnnotationsBenchmark - Faster 🎉 Same allocations ✔️
Faster 🎉 in #6298
Benchmark | base/diff | Base Median (ns) | Diff Median (ns) | Modality |
---|---|---|---|---|
Benchmarks.Trace.TraceAnnotationsBenchmark.RunOnMethodBegin‑net6.0 | 1.128 | 677.49 | 600.63 |
Raw results
Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|---|---|
master | RunOnMethodBegin | net6.0 | 678ns | 1.39ns | 5.37ns | 0.00992 | 0 | 0 | 696 B |
master | RunOnMethodBegin | netcoreapp3.1 | 966ns | 0.736ns | 2.85ns | 0.00918 | 0 | 0 | 696 B |
master | RunOnMethodBegin | net472 | 1.1μs | 2.28ns | 8.83ns | 0.104 | 0 | 0 | 658 B |
#6298 | RunOnMethodBegin | net6.0 | 600ns | 1.33ns | 4.78ns | 0.00991 | 0 | 0 | 696 B |
#6298 | RunOnMethodBegin | netcoreapp3.1 | 957ns | 1.45ns | 5.61ns | 0.00946 | 0 | 0 | 696 B |
#6298 | RunOnMethodBegin | net472 | 1.21μs | 0.77ns | 2.98ns | 0.104 | 0 | 0 | 658 B |
a6a9f2a to 1c78cc5
Summary of changes
Filter out the dead threads in the runtime metrics when using process snapshot.
Reason for change
We originally relied only on the count in the PSS_THREAD_INFORMATION instance returned by PssQuerySnapshot. However, that count includes not only live threads but also dead threads whose handle has not been closed. The handle of a dead managed thread is closed during finalization, after a garbage collection, so these threads can linger for a while, especially since threads tend to reach generation 2.
This can be demonstrated by this simple code:
With the old code, this would display 58, even though the task manager shows only 8 threads.
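The gap between the two numbers can be illustrated without touching the Win32 API. The sketch below is not the PR's code: `ThreadEntry`, `count_live_threads` and the simulated snapshot are hypothetical, and it assumes a dead thread is recognisable by a terminated flag (modelled on the Win32 PSS_THREAD_FLAGS_TERMINATED bit). How the PR itself detects dead threads is not shown on this page.

```python
from dataclasses import dataclass

# Assumption: mirrors the PSS_THREAD_FLAGS_TERMINATED bit; check processsnapshot.h.
THREAD_FLAG_TERMINATED = 0x1


@dataclass
class ThreadEntry:
    """Hypothetical stand-in for one thread record yielded by a snapshot walk."""
    thread_id: int
    flags: int


def count_live_threads(entries):
    """Count only the entries whose terminated bit is not set."""
    return sum(1 for entry in entries if not entry.flags & THREAD_FLAG_TERMINATED)


# Simulated snapshot: 8 live threads plus 50 exited threads whose handles are still open.
snapshot = [ThreadEntry(i, 0) for i in range(8)]
snapshot += [ThreadEntry(100 + i, THREAD_FLAG_TERMINATED) for i in range(50)]

print(len(snapshot))                 # 58: raw count, dead threads included
print(count_live_threads(snapshot))  # 8: count after filtering
```

The raw length plays the role of the old snapshot-wide count, and the filtered count is what a per-thread walk makes possible.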
Implementation details
We now use PssWalkSnapshot to walk the snapshot and filter out the dead threads. I had to use a custom FILETIME struct to get the alignment correct in both 32-bit and 64-bit processes. In hindsight, this might have worked for PSS_PROCESS_INFORMATION as well (avoiding the need to have separate PSS_PROCESS_INFORMATION_32/PSS_PROCESS_INFORMATION_64); I might try to improve that in a future PR.
Test coverage
The runtime metrics are already covered by tests. We don't test for dead threads, but it seems hard to do in a reliable way (and I think we can all agree our CI is flaky enough at the moment).
Also, the existing test has a tolerance of 500 threads. Back then I couldn't explain why we got so much variation in the results; I guess we now have a reasonable explanation (the tolerance was changed in #5329, shortly after adding PSS support).
Because we make additional calls for each thread, this has an impact on performance, so I ran some benchmarks.
With 50 threads (realistic):
With 250 threads (extreme):
We're 25 to 35% slower, but we're still orders of magnitude faster than using the .NET Process object.
Other details
Fixes #6172
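A note on the custom FILETIME struct mentioned under Implementation details. The Win32 FILETIME is a pair of 32-bit values, so it only needs 4-byte alignment, whereas a single 64-bit integer field is typically aligned to 8 bytes and can shift its own offset and that of every field after it. The sketch below illustrates the effect with Python's ctypes on a made-up two-field record; `FileTimeParts`, `EntryWithParts` and `EntryWithInt64` are hypothetical and reproduce neither the real PSS_THREAD_ENTRY layout nor the PR's C# declarations.

```python
import ctypes


class FileTimeParts(ctypes.Structure):
    """Two 32-bit halves, shaped like the Win32 FILETIME."""
    _fields_ = [("low", ctypes.c_uint32), ("high", ctypes.c_uint32)]


class EntryWithParts(ctypes.Structure):
    """Hypothetical record: a 32-bit field followed by a two-part timestamp."""
    _fields_ = [("tag", ctypes.c_uint32), ("create_time", FileTimeParts)]


class EntryWithInt64(ctypes.Structure):
    """Same record, but with the timestamp declared as one 64-bit integer."""
    _fields_ = [("tag", ctypes.c_uint32), ("create_time", ctypes.c_uint64)]


# The two-part timestamp follows the 32-bit tag with no padding.
print(EntryWithParts.create_time.offset)
# The 64-bit timestamp is pushed to its own alignment boundary
# (8 on platforms where 64-bit integers are 8-byte aligned).
print(EntryWithInt64.create_time.offset)
print(ctypes.sizeof(EntryWithParts), ctypes.sizeof(EntryWithInt64))
```

When the two offsets differ, a reader using the wrong declaration would pick up padding bytes instead of the timestamp, which is the kind of mismatch a custom struct avoids.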