Disable tiered JIT workaround by default #3579

Merged
andrewlock merged 3 commits into master from andrew/disable-tiered-jit-workaround-by-default
Dec 20, 2022


andrewlock
Member

Summary of changes

Switches the default value of the DD_INTERNAL_WORKAROUND_77973_ENABLED flag to false.

Reason for change

In #3506 we added a workaround for the runtime bug dotnet/runtime#77973, and enabled it by default for affected runtimes.

We noticed a significant impact on throughput tests (a 45% reduction in throughput) when tiered JIT was disabled. Based on the documentation, the likely cause is the following:

Steady-State - If code loaded from ReadyToRun images appears hot, the runtime replaces it with jitted code which is typically higher quality. At runtime the JIT is able to observe the exact dependencies that are loaded as well as CPU instruction support which allows it to generate superior code. In the future it may also utilize profile guided feedback but it does not currently do so.

Implementation details

Switch the flag to false by default, so that customers are not impacted unless they opt in. Customers can then make a reasoned decision about whether to take the throughput hit in order to work around the runtime bug dotnet/runtime#77973.
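As a rough illustration of what "false by default" means for an opt-in flag, the sketch below reads the environment variable and falls back to `false` when it is unset or unparseable. The helper names `ParseBoolFlag` and `IsWorkaround77973Enabled`, and the set of accepted values (`1`/`true`, `0`/`false`), are assumptions made for this example and are not taken from the tracer's source.

```cpp
#include <algorithm>
#include <cctype>
#include <cstdlib>
#include <string>

// Hypothetical helper: interpret a raw flag value as a boolean, returning
// `default_value` when the variable is unset or not a recognised boolean.
// The accepted spellings are an assumption for this sketch.
bool ParseBoolFlag(const char* raw, bool default_value) {
    if (raw == nullptr) {
        return default_value;  // variable not set at all
    }
    std::string value(raw);
    std::transform(value.begin(), value.end(), value.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    if (value == "1" || value == "true") return true;
    if (value == "0" || value == "false") return false;
    return default_value;  // unrecognised value: keep the default
}

// Hypothetical wrapper: the workaround is opt-in, so the default is false.
bool IsWorkaround77973Enabled() {
    return ParseBoolFlag(std::getenv("DD_INTERNAL_WORKAROUND_77973_ENABLED"), false);
}
```

Under these assumptions, a customer who sets nothing keeps tiered JIT enabled, and only an explicit `true` or `1` opts in to the workaround.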

Test coverage

Enabled the flag in smoke tests and integration tests

Other details

Prefer the DD_INTERNAL_WORKAROUND_77973_ENABLED flag over COMPlus_TieredCompilation as the workaround, because it disables tiered JIT only on affected runtimes.
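The difference between the two approaches can be sketched as a small decision function. Setting `COMPlus_TieredCompilation=0` turns tiered compilation off for every runtime the process uses, whereas the tracer flag only has an effect when the runtime is one of the affected ones. The `RuntimeInfo` type and its `affected_by_77973` field are hypothetical; this PR does not list which runtime versions are affected, so that check is left as an input here.

```cpp
// Hypothetical description of the runtime the profiler is attached to.
struct RuntimeInfo {
    // Assumption: computed elsewhere from the runtime version. The list of
    // affected versions is not given in this PR, so it is not encoded here.
    bool affected_by_77973;
};

enum class TieredJitDecision { LeaveEnabled, Disable };

// Sketch of the gating described above: tiered JIT is only disabled when
// the customer opted in AND the runtime is actually affected by the bug.
TieredJitDecision DecideTieredJit(bool workaround_flag_enabled,
                                  const RuntimeInfo& runtime) {
    if (workaround_flag_enabled && runtime.affected_by_77973) {
        return TieredJitDecision::Disable;
    }
    return TieredJitDecision::LeaveEnabled;
}
```

With this shape, enabling the flag on an unaffected runtime costs nothing, which is the property that setting COMPlus_TieredCompilation globally does not have.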

andrewlock added the area:native-library label (Automatic instrumentation native C++ code, Datadog.Trace.ClrProfiler.Native) on Dec 20, 2022
andrewlock requested review from a team as code owners on December 20, 2022 10:36
@datadog-ddstaging

datadog-ddstaging bot commented Dec 20, 2022

Datadog Report

Branch report: andrew/disable-tiered-jit-workaround-by-default
Commit report: ae551cd

dd-trace-dotnet 3 Failed (0 Known Flaky), 0 New Flaky, 222795 Passed, 837 Skipped, 19m 39.82s Wall Time

❌ Failed Tests (3)

  • MethodProbeTest - Datadog.Trace.Debugger.IntegrationTests.ProbesTests - Details

     Results do not match.
     Differences:
     Received: ProbeTests.GenericRefReturnTest.received.txt
     Verified: ProbeTests.GenericRefReturnTest.verified.txt
     Received Content:
     [
       {
         "ddsource": "dd_debugger",
         "debugger": {
           "diagnostics": {
     ...
    
  • MethodProbeTest - Datadog.Trace.Debugger.IntegrationTests.ProbesTests - Details

     Results do not match.
     Differences:
     Received: ProbeTests.GenericByRefLikeTest.received.txt
     Verified: ProbeTests.GenericByRefLikeTest.verified.txt
     Received Content:
     [
       {
         "ddsource": "dd_debugger",
         "debugger": {
           "diagnostics": {
     ...
    
  • SynchronizationContextGenericTest - Datadog.Trace.Tests.CallTarget.TaskContinuationGeneratorTests - Details

     Assert.False() Failure
     Expected: False
     Actual:   True
    

@andrewlock
Member Author

Benchmarks Report 🐌

Benchmarks for #3579 compared to master:

  • 1 benchmark is faster, with geometric mean 1.130
  • All benchmarks have the same allocations

The following thresholds were used for comparing the benchmark speeds:

  • Mann–Whitney U test with a significance level of 5%
  • Only results indicating a difference greater than 10% and 0.3 ns are considered.

Allocation changes below 0.5% are ignored.
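To make the size thresholds concrete, here is a minimal sketch of how the "greater than 10% and 0.3 ns" rule could be applied to a pair of medians. It deliberately leaves out the Mann–Whitney U test, which needs the raw samples, and it reads "greater than 10%" as a base/diff ratio above 1.10, which is an assumption about how the report computes it. The names `SpeedVerdict` and `CompareMedians` are hypothetical.

```cpp
#include <cmath>

enum class SpeedVerdict { Unknown, Same, Faster, Slower };

// Sketch of the size thresholds only (no statistical test).
// base_ns: median on master; diff_ns: median on the PR branch.
SpeedVerdict CompareMedians(double base_ns, double diff_ns) {
    constexpr double kMinRatio = 1.10;   // "greater than 10%" (assumed to be a ratio)
    constexpr double kMinDeltaNs = 0.3;  // "and 0.3 ns"

    // A benchmark that did not run reports 0 ns; nothing can be concluded.
    if (base_ns <= 0.0 || diff_ns <= 0.0) return SpeedVerdict::Unknown;

    // Differences at or below 0.3 ns are ignored regardless of the ratio.
    if (std::fabs(base_ns - diff_ns) <= kMinDeltaNs) return SpeedVerdict::Same;

    if (base_ns / diff_ns > kMinRatio) return SpeedVerdict::Faster;
    if (diff_ns / base_ns > kMinRatio) return SpeedVerdict::Slower;
    return SpeedVerdict::Same;
}
```

For the AllCycleSimpleBody net472 row reported as faster, the medians are 25,631.26 ns and 22,672.70 ns, so the ratio is 25,631.26 / 22,672.70 ≈ 1.130, which clears both thresholds and matches the base/diff value shown in the report.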

Benchmark details

Benchmarks.Trace.AgentWriterBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master WriteAndFlushEnrichedTraces net472 762μs 635ns 2.38μs 0.381 0 0 3.22 KB
master WriteAndFlushEnrichedTraces netcoreapp3.1 544μs 681ns 2.64μs 0 0 0 2.62 KB
#3579 WriteAndFlushEnrichedTraces net472 752μs 809ns 3.13μs 0.377 0 0 3.22 KB
#3579 WriteAndFlushEnrichedTraces netcoreapp3.1 534μs 769ns 2.98μs 0 0 0 2.62 KB
Benchmarks.Trace.AppSecBodyBenchmark - Faster 🎉 Same allocations ✔️

Faster 🎉 in #3579

Benchmark base/diff Base Median (ns) Diff Median (ns) Modality
Benchmarks.Trace.AppSecBodyBenchmark.AllCycleSimpleBody‑net472 1.130 25,631.26 22,672.70

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master AllCycleSimpleBody net472 25.6μs 15.9ns 61.5ns 0.276 0 0 1.77 KB
master AllCycleSimpleBody netcoreapp3.1 24.5μs 93.7ns 363ns 0.0216 0 0 1.64 KB
master AllCycleMoreComplexBody net472 194μs 23.4ns 87.5ns 2.05 0 0 13.02 KB
master AllCycleMoreComplexBody netcoreapp3.1 182μs 456ns 1.77μs 0.0887 0 0 12.1 KB
master BodyExtractorSimpleBody net472 279ns 0.183ns 0.71ns 0.0574 0 0 361 B
master BodyExtractorSimpleBody netcoreapp3.1 235ns 0.126ns 0.473ns 0.0038 0 0 272 B
master BodyExtractorMoreComplexBody net472 15.8μs 8.08ns 30.2ns 1.21 0.0159 0 7.62 KB
master BodyExtractorMoreComplexBody netcoreapp3.1 12.6μs 7.29ns 26.3ns 0.0882 0 0 6.75 KB
#3579 AllCycleSimpleBody net472 22.7μs 22.6ns 87.5ns 0.282 0 0 1.77 KB
#3579 AllCycleSimpleBody netcoreapp3.1 24.1μs 72.3ns 270ns 0.0214 0 0 1.64 KB
#3579 AllCycleMoreComplexBody net472 192μs 131ns 473ns 2.01 0 0 13.02 KB
#3579 AllCycleMoreComplexBody netcoreapp3.1 182μs 612ns 2.29μs 0.0912 0 0 12.1 KB
#3579 BodyExtractorSimpleBody net472 287ns 0.0757ns 0.293ns 0.0573 0 0 361 B
#3579 BodyExtractorSimpleBody netcoreapp3.1 237ns 0.119ns 0.46ns 0.00369 0 0 272 B
#3579 BodyExtractorMoreComplexBody net472 15.5μs 6.61ns 25.6ns 1.2 0.0156 0 7.62 KB
#3579 BodyExtractorMoreComplexBody netcoreapp3.1 12.8μs 6.77ns 26.2ns 0.0896 0 0 6.75 KB
Benchmarks.Trace.AspNetCoreBenchmark - Unknown 🤷 Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master SendRequest net472 0ns 0ns 0ns 0 0 0 0 b
master SendRequest netcoreapp3.1 178μs 179ns 693ns 0.269 0 0 20.39 KB
#3579 SendRequest net472 0ns 0ns 0ns 0 0 0 0 b
#3579 SendRequest netcoreapp3.1 177μs 139ns 540ns 0.264 0 0 20.39 KB
Benchmarks.Trace.DbCommandBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master ExecuteNonQuery net472 1.77μs 1.73ns 6.7ns 0.16 0.000876 0 1.01 KB
master ExecuteNonQuery netcoreapp3.1 1.43μs 1.17ns 4.54ns 0.0135 0 0 1 KB
#3579 ExecuteNonQuery net472 1.79μs 1.76ns 6.82ns 0.16 0.000904 0 1.01 KB
#3579 ExecuteNonQuery netcoreapp3.1 1.43μs 1.28ns 4.77ns 0.0136 0 0 1 KB
Benchmarks.Trace.ElasticsearchBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master CallElasticsearch net472 2.3μs 0.781ns 3.02ns 0.193 0 0 1.22 KB
master CallElasticsearch netcoreapp3.1 1.58μs 3.97ns 15.4ns 0.0156 0 0 1.16 KB
master CallElasticsearchAsync net472 2.61μs 0.909ns 3.52ns 0.215 0 0 1.36 KB
master CallElasticsearchAsync netcoreapp3.1 1.61μs 0.479ns 1.79ns 0.0177 0 0 1.28 KB
#3579 CallElasticsearch net472 2.5μs 1.33ns 4.98ns 0.193 0 0 1.22 KB
#3579 CallElasticsearch netcoreapp3.1 1.5μs 0.606ns 2.35ns 0.0159 0 0 1.16 KB
#3579 CallElasticsearchAsync net472 2.54μs 0.946ns 3.54ns 0.216 0 0 1.36 KB
#3579 CallElasticsearchAsync netcoreapp3.1 1.62μs 0.502ns 1.81ns 0.017 0 0 1.28 KB
Benchmarks.Trace.GraphQLBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master ExecuteAsync net472 2.64μs 1.41ns 5.48ns 0.235 0 0 1.49 KB
master ExecuteAsync netcoreapp3.1 1.76μs 0.531ns 1.92ns 0.0184 0 0 1.41 KB
#3579 ExecuteAsync net472 2.77μs 0.977ns 3.78ns 0.235 0 0 1.49 KB
#3579 ExecuteAsync netcoreapp3.1 1.75μs 1.09ns 3.92ns 0.0195 0 0 1.41 KB
Benchmarks.Trace.HttpClientBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master SendAsync net472 5.82μs 1.46ns 5.47ns 0.448 0 0 2.83 KB
master SendAsync netcoreapp3.1 3.6μs 2.04ns 7.62ns 0.0359 0 0 2.66 KB
#3579 SendAsync net472 5.84μs 1.47ns 5.5ns 0.45 0 0 2.83 KB
#3579 SendAsync netcoreapp3.1 3.73μs 8.34ns 30.1ns 0.0353 0 0 2.66 KB
Benchmarks.Trace.ILoggerBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master EnrichedLog net472 2.82μs 3.1ns 12ns 0.297 0 0 1.88 KB
master EnrichedLog netcoreapp3.1 2.21μs 0.881ns 3.41ns 0.0265 0 0 1.91 KB
#3579 EnrichedLog net472 2.79μs 1.87ns 7.01ns 0.297 0 0 1.88 KB
#3579 EnrichedLog netcoreapp3.1 2.3μs 7.55ns 29.3ns 0.0258 0 0 1.91 KB
Benchmarks.Trace.Log4netBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master EnrichedLog net472 148μs 132ns 511ns 0.738 0.221 0 4.72 KB
master EnrichedLog netcoreapp3.1 120μs 208ns 777ns 0.0592 0 0 4.55 KB
#3579 EnrichedLog net472 148μs 109ns 423ns 0.738 0.221 0 4.72 KB
#3579 EnrichedLog netcoreapp3.1 118μs 259ns 1μs 0.0587 0 0 4.55 KB
Benchmarks.Trace.NLogBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master EnrichedLog net472 5.58μs 2.2ns 7.61ns 0.579 0.00277 0 3.65 KB
master EnrichedLog netcoreapp3.1 4.22μs 1.98ns 6.87ns 0.055 0 0 3.98 KB
#3579 EnrichedLog net472 5.61μs 1.45ns 5.44ns 0.578 0.0028 0 3.65 KB
#3579 EnrichedLog netcoreapp3.1 4.25μs 1.03ns 3.84ns 0.0536 0 0 3.98 KB
Benchmarks.Trace.RedisBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master SendReceive net472 2.26μs 2.3ns 8.6ns 0.227 0 0 1.44 KB
master SendReceive netcoreapp3.1 1.75μs 0.655ns 2.36ns 0.0191 0 0 1.38 KB
#3579 SendReceive net472 2.17μs 1.24ns 4.79ns 0.228 0 0 1.44 KB
#3579 SendReceive netcoreapp3.1 1.75μs 1.47ns 5.49ns 0.0185 0 0 1.38 KB
Benchmarks.Trace.SerilogBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master EnrichedLog net472 4.6μs 1.66ns 6.23ns 0.363 0 0 2.3 KB
master EnrichedLog netcoreapp3.1 3.87μs 0.773ns 2.89ns 0.0252 0 0 1.86 KB
#3579 EnrichedLog net472 4.87μs 3.66ns 13.7ns 0.364 0 0 2.3 KB
#3579 EnrichedLog netcoreapp3.1 3.99μs 1.38ns 5.36ns 0.0261 0 0 1.86 KB
Benchmarks.Trace.SpanBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master StartFinishSpan net472 1.16μs 0.117ns 0.423ns 0.139 0 0 875 B
master StartFinishSpan netcoreapp3.1 916ns 0.363ns 1.41ns 0.0114 0 0 824 B
master StartFinishScope net472 1.36μs 0.295ns 1.07ns 0.151 0 0 955 B
master StartFinishScope netcoreapp3.1 1.1μs 0.731ns 2.74ns 0.0126 0 0 944 B
#3579 StartFinishSpan net472 1.17μs 0.535ns 2.07ns 0.139 0 0 875 B
#3579 StartFinishSpan netcoreapp3.1 891ns 0.581ns 2.25ns 0.0111 0 0 824 B
#3579 StartFinishScope net472 1.29μs 0.577ns 2.16ns 0.151 0 0 955 B
#3579 StartFinishScope netcoreapp3.1 1.11μs 0.686ns 2.57ns 0.0128 0 0 944 B
Benchmarks.Trace.TraceAnnotationsBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master RunOnMethodBegin net472 1.45μs 0.719ns 2.78ns 0.152 0 0 955 B
master RunOnMethodBegin netcoreapp3.1 1.2μs 0.481ns 1.8ns 0.0129 0 0 944 B
#3579 RunOnMethodBegin net472 1.42μs 0.381ns 1.47ns 0.152 0 0 955 B
#3579 RunOnMethodBegin netcoreapp3.1 1.22μs 0.328ns 1.23ns 0.0127 0 0 944 B

andrewlock merged commit 4ec8df9 into master on Dec 20, 2022
andrewlock deleted the andrew/disable-tiered-jit-workaround-by-default branch on December 20, 2022 12:11
github-actions bot added this to the vNext milestone on Dec 20, 2022
Labels
area:native-library Automatic instrumentation native C++ code (Datadog.Trace.ClrProfiler.Native)
4 participants