Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add bulk of telemetry V2 implementation #4188

Merged
merged 8 commits into from
Jun 2, 2023
Merged

Conversation

andrewlock
Copy link
Member

@andrewlock andrewlock commented May 26, 2023

Summary of changes

Adds the bulk of the telemetry V2 implementation

  • Adds v2 versions of the telemetry DTOs (where they have changed from v1)
  • Adds v2 versions of collectors
  • Doesn't enable actually using the v2 implementation yet.

Reason for change

This is the bulk of the actual v2 telemetry implementation.

Implementation details

I mean... so many details... Probably best to review commit-by-commit 😬

Test coverage

Unit tests for most things, integration tests will follow in a subsequent PR when we actually enable it.

Other details

Stacked on #4187
Stacked on #4180

@andrewlock andrewlock requested a review from a team as a code owner May 26, 2023 15:15
@datadog-ddstaging
Copy link

datadog-ddstaging bot commented May 26, 2023

Datadog Report

Branch report: andrew/telemetry/v2
Commit report: bfc6e0f

❄️ dd-trace-dotnet: 0 Failed, 1 New Flaky, 280928 Passed, 913 Skipped, 40m 33.07s Wall Time

New Flaky Tests (1)

  • CheckSpanContextAreAttachedForCpuProfiler - Datadog.Profiler.IntegrationTests.CodeHotspot.CodeHotspotTest - Last Failure

    Expand for error
     Assert.NotEmpty() Failure
    

@andrewlock

This comment has been minimized.

@andrewlock

This comment has been minimized.

@andrewlock

This comment has been minimized.

@andrewlock andrewlock force-pushed the andrew/telemetry/more-v2-prep branch from 4d38c06 to f49cdda Compare May 30, 2023 08:58
@andrewlock andrewlock requested review from a team as code owners May 30, 2023 08:58
@andrewlock andrewlock force-pushed the andrew/telemetry/v2 branch from 198db0a to 9ae4ddb Compare May 30, 2023 08:58
@andrewlock

This comment has been minimized.

@andrewlock

This comment has been minimized.

@andrewlock

This comment has been minimized.

@andrewlock andrewlock force-pushed the andrew/telemetry/more-v2-prep branch from f49cdda to 006155e Compare May 30, 2023 11:45
@andrewlock andrewlock force-pushed the andrew/telemetry/v2 branch from 9ae4ddb to 9954709 Compare May 30, 2023 13:15
@andrewlock

This comment has been minimized.

@andrewlock

This comment has been minimized.

Base automatically changed from andrew/telemetry/more-v2-prep to master May 30, 2023 15:26
@andrewlock

This comment has been minimized.

@andrewlock

This comment has been minimized.

@andrewlock

This comment has been minimized.

Copy link
Collaborator

@pierotibou pierotibou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks and thanks for all the comments, that was helpful

@andrewlock andrewlock force-pushed the andrew/telemetry/v2 branch 2 times, most recently from 8842cc0 to f75ee53 Compare May 31, 2023 09:18
@andrewlock

This comment has been minimized.

@andrewlock

This comment has been minimized.

@andrewlock

This comment has been minimized.

@andrewlock andrewlock force-pushed the andrew/telemetry/v2 branch from f75ee53 to 12edb27 Compare May 31, 2023 14:35
@andrewlock

This comment has been minimized.

@andrewlock

This comment has been minimized.

@andrewlock

This comment has been minimized.

Also adds support for splitting dependency requests
Made the ConfigurationTelemetry object also be the "collector" object, that returns the telemetry data in the required format.

Record additional configuration values that aren't otherwise recorded. These could be recorded somewhere else, but TracerSettings.cs seems as good a place as any
Add additional required headers to requests (some of these were meant to be added in V1 too)
In V1, these are all in the same type, but I think it makes sense to separate them for easier testability
@andrewlock andrewlock force-pushed the andrew/telemetry/v2 branch from 12edb27 to bfc6e0f Compare June 1, 2023 08:20
@andrewlock
Copy link
Member Author

Benchmarks Report 🐌

Benchmarks for #4188 compared to master:

  • All benchmarks have the same speed
  • All benchmarks have the same allocations

The following thresholds were used for comparing the benchmark speeds:

  • Mann–Whitney U test with statistical test for significance of 5%
  • Only results indicating a difference greater than 10% and 0.3 ns are considered.

Allocation changes below 0.5% are ignored.

Benchmark details

@andrewlock
Copy link
Member Author

Execution-Time Benchmarks Report ⏱️

Execution-time results for samples comparing the following branches/commits:

Execution-time benchmarks measure the whole time it takes to execute a program. And are intended to measure the one-off costs. Cases where the execution time results for the PR are worse than latest master results are shown in red. The following thresholds were used for comparing the execution times:

  • Welch test with statistical test for significance of 5%
  • Only results indicating a difference greater than 5% and 5 ms are considered.

Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard.

Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph).

gantt
    title Execution time (ms) FakeDbCommand (.NET Framework 4.6.2) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (4188) - mean (3,028ms)  : 2927, 3130
     .   : milestone, 3028,
    master - mean (3,025ms)  : 2925, 3125
     .   : milestone, 3025,

    section CallTarget+Inlining+NGEN
    This PR (4188) - mean (3,767ms)  : 3715, 3819
     .   : milestone, 3767,
    master - mean (3,743ms)  : 3672, 3815
     .   : milestone, 3743,

Loading
gantt
    title Execution time (ms) FakeDbCommand (.NET Core 3.1) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (4188) - mean (3,154ms)  : 3049, 3260
     .   : milestone, 3154,
    master - mean (3,148ms)  : 2995, 3301
     .   : milestone, 3148,

    section CallTarget+Inlining+NGEN
    This PR (4188) - mean (3,596ms)  : 3486, 3707
     .   : milestone, 3596,
    master - mean (3,598ms)  : 3534, 3662
     .   : milestone, 3598,

Loading
gantt
    title Execution time (ms) FakeDbCommand (.NET 6) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (4188) - mean (3,143ms)  : 3036, 3250
     .   : milestone, 3143,
    master - mean (3,131ms)  : 3016, 3246
     .   : milestone, 3131,

    section CallTarget+Inlining+NGEN
    This PR (4188) - mean (3,569ms)  : 3517, 3622
     .   : milestone, 3569,
    master - mean (3,583ms)  : 3523, 3643
     .   : milestone, 3583,

Loading
gantt
    title Execution time (ms) HttpMessageHandler (.NET Framework 4.6.2) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (4188) - mean (189ms)  : 185, 192
     .   : milestone, 189,
    master - mean (188ms)  : 184, 191
     .   : milestone, 188,

    section CallTarget+Inlining+NGEN
    This PR (4188) - mean (1,021ms)  : 962, 1081
     .   : milestone, 1021,
    master - mean (1,023ms)  : 960, 1085
     .   : milestone, 1023,

Loading
gantt
    title Execution time (ms) HttpMessageHandler (.NET Core 3.1) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (4188) - mean (369ms)  : 362, 376
     .   : milestone, 369,
    master - mean (367ms)  : 362, 371
     .   : milestone, 367,

    section CallTarget+Inlining+NGEN
    This PR (4188) - mean (1,133ms)  : 1100, 1165
     .   : milestone, 1133,
    master - mean (1,133ms)  : 1109, 1157
     .   : milestone, 1133,

Loading
gantt
    title Execution time (ms) HttpMessageHandler (.NET 6) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (4188) - mean (357ms)  : 351, 364
     .   : milestone, 357,
    master - mean (355ms)  : 351, 359
     .   : milestone, 355,

    section CallTarget+Inlining+NGEN
    This PR (4188) - mean (1,100ms)  : 1079, 1122
     .   : milestone, 1100,
    master - mean (1,100ms)  : 1061, 1138
     .   : milestone, 1100,

Loading

@andrewlock

This comment has been minimized.

@andrewlock
Copy link
Member Author

Throughput/Crank Report:zap:

Throughput results for AspNetCoreSimpleController comparing the following branches/commits:

Cases where throughput results for the PR are worse than latest master (5% drop or greater), results are shown in red.

Note that these results are based on a single point-in-time result for each branch. For full results, see one of the many, many dashboards!

gantt
    title Throughput Linux x64 (Total requests) 
    dateFormat  X
    axisFormat %s
    section Baseline
    This PR (4188) (11.507M)   : 0, 11507459
    master (10.219M)   : 0, 10219339
    benchmarks/2.31.0 (11.338M)   : 0, 11338235
    benchmarks/2.9.0 (11.162M)   : 0, 11161538

    section Automatic
    This PR (4188) (8.066M)   : 0, 8066326
    master (7.156M)   : 0, 7155872
    benchmarks/2.31.0 (7.431M)   : 0, 7431225
    benchmarks/2.9.0 (8.099M)   : 0, 8099075

    section Trace stats
    master (7.149M)   : 0, 7148816
    benchmarks/2.31.0 (7.663M)   : 0, 7663176

    section Manual
    This PR (4188) (10.190M)   : 0, 10189635
    master (9.088M)   : 0, 9087972
    benchmarks/2.31.0 (7.351M)   : 0, 7350869

    section Manual + Automatic
    This PR (4188) (7.802M)   : 0, 7802267
    master (6.780M)   : 0, 6780456
    benchmarks/2.31.0 (5.741M)   : 0, 5741089

    section Version Conflict
    master (6.228M)   : 0, 6228238
    benchmarks/2.31.0 (6.913M)   : 0, 6913078

Loading
gantt
    title Throughput Linux arm64 (Total requests) 
    dateFormat  X
    axisFormat %s
    section Baseline
    This PR (4188) (9.691M)   : 0, 9690581
    master (9.389M)   : 0, 9389432
    benchmarks/2.31.0 (9.679M)   : 0, 9678682
    benchmarks/2.9.0 (9.705M)   : 0, 9704781

    section Automatic
    This PR (4188) (6.719M)   : 0, 6718957
    master (6.921M)   : 0, 6920892
    benchmarks/2.31.0 (6.855M)   : 0, 6854574

    section Trace stats
    master (6.982M)   : 0, 6982075
    benchmarks/2.31.0 (6.924M)   : 0, 6924210

    section Manual
    This PR (4188) (8.540M)   : 0, 8539930
    master (8.479M)   : 0, 8479114
    benchmarks/2.31.0 (8.386M)   : 0, 8386139

    section Manual + Automatic
    This PR (4188) (6.531M)   : 0, 6530515
    master (6.645M)   : 0, 6644683
    benchmarks/2.31.0 (6.325M)   : 0, 6325089

    section Version Conflict
    master (6.046M)   : 0, 6045683
    benchmarks/2.31.0 (5.955M)   : 0, 5954718

Loading
gantt
    title Throughput Windows x64 (Total requests) 
    dateFormat  X
    axisFormat %s
    section Baseline
    This PR (4188) (10.398M)   : 0, 10398370
    master (10.296M)   : 0, 10296222
    benchmarks/2.31.0 (10.320M)   : 0, 10319575
    benchmarks/2.9.0 (9.827M)   : 0, 9827121

    section Automatic
    This PR (4188) (7.505M)   : 0, 7505364
    master (7.512M)   : 0, 7511831
    benchmarks/2.31.0 (7.410M)   : 0, 7410117
    benchmarks/2.9.0 (7.246M)   : 0, 7246397

    section Trace stats
    master (7.455M)   : 0, 7454826
    benchmarks/2.31.0 (7.559M)   : 0, 7559264

    section Manual
    This PR (4188) (9.185M)   : 0, 9185112
    master (9.074M)   : 0, 9074178
    benchmarks/2.31.0 (9.387M)   : 0, 9386685

    section Manual + Automatic
    This PR (4188) (7.238M)   : 0, 7237739
    master (7.175M)   : 0, 7174810
    benchmarks/2.31.0 (7.176M)   : 0, 7175876

    section Version Conflict
    master (6.581M)   : 0, 6580964
    benchmarks/2.31.0 (6.446M)   : 0, 6446354

Loading
gantt
    title Throughput Linux x64 (ASM) (Total requests) 
    dateFormat  X
    axisFormat %s
    section Baseline
    This PR (4188) (7.374M)   : 0, 7374306
    master (7.303M)   : 0, 7302577
    benchmarks/2.31.0 (7.475M)   : 0, 7475399
    benchmarks/2.9.0 (7.748M)   : 0, 7748029

    section No attack
    This PR (4188) (2.335M)   : 0, 2335401
    master (2.387M)   : 0, 2386831
    benchmarks/2.31.0 (2.389M)   : 0, 2388686
    benchmarks/2.9.0 (3.274M)   : 0, 3273699

    section Attack
    This PR (4188) (2.016M)   : 0, 2015885
    master (1.993M)   : 0, 1992843
    benchmarks/2.31.0 (2.039M)   : 0, 2038865
    benchmarks/2.9.0 (2.598M)   : 0, 2597950

    section Blocking
    This PR (4188) (4.173M)   : 0, 4173011
    master (4.016M)   : 0, 4015526
    benchmarks/2.31.0 (4.079M)   : 0, 4079419

Loading

Copy link
Contributor

@bouwkast bouwkast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me overall! I just had a few nitpick/curiosity questions

Tests looked good to me - some of the tests files were a bit large to follow (like tracer/test/Datadog.Trace.Tests/Telemetry/TelemetryControllerV2Tests.cs) but I don't have any suggestions and for the most part there was similar setup and the comments helped

const int maxPerMessage = 2000;
// Only 2000 dependencies should be included per message, so need to split into separate requests
var depsToSend = dependencies.Count > maxPerMessage
? dependencies.Skip(skip).Take(2000).ToList()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could 2000 be swapped for maxPerMessage here?

Suggested change
? dependencies.Skip(skip).Take(2000).ToList()
? dependencies.Skip(skip).Take(maxPerMessage).ToList()

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, will fix in follow-up to avoid the CI hit!

@andrewlock andrewlock merged commit 89bc023 into master Jun 2, 2023
@andrewlock andrewlock deleted the andrew/telemetry/v2 branch June 2, 2023 08:27
@github-actions github-actions bot added this to the vNext milestone Jun 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants