Skip to content

Conversation

@zacharycmontoya
Copy link
Contributor

@zacharycmontoya zacharycmontoya commented Nov 5, 2021

Issue

Our current WCF instrumentation (both CallSite and CallTarget) targets System.ServiceModel.Dispatcher.ChannelHandler.HandleRequest, which is very early in the callstack before a lot of the other WCF infrastructure is called, so any exceptions that propagate outside of the instrumentation will crash the server. For our instrumentation to create meaningful spans that are also connected to distributed parents, we must access the request message to gather information and create the span.

Unfortunately, in one known scenario, the first access to the request message should throw an exception that the WCF infrastructure should catch and it should cease processing of that request. Since we currently swallow exceptions during scope creation, the message continues to be processed and the application behavior is changed. Even if we decided to throw an exception, because we instrument a method before the main message processing loop, we would actually crash the server instead of stopping the processing of that one message. As a result, the only fix for this scenario is to defer our instrumentation until later in the WCF infrastructure code.

Since this is a big change and our WCF server testing is not as robust as other integrations, this change is being implemented by a feature flag.

Changes

  • In previous CallSite and CallTarget WCF instrumentation, check whether the WCF integration is enabled before accessing the MessageRequest object
  • Add a new CallTarget capability: We can now explicitly state which method parameters should be passed from the instrumented method into the OnMethodBegin callbacks. This allows us to now instrument methods that have ref/in/out parameters, as long as we bypass those specific parameters
    • This is required to instrument the desired methods for the types below
  • Add new WCF CallTarget instrumentation to the 3 built-in IOperationInvoker implementations, which is the last bit of WCF infrastructure code before the actual WCF services are invoked
    • SyncMethodInvoker
    • AsyncMethodInvoker
    • TaskMethodInvoker
  • Disable the previous CallTarget WCF instrumentation and enable the new CallTarget WCF instrumentation via opt-in environment variable DD_TRACE_DELAY_WCF_INSTRUMENTATION_ENABLED

Testing

  • Update Samples.Wcf test application to invoke all 3 built-in IOperationInvoker implementations: SyncMethodInvoker, AsyncMethodInvoker, and TaskMethodInvoker
    • Before, we were only testing SyncMethodInvoker
  • Update Samples.Wcf test application with a new custom binding that throws an exception on the first RequestMessage access, exposing the issue with our existing CallSite and CallTarget WCF instrumentation
  • Update WcfTests integration test to exercise all 4 bindings
  • Add snapshot testing to WcfTests

Work Remaining

The AsyncMethodInvoker (BeginXXX and EndXXX) does not work correctly with the built-in NetTcpBinding. When invoking the EndXXX method, the ambient Tracer scope returns null. In order to get this critical fix out more quickly, I propose deferring this work until we find a user that runs into this.

@DataDog/apm-dotnet

- Add a new Custom binding that will throw an exception when accessing the RequestMessage, similar to what we've seen in production
- Test all 4 bindings in the integration tests
- Update WcfTests to use snapshot testing
- Create test cases for the new WCF tracing mode, enabled by setting DD_TRACE_WCF_ENABLE_NEW_INSTRUMENTATION=true
…gumentsToLoad and TargetMethodArgumentsToLoadLength properties. This allows the OnBegin method to use a subset of the instrumented method's parameters, allowing us to instrument methods with ref arguments
…ENTATION setting, disable original WCF instrumentation when DD_TRACE_WCF_ENABLE_NEW_INSTRUMENTATION=true, and update tests/snapshots
…dInvoker and update snapshots. IMPORTANT: Does not work with TcpBinding; OnMethodEnd for the InvokeEnd operation does not detect ambient Scope
@zacharycmontoya zacharycmontoya added type:bug area:automatic-instrumentation Automatic instrumentation managed C# code (Datadog.Trace.ClrProfiler.Managed) area:call-target labels Nov 5, 2021
@zacharycmontoya zacharycmontoya requested a review from a team as a code owner November 5, 2021 03:39
@zacharycmontoya zacharycmontoya self-assigned this Nov 5, 2021
@andrewlock

This comment has been minimized.

Copy link
Member

@andrewlock andrewlock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! That looked like "fun" 😅 Minor comments, mostly for my understanding


auto numArguments = (int) methodArguments.size();
auto numArgsToLoad = argsToLoad.size();
bool useArgToLoad = numArgsToLoad > 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems unlikely, but would we ever want to load 0 arguments (i.e. just access the instance)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's possible, so I added another field to allow us to disambiguate between 0 arguments and NOT use the feature in f9b0cf0

auto numArguments = (int) methodArguments.size();
auto numArgsToLoad = argsToLoad.size();
bool useArgToLoad = numArgsToLoad > 0;
auto numArguments = useArgToLoad ? (int) numArgsToLoad : (int) methodArguments.size();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, just to clarify, changing this doesn't mean we accidentally select the wrong overload, it just means we don't pass those arguments to our instrumentation?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's right in this part of the code we are already rewriting the method, not selecting it.

Comment on lines 38 to 44
var operationContextType = Type.GetType("System.ServiceModel.OperationContext, System.ServiceModel, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089", throwOnError: false);
if (operationContextType is not null)
{
var property = operationContextType.GetProperty("Current", BindingFlags.Public | BindingFlags.Static);
var method = property.GetGetMethod();
_getCurrentOperationContext = (Func<object>)method.CreateDelegate(typeof(Func<object>));
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given this is the same (I think) in all the new integrations, is it worth moving to a helper, and using a generic helper with <TTarget> to cache it between all the instances)? We could also avoid the reflection entirely if the integration or the feature flag is disabled?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

haha @andrewlock also notice it... I agree.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I got too excited about posting the PR that I forgot to consolidate the methods. Done in c5987de

@lucaspimentel
Copy link
Member

Can we keep the DD_TRACE_<FEATURE>_ENABLED convention? Thanks!

Copy link
Member

@tonyredondo tonyredondo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM, just a couple of comments.


HRESULT WriteBeginMethod(void* rewriterWrapperPtr, mdTypeRef integrationTypeRef, const TypeInfo* currentType,
std::vector<FunctionMethodArgument>& methodArguments, ILInstr** instruction);
std::vector<FunctionMethodArgument>& methodArguments, std::vector<USHORT>& argsToLoad,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Maybe the new vector can be passed as a const

Copy link
Contributor Author

@zacharycmontoya zacharycmontoya Nov 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 75f06c9

Comment on lines 1278 to 1283
MethodReplacement(
{},
MethodReference(targetAssembly, targetType, targetMethod, EmptyWStr, minVersion, maxVersion,
{}, signatureTypes),
{}, signatureTypes, targetMethodArgumentsToLoad),
MethodReference(wrapperAssembly, wrapperType, EmptyWStr, calltarget_modification_action, {}, {}, {},
{})));
{}, {})));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I'm checking this, porting this fix to release/2.0 will be a rewrite.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah...I can do the rewrite for 2.0

Comment on lines 35 to 44
static AsyncMethodInvoker_InvokeEnd_Integration()
{
var operationContextType = Type.GetType("System.ServiceModel.OperationContext, System.ServiceModel, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089", throwOnError: false);
if (operationContextType is not null)
{
var property = operationContextType.GetProperty("Current", BindingFlags.Public | BindingFlags.Static);
var method = property.GetGetMethod();
_getCurrentOperationContext = (Func<object>)method.CreateDelegate(typeof(Func<object>));
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the same as the integration above? (AsyncMethodInvoker_InvokeBegin_Integration)
Maybe having an Integration common class so we are not creating two delegates for the same thing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah it's the same. I got so excited that everything was working that I posted the PR without consolidating these into a helper class. Done in c5987de

Comment on lines 37 to 46
static SyncMethodInvokerIntegration()
{
var operationContextType = Type.GetType("System.ServiceModel.OperationContext, System.ServiceModel, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089", throwOnError: false);
if (operationContextType is not null)
{
var property = operationContextType.GetProperty("Current", BindingFlags.Public | BindingFlags.Static);
var method = property.GetGetMethod();
_getCurrentOperationContext = (Func<object>)method.CreateDelegate(typeof(Func<object>));
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, fixed in c5987de

Comment on lines 38 to 44
var operationContextType = Type.GetType("System.ServiceModel.OperationContext, System.ServiceModel, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089", throwOnError: false);
if (operationContextType is not null)
{
var property = operationContextType.GetProperty("Current", BindingFlags.Public | BindingFlags.Static);
var method = property.GetGetMethod();
_getCurrentOperationContext = (Func<object>)method.CreateDelegate(typeof(Func<object>));
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

haha @andrewlock also notice it... I agree.

new("System.Data.SQLite", "System.Data.SQLite.SQLiteCommand", "ExecuteReader", new[] { "System.Data.SQLite.SQLiteDataReader", "System.Data.CommandBehavior" }, 1, 0, 0, 2, 65535, 65535, assemblyFullName, "Datadog.Trace.ClrProfiler.AutoInstrumentation.AdoNet.CommandExecuteReaderWithBehaviorIntegration"),
new("System.Data.SQLite", "System.Data.SQLite.SQLiteCommand", "ExecuteScalar", new[] { "System.Object" }, 1, 0, 0, 2, 65535, 65535, assemblyFullName, "Datadog.Trace.ClrProfiler.AutoInstrumentation.AdoNet.CommandExecuteScalarIntegration"),
new("System.Data.SQLite", "System.Data.SQLite.SQLiteCommand", "ExecuteScalar", new[] { "System.Object", "System.Data.CommandBehavior" }, 1, 0, 0, 2, 65535, 65535, assemblyFullName, "Datadog.Trace.ClrProfiler.AutoInstrumentation.AdoNet.CommandExecuteScalarWithBehaviorIntegration"),
new("Microsoft.Data.SqlClient", "Microsoft.Data.SqlClient.SqlCommand", "ExecuteDbDataReader", new[] { "System.Data.Common.DbDataReader", "System.Data.CommandBehavior" }, new ushort[] { }, 1, 0, 0, 3, 65535, 65535, assemblyFullName, "Datadog.Trace.ClrProfiler.AutoInstrumentation.AdoNet.CommandExecuteReaderWithBehaviorIntegration"),
Copy link
Member

@tonyredondo tonyredondo Nov 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should accept null values instead of new ushort[] { }, this seems like a waste of allocations.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that would also play into the case of where we don't need to pass any arguments to the integration as I mentioned - an empty array would mean "no arguments", as opposed to null which would mean "all arguments"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed both issues in f9b0cf0 . null results in us not using the feature, and an empty array will allow us to use the feature by passing zero arguments to the instrumentation method

Comment on lines +81 to +92
TargetMethodArgumentsToLoad = IntPtr.Zero;
if (targetMethodArgumentsToLoad?.Length > 0)
{
// The Marshal operations only operate on short not ushort (CLS-compliance)
TargetMethodArgumentsToLoad = Marshal.AllocHGlobal(targetMethodArgumentsToLoad.Length * Marshal.SizeOf(typeof(short)));
var ptr = TargetMethodArgumentsToLoad;
for (var i = 0; i < targetMethodArgumentsToLoad.Length; i++)
{
Marshal.WriteInt16(ptr, (short)targetMethodArgumentsToLoad[i]);
ptr += Marshal.SizeOf(typeof(short));
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should also do the disposal of the array.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 0992cfb . Does that look fine to you?

…the WcfCommon.cs class (intentionally new file to avoid rename conflicts with existing WcfIntegration.cs)

- Populate the GetCurrentOperationContext delegate lazily
- Check that the WCF integration is enabled and the feature flag is enabled FIRST, before checking whether the GetCurrentOperationContext is non-null
…NTATION_ENABLED

- Move the DelayWcfInstrumentationEnabled to the FeatureFlags grouping of ConfigurationKeys
- Make the DelayWcfInstrumentationEnabled TracerSetting internal
@zacharycmontoya
Copy link
Contributor Author

zacharycmontoya commented Nov 11, 2021

Can we keep the DD_TRACE_<FEATURE>_ENABLED convention? Thanks!

Sure thing! This was changed to DD_TRACE_DELAY_WCF_INSTRUMENTATION_ENABLED in e75305a

@andrewlock

This comment has been minimized.

…ation definitions that do not override the TargetMethodArgumentsToLoad array

- Add additional bool field in NativeCallTargetDefinition to disambiguate between "Don't use the TargetMethodArgumentsToLoad feature" and "Use the feature but don't pass any arguments"
…sses into a Bindings/Custom folder, and suppress CS0067 build warning

public ushort TargetSignatureTypesLength;

[MarshalAs(UnmanagedType.U1)]
Copy link
Member

@tonyredondo tonyredondo Nov 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bool vs BOOL in native side :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this fine or do I need to make a change?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfectly fine!, you don't need to modify anything. I was just pointing out this line is required because you used bool instead of BOOL in the native struct.

Copy link
Member

@tonyredondo tonyredondo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

zacharycmontoya added a commit that referenced this pull request Nov 12, 2021
Commit SHA at time of cherry-pick: 9ccf9dd Fix the native Windows tests

Additional changes include:
- Move CreateScope method from ChannelHandlerIntegration.cs to WcfCommon.cs
@andrewlock

This comment has been minimized.

…x instead of 20s max for spans to reach the MockTracerAgent, in hopes that it fixes the flakiness in the snapshot tests where some snapshots don't have all expected spans.
…s flaky in CI where it does not consistently emit the server spans. Reset the snapshots to exclude the Async spans for now.

Note: This didn't affect the NetTcpBinding with the new instrumentation because the Async endpoint wasn't working, and it didn't affect the CustomBinding because that doesn't return any spans.
@andrewlock
Copy link
Member

Code Coverage Report 📊

✔️ Merging #1992 into master will not change line coverage
✔️ Merging #1992 into master will not change branch coverage
⛔ Merging #1992 into master will will increase complexity by 44

master #1992 Change
Lines 10465 / 15639 10502 / 15686
Lines % 67% 67% 0% ✔️
Branches 4568 / 6894 4592 / 6934
Branches % 66% 66% 0% ✔️
Complexity 8101 8145 44

View the full report for further details:

Datadog.Trace Breakdown ✔️

master #1992 Change
Lines % 67% 67% 0% ✔️
Branches % 66% 66% 0% ✔️
Complexity 8101 8145 44

The following classes have significant coverage changes.

File Line coverage change Branch coverage change Complexity change
Datadog.Trace.ClrProfiler.AutoInstrumentation.Testing.NUnit.NUnitCompositeWorkItemSkipChildrenIntegration -8% -8% 0 ✔️
Datadog.Trace.ClrProfiler.Integrations.Testing.NUnitIntegration -2% ⚠️ -14% 0 ✔️
Datadog.Trace.ClrProfiler.AutoInstrumentation.Http.WebRequest.HttpWebRequest_BeginGetRequestStream_Integration 14% ✔️ 20% ✔️ 0 ✔️
Datadog.Trace.ClrProfiler.AutoInstrumentation.Http.WebRequest.HttpWebRequest_EndGetResponse_Integration 29% ✔️ 23% ✔️ 0 ✔️

The following classes were added in #1992:

File Line coverage Branch coverage Complexity
Datadog.Trace.ClrProfiler.AutoInstrumentation.Wcf.AsyncMethodInvoker_InvokeBegin_Integration 0% 0% 8
Datadog.Trace.ClrProfiler.AutoInstrumentation.Wcf.AsyncMethodInvoker_InvokeEnd_Integration 0% 100% 1
Datadog.Trace.ClrProfiler.AutoInstrumentation.Wcf.SyncMethodInvokerIntegration 100% 88% 9
Datadog.Trace.ClrProfiler.AutoInstrumentation.Wcf.TaskMethodInvokerIntegration 100% 88% 9
Datadog.Trace.ClrProfiler.AutoInstrumentation.Wcf.WcfCommon 89% 50% 4

View the full reports for further details:

@andrewlock
Copy link
Member

Benchmarks Report 🐌

Benchmarks for #1992 compared to master:

  • All benchmarks have the same speed
  • All benchmarks have the same allocations

The following thresholds were used for comparing the benchmark speeds:

  • Mann–Whitney U test with statistical test for significance of 5%
  • Only results indicating a difference greater than 10% and 0.3 ns are considered.

Allocation changes below 0.5% are ignored.

Benchmark details

Benchmarks.Trace.AgentWriterBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean Error StdDev Gen 0 Gen 1 Gen 2 Allocated
master WriteAndFlushEnrichedTraces net472 939.1 μs 14.06 μs 13.15 μs 0.0000 0.0000 0.0000 3.09 KB
master WriteAndFlushEnrichedTraces netcoreapp3.1 736.4 μs 10.36 μs 14.52 μs 0.0000 0.0000 0.0000 2.51 KB
#1992 WriteAndFlushEnrichedTraces net472 933.4 μs 14.52 μs 14.91 μs 0.0000 0.0000 0.0000 3.09 KB
#1992 WriteAndFlushEnrichedTraces netcoreapp3.1 773.1 μs 14.20 μs 13.28 μs 0.0000 0.0000 0.0000 2.51 KB
Benchmarks.Trace.AspNetCoreBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean Error StdDev Gen 0 Gen 1 Gen 2 Allocated
master SendRequest net472 0.0000 ns 0.0000 ns 0.0000 ns 0.0000 0.0000 0.0000 0 B
master SendRequest netcoreapp3.1 300,664.4450 ns 6,008.9434 ns 7,599.3894 ns 0.1502 0.0000 0.0000 19832 B
#1992 SendRequest net472 0.0101 ns 0.0205 ns 0.0400 ns 0.0000 0.0000 0.0000 0 B
#1992 SendRequest netcoreapp3.1 308,993.0946 ns 4,474.1193 ns 4,185.0940 ns 0.1598 0.0000 0.0000 19832 B
Benchmarks.Trace.DbCommandBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean Error StdDev Gen 0 Gen 1 Gen 2 Allocated
master ExecuteNonQuery net472 2.070 μs 0.0326 μs 0.0305 μs 0.1074 0.0011 0.0000 682 B
master ExecuteNonQuery netcoreapp3.1 1.993 μs 0.0278 μs 0.0309 μs 0.0101 0.0000 0.0000 712 B
#1992 ExecuteNonQuery net472 2.043 μs 0.0390 μs 0.0365 μs 0.1076 0.0010 0.0000 682 B
#1992 ExecuteNonQuery netcoreapp3.1 1.952 μs 0.0355 μs 0.0332 μs 0.0098 0.0000 0.0000 712 B
Benchmarks.Trace.ElasticsearchBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean Error StdDev Gen 0 Gen 1 Gen 2 Allocated
master CallElasticsearch net472 3.127 μs 0.0477 μs 0.0398 μs 0.1282 0.0000 0.0000 827 B
master CallElasticsearch netcoreapp3.1 1.691 μs 0.0337 μs 0.0299 μs 0.0108 0.0000 0.0000 768 B
master CallElasticsearchAsync net472 3.339 μs 0.0605 μs 0.0536 μs 0.1504 0.0000 0.0000 963 B
master CallElasticsearchAsync netcoreapp3.1 1.777 μs 0.0244 μs 0.0217 μs 0.0126 0.0000 0.0000 888 B
#1992 CallElasticsearch net472 3.207 μs 0.0617 μs 0.0606 μs 0.1282 0.0000 0.0000 827 B
#1992 CallElasticsearch netcoreapp3.1 1.713 μs 0.0171 μs 0.0143 μs 0.0111 0.0000 0.0000 768 B
#1992 CallElasticsearchAsync net472 3.346 μs 0.0348 μs 0.0291 μs 0.1499 0.0000 0.0000 963 B
#1992 CallElasticsearchAsync netcoreapp3.1 1.782 μs 0.0275 μs 0.0258 μs 0.0122 0.0000 0.0000 888 B
Benchmarks.Trace.GraphQLBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean Error StdDev Gen 0 Gen 1 Gen 2 Allocated
master ExecuteAsync net472 3.539 μs 0.0669 μs 0.0626 μs 0.1679 0.0000 0.0000 1083 B
master ExecuteAsync netcoreapp3.1 1.860 μs 0.0232 μs 0.0193 μs 0.0136 0.0000 0.0000 1008 B
#1992 ExecuteAsync net472 3.643 μs 0.0415 μs 0.0388 μs 0.1685 0.0000 0.0000 1083 B
#1992 ExecuteAsync netcoreapp3.1 2.019 μs 0.0368 μs 0.0616 μs 0.0139 0.0000 0.0000 1008 B
Benchmarks.Trace.HttpClientBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean Error StdDev Gen 0 Gen 1 Gen 2 Allocated
master SendAsync net472 6.312 μs 0.1261 μs 0.2107 μs 0.3503 0.0000 0.0000 2.19 KB
master SendAsync netcoreapp3.1 4.568 μs 0.0802 μs 0.1225 μs 0.0293 0.0000 0.0000 2.09 KB
#1992 SendAsync net472 6.038 μs 0.1056 μs 0.0987 μs 0.3493 0.0000 0.0000 2.19 KB
#1992 SendAsync netcoreapp3.1 4.339 μs 0.0516 μs 0.0457 μs 0.0307 0.0000 0.0000 2.09 KB
Benchmarks.Trace.ILoggerBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean Error StdDev Gen 0 Gen 1 Gen 2 Allocated
master EnrichedLog net472 4.020 μs 0.0676 μs 0.0632 μs 0.2869 0.0000 0.0000 1.79 KB
master EnrichedLog netcoreapp3.1 4.063 μs 0.0784 μs 0.1453 μs 0.0282 0.0000 0.0000 1.94 KB
#1992 EnrichedLog net472 3.986 μs 0.0666 μs 0.0623 μs 0.2866 0.0000 0.0000 1.79 KB
#1992 EnrichedLog netcoreapp3.1 3.990 μs 0.0794 μs 0.1903 μs 0.0284 0.0000 0.0000 1.94 KB
Benchmarks.Trace.Log4netBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean Error StdDev Gen 0 Gen 1 Gen 2 Allocated
master EnrichedLog net472 184.3 μs 3.67 μs 9.34 μs 0.6286 0.1796 0.0000 5.23 KB
master EnrichedLog netcoreapp3.1 147.5 μs 2.72 μs 2.41 μs 0.0718 0.0000 0.0000 5.05 KB
#1992 EnrichedLog net472 183.6 μs 3.47 μs 3.86 μs 0.7082 0.2656 0.0000 5.23 KB
#1992 EnrichedLog netcoreapp3.1 148.8 μs 2.82 μs 2.77 μs 0.0760 0.0000 0.0000 5.05 KB
Benchmarks.Trace.NLogBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean Error StdDev Gen 0 Gen 1 Gen 2 Allocated
master EnrichedLog net472 11.38 μs 0.152 μs 0.142 μs 0.8514 0.0000 0.0000 5.31 KB
master EnrichedLog netcoreapp3.1 10.22 μs 0.196 μs 0.218 μs 0.0886 0.0000 0.0000 6.28 KB
#1992 EnrichedLog net472 12.05 μs 0.231 μs 0.227 μs 0.8548 0.0000 0.0000 5.31 KB
#1992 EnrichedLog netcoreapp3.1 10.42 μs 0.206 μs 0.193 μs 0.0887 0.0000 0.0000 6.28 KB
Benchmarks.Trace.RedisBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean Error StdDev Gen 0 Gen 1 Gen 2 Allocated
master SendReceive net472 2.145 μs 0.0400 μs 0.0374 μs 0.1545 0.0000 0.0000 987 B
master SendReceive netcoreapp3.1 1.956 μs 0.0356 μs 0.0333 μs 0.0134 0.0000 0.0000 984 B
#1992 SendReceive net472 2.184 μs 0.0259 μs 0.0230 μs 0.1544 0.0000 0.0000 987 B
#1992 SendReceive netcoreapp3.1 2.012 μs 0.0235 μs 0.0196 μs 0.0141 0.0000 0.0000 984 B
Benchmarks.Trace.SerilogBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean Error StdDev Gen 0 Gen 1 Gen 2 Allocated
master EnrichedLog net472 7.751 μs 0.0965 μs 0.0805 μs 0.4483 0.0000 0.0000 2.8 KB
master EnrichedLog netcoreapp3.1 7.378 μs 0.0774 μs 0.0646 μs 0.0365 0.0000 0.0000 2.61 KB
#1992 EnrichedLog net472 8.197 μs 0.1557 μs 0.1380 μs 0.4462 0.0000 0.0000 2.8 KB
#1992 EnrichedLog netcoreapp3.1 7.649 μs 0.1492 μs 0.2652 μs 0.0373 0.0000 0.0000 2.61 KB
Benchmarks.Trace.SpanBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean Error StdDev Gen 0 Gen 1 Gen 2 Allocated
master StartFinishSpan net472 879.6 ns 8.94 ns 8.37 ns 0.0682 0.0000 0.0000 433 B
master StartFinishSpan netcoreapp3.1 928.3 ns 18.30 ns 36.97 ns 0.0060 0.0000 0.0000 432 B
master StartFinishScope net472 1,089.8 ns 11.48 ns 10.74 ns 0.0807 0.0000 0.0000 514 B
master StartFinishScope netcoreapp3.1 1,062.5 ns 21.11 ns 40.66 ns 0.0076 0.0000 0.0000 552 B
#1992 StartFinishSpan net472 910.1 ns 14.29 ns 13.37 ns 0.0680 0.0000 0.0000 433 B
#1992 StartFinishSpan netcoreapp3.1 895.7 ns 17.91 ns 16.75 ns 0.0062 0.0000 0.0000 432 B
#1992 StartFinishScope net472 1,057.4 ns 17.42 ns 16.30 ns 0.0806 0.0000 0.0000 514 B
#1992 StartFinishScope netcoreapp3.1 1,131.9 ns 22.66 ns 66.11 ns 0.0079 0.0000 0.0000 552 B

@pierotibou pierotibou merged commit d009bf0 into master Nov 12, 2021
@pierotibou pierotibou deleted the zach/wcf-request-message-error branch November 12, 2021 12:30
@github-actions github-actions bot added this to the vNext milestone Nov 12, 2021
zacharycmontoya added a commit that referenced this pull request Nov 15, 2021
Additional changes include:
- Move CreateScope method from ChannelHandlerIntegration.cs to WcfCommon.cs
andrewlock added a commit that referenced this pull request Nov 17, 2021
andrewlock added a commit that referenced this pull request Nov 18, 2021
andrewlock added a commit that referenced this pull request Nov 18, 2021
andrewlock added a commit that referenced this pull request Nov 19, 2021
zacharycmontoya added a commit that referenced this pull request Nov 22, 2021
Additional changes include:
- Move CreateScope method from ChannelHandlerIntegration.cs to WcfCommon.cs
zacharycmontoya added a commit that referenced this pull request Nov 23, 2021
zacharycmontoya added a commit that referenced this pull request Nov 23, 2021
…nment variable (#1992)" (#2081)

This reverts commit d009bf0.

The previous commit introduced a change to the CallTarget definition structure that passes integration information to the profiler, but this can cause issues if the profiler and the NuGet package versions mismatch.
zacharycmontoya added a commit that referenced this pull request Nov 29, 2021
Additional changes include:
- Move CreateScope method from ChannelHandlerIntegration.cs to WcfCommon.cs

Strip out all CallTarget changes and use the new CallTarget ByRef instrumentation to instrument WCF methods which have out parameters (thanks Tony!)
zacharycmontoya added a commit that referenced this pull request Nov 30, 2021
…ent variable (#2041)

* Port changes from #1992 into the release/2.0 branch.

Additional changes include:
- Move CreateScope method from ChannelHandlerIntegration.cs to WcfCommon.cs
- Instead of breaking the CallTarget native structures (causing the original PR to be reverted for 1.30.1), use the new CallTarget ByRef instrumentation to instrument WCF methods which have out parameters (thanks Tony!)
- Remove testing of CallSite codepaths in WcfTests
Kielek added a commit to signalfx/signalfx-dotnet-tracing that referenced this pull request Dec 9, 2021
* Upstream sync 4153faa..7fa9a93

* Don't throw in AgentWriterBenchmark (#2057)

Currently we're throwing every time we flush in the `AgentWriterBenchmark`. That's obviously very expensive, variable, and makes for difficult comparisons.

* Remove unused and defunct throughput test (#1802)

* [AppSec] Waf fixes: reading register for waf and unit tests for alpine (#2060)

* Remove musl suffix in TryLoadLibrary too
* Don't read memory if no reg key could be read
* typo fix
* Remove logging info
* simplify filter switch

* Remove dependency from the shared project to main project (#2043)

By including the dllmain.h file, we add a dependency between
the shared project that contains the loader and the main project
(today, the profiler C++ project)

__ImageBase is a linker (Windows) trick that allows us to get
the HINSTANCE and to remove the dependency

* Revert "Add modified WCF CallTarget instrumentation via opt-in environment variable (#1992)"

This reverts commit d009bf0.

* [2.0] - Remove CallSite instrumentation (#1958)

* Initial removal POC

* first commit to remove .net 4.5 support

* changes.

* change rejithandler pointer to smart pointer

* changes

* Fixes and refactoring.

* Add stats from the beginning to the end of a process

* Remove unused clr_helpers functions.

* Changes

* Changes

* Initial Remove of CallSite integrations.

* Remove testing callsite instrumentation

* Remove ADO.NET CallSite integrations

* Changes

* fix test

* fix test

* fix test

* fix

* remove integrations.json references

* fix build

* fix test

* fix windows installer

* changes

* fix native tests.

* fix merge

* Remove integrations.json references.

* -Remove integrations.json references
-Fix Tests
-CleanUp

* Fixes

* MSI installer fix

* Improve cor profiler

* Removes Datadog.Trace.ClrProfiler.Integrations namespace

* Fix CallTargetNativeTest without relying on integrations.json file.

* Upgrade ICorProfilerInfo6 to ICorProfilerInfo7.

* remove unnecessary IntegrationName field.

* Remove custom member data on tests.

* - Enable NGEN by default in containers.
- Fix Crank NGEN settings.

* [2.0] - Native stats fixes (#1967)

* Fixes a possible crash if Stats::ToString() is called more than once.

* fix algorithm on multiple Refresh() calls.

* [2.0] Additional cleanup from CallSite removal (#1966)

* Remove remaining CallSite references from PrepareRelease and Dependabot file manager
* Delete CallSite snapshots
* Remove .CallTarget from snapshot file names (everything is calltarget)
* Update expected snapshot names
* Inline now-constant booleans
* Remove references to DD_TRACE_NETSTANDARD_ENABLED and DD_TRACE_CALLTARGET_ENABLED - These keys doesn't do anything any more.
* Remove unused "enableCallTarget" parameter

This reverts commit 662c177ac6a96b20d18f2642e98635beb7d76220.

* [2.0] Drop support for .NET < 4.6.1 (#1957)

* Drop support for .NET < 4.6.1

* Switch header collections to struct (#1971)

* Switch header collections to struct

* Convert more collections and remove unused one

* [2.0] Mark deprecated settings obsolete (#1974)

* Mark DD_TRACE_ANALYTICS_ENABLED obsolete

App Analytics was replaced by Tracing without Limits last year.

* Mark DD_TRACE_LOG_PATH as deprecated

Prefer DD_TRACE_LOG_DIRECTORY instead

* Update public api

* [2.0] Remove obsolete APIs (#1952)

* Remove obsolete APIs

* Add missing namespace

* [2.0] - Refactor Integration struct, and remove unused code. (#1969)

* Refactor Native Integration struct.

* Remove unused code and refactor remaining for calltarget only.

* Rename all WrapperType references to IntegrationType

* Rename InegrationMethods to IntegrationDefinitions

* [2.0] - Adds a simple agent port checker on CI Visibility scenarios. (#1972)

* Adds a simple agent port checker on CI Visibility scenarios.

* Use AgentWriter.Ping to do the check.

* Restore target framework to netcoreapp3.1

* Address review comments.

* Write to stdout only on error.

* [2.0] Update default value of DD_TRACE_ROUTE_TEMPLATE_RESOURCE_NAMES_ENABLED to true (#1961)

* Update default value of DD_TRACE_ROUTE_TEMPLATE_RESOURCE_NAMES_ENABLED to true

* Update ASP.NET Core test snapshots

* Update ASP.NET test snapshots

* Update Datadog.Security.IntegrationTests to work with the new default AspNetCore spans

* Remove unused files (#1985)

* [2.0] Cleanup old factory method and tests (#1980)

* Delete old "Create" method

* Delete old tests that were never run, and wouldn't pass if they were

These are superseded by proper integration tests anyway

* Fix broken OpenTracing.IntegrationTests

These were never run, and wouldn't have passed if they were. Given this is the only testing of the OpenTracing agent, converted these to use the MockTracerAgent instead of removing

* Remove unused handlers and methods

* Update public API

* Remove commented out code

* Remove unnescessary copies. (#1989)

* [2.0] Make various APIs internal (#1968)

* Made Tags and SpanTypes associated specifically with our automatic instrumentation internal

* Mark duck types for integrations as not browsable

Also mark Azure functions integration as not browsable

* Make duck typing internal

The DuckType itself had to stay Public, as it's invoked by the proxies themselves for converting types

* Hide IgnoresAccessChecksToAttribute

* Update public API

Simplify test, because the API is now consistent between framework versions

* Make some more tags internal

* [2.0] Cleanup some more obsolete usages (#1997)

* Add workaround for "Unable move and reuse existing repository to required location."

* Delete EntityFramework6xMdTokenLookupFailure as this tested a callsite-specific code path

* Stop setting the obsolete AnalyticsEnabled in test apps

* [2.0] Add support for .NET 6 (#1885)

Co-authored-by: Andrew Lock <andrew.lock@datadoghq.com>

* Update gitlab build to use .NET 6 SDK (#2010)

* [2.0] Fix ASP.NET Core tests + snapshots (#2012)

* Fix ASP.NET Core integration tests should disable new setting as well as enable it

* Revert snapshot changes introduced in #1961

These _shouldn't_ have changed, the fact they did shows we did something wrong

* Fix owin integration tests

* Set AspNetMvc flag correctly

* Fix WebApi2 tests

* [2.0] Handle multiple tracer instances gracefully (#1990)

* Add TracerManager for managing the lifetime of "inner" tracer objects

The Tracer instance now delegates most of its behaviour to the TracerManager, and instead primarily becomes a "helper" on top of IScopeManager.

This changes the "main" API for users from being the Tracer constructor, to the static Tracer.ReplaceGlobalSettings(). This makes the real behaviour clear. We maintain the existing constructor for compatibility, but mark it obsolete.

The TracerManager is responsible for cleaning up resources when we replace global agentwriter/sampler etc. Currently it will _always_ replace the services, though we could make this smarter and not replace them if nothing has changed (which would reduce some overhead on startup for users that just setting a couple of things in code, like the service name)

Note that we now create a singleton IScopeManager, as logs injection with lib log can not support multiple concurrent traces.

Had to refactor Api.cs so as to not rely on Tracer.Instance.

This commit temporarily removes the LibLog integration, it will be added back in the next commit

* Update LibLogScopeEventSubscriber implementation

It doesn't actually require a tracer instance, it just requires a scope manager and the Tracing Without limits properties. This correctly creates a new lib log instance if settings change, and disposes the old one.

* Update tests and test apps

- Add a helper method that avoids the old constructors (which replace the global tracer)
- Replace some usages of the old APIs
- Replace usages of Tracer.Instance that cause issues in CI otherwise
- Fix snapshots

* Update SpanExtensions to explicitly pass a tracer instance

In the diagnostic observer (for example) this previously resulted in creating a separate Tracer instance (in practical terms, only during tests, but better to be explicit I think)

* #pragma ignore uses of obsolete Tracer.Instance set in benchmarks

* Move cleanup to a separate Task as it requires async work

* Fix cleanup of old TracerManager to be proper fire and forget

* Replace usages of obsolete API in benchmarks

* Pass TracerSettings into SpanExtensions.SetHttpCode instead of Tracer

* Use a callback to update sample rates instead of an ISampler instance

* Fix sampling rate test

* Feedback from PR

* fixes to previous commit

* Rename ReplaceGlobalSettings() -> Configure() (#2023)

* fold `IntegrationIds.ElasticsearchNet5` into `ElasticsearchNet` (#1983)

* [2.0] Use buster-based debian for building (#2032)

* Revert debian dockerfile to use buster instead of bullseye

* Update build_in_docker sdk version

* [2.0] Add .NET 6 WebApplicationBuilder tests (#2016)

* Add .NET 6 minimal APIs sample project

* Add integration tests for .NET 6 minimal APIs

Note that the NoFF /api/delay/0 tests have a different resource name than the equivalent ApiController version:  `GET /api/delay/?` instead of `GET api/delay/{seconds}`. This is because we don't run the endpoint routing events, so we end up using the "default" resource names, which don't include the route templates, even though they're technically available.

* Add snapshots for .net 6 webapplication/minimal api tests

* Add explicit DiagnosticObserver tests for .NET 6's WebApplicationBuilder

The most annoying thing here is how WebApplication adds an implicit UseRouting call at the start of the pipeline (before our existing middleware). That messes up the `UseMultipleErrorHandlerPipelines()` helper we had for testing multiple types of pipeline re-execution.

The upshot is that to handle the same paths, we need to build the pipeline in slightly different ways. This is "fine", just a bit annoying. I couldn't see a good alternative way around it, other than separating the executions to entirely different apps (but didn't want to that refactor here)

* Don't try and run diff tool inside CI

* [2.0] Minor restructuring of logging files (#2040)

* Move logging files to separate folders

To make it clear what's related to "internal logging" and what's related to "logs injection"

Keep the namespaces the same for now

* Inline unused static variable

* [2.0] - Detect current module at runtime and use it for PInvoke rewriting (#2044)

* Initial current module function for windows.

* Adds linux/osx version of the function

* Use current module filepath for PInvoke rewrites.

* Removing unnecessary code.

* Adds validation on Initialization.

* Add GET_MODULE_HANDLE_EX_FLAG_UNCHANGED_REFCOUNT flag.

* [2.0] Various small DiagnosticObserver optimisations (#2024)

* Make EndpointFeatureProxy a virtual class instead of an interface as slightly faster

* Use request.PathBase.Value not request.PathBase

Otherwise we fall back to string.Format()

* Use separate ASP.NET Core tags implementations

These more correctly represent the tags we can (and want) to set on the inner tags

* Calculate "obfuscated" urls on request end, not request start

We always throw away the calculation _except_ in error (404) cases or where middleware returns the response. By calculating it on exit, we avoid wasted work.

I'm not entirely clear of the Azure Functions behaviour here, so kept existing behaviour for that.

Also minor refactor to move the integrationsEnabled check outside of AspNetCoreHttpRequestHandler so we don't check it twice

* Remove unneeded properties from RequestTrackingFeature

We record these on the tags of the initial span anyway, so no need to duplicate

* Avoid calling HttpContext.Request.GetUrl() where possible

As this allocates. Do our best to avoid allocation by using PathBase and Path directly, recording the original values in the tracking feature.

This assumes that the Scheme and Host don't change, but those seem like reasonable assumptions to make here

* Remove unused field

* Use TryDuckCast instead of DuckCast

* Fix HttpRequestIn_PopulateSpan test as we no longer populate resource name at the start of the pipeline

* [2.0] Add Linux installer smoke tests (#2036)

* minor: don't install so much in update-github-status-jobs.yml

* minor: remove checkout from generator steps

* Add an ASP.NET Core smoke test app

Sends a request to itself, confirms there's no errors, and crashes if there are. Enforces that the profiler is attached. Doesn't use the existing SampleHelpers to make building in docker easier

* Add a smoke-test stage for the Linux installers

- Builds and publishes an ASP.NET Core app using the .NET 6 SDK, for a specific framework version
- Downloads the linux installer binaries from the current pipeline
- Installs the binaries in a given runtime image
- Runs the smoke test application. Fails the job on non-zero exit code

* Add arm64 debian installer tests

* Add testing of fedora images

* Fix incorrect alpine tag for .NET Core 2.1

* [2.0] Relax duck type visibility requirements (#2045)

* Add [DoesNotReturn] to throw helpers

Fixes some null/code-flow IDE issues
Remove unused exception

* fix typo

* Remove unused method

* Remove visibility checks for DuckTypes

As we no longer support NET45, we don't need our duck types to be public.

* Couchbase instrumentation (#1925)

* Instrumentation
* ITests (no ARM)

* Switch to using curl to send AzDo build GitHub status (#2049)

Installing .NET 6 and using Nuke for this is a bit heavyweight, as it potentially slows down every stage by 2/3 mins. Using curl reduces that to 2s

* Remove .bat files from repository root (#1763)

* Remove script files from repository root

This is obviously opinionated, but I don't use them, so noone else can either, right?!

* Move pre-build-event-cpp.bat file to the native file (as that's where it's used anyway)

* Revert "Couchbase instrumentation (#1925)" (#2054)

This reverts commit 1bea1dfaa6e92f2c4736eed2defe5c608a848648.

* Fix snapshots introduced in #1938 but not renamed in #1966

* [2.0] Add ImmutableTracerSettings (#1951)

* Add ImmutableTracerSettings

Changing TracerSettings after they've been passed to a Tracer instance is not valid and could lead to unpredictable results. To try and emphasise that, this adds a new ImmutableTracerSettings type, that does not allow changing any properties

For backwards compatability, I've kept the TracerSettings object as not-obsolete, as the de-facto "builder" for ImmutableTracerSettings. This also avoids the need to use any base classes, avoids having to duplicate initialization code, and also avoids the need for a dedicated "builder" type for the immutable settings (we can't use `init set;` properties because of missing references unfortunately)

* Make some mutable properties immutable

Update the readonly properties on ImmutableTracerSettings to be "immutable" collections. The biggest change here is ImmutableTracerSettingsCollection, which is a little bit clumsy, but seemed to be the best way to make Integrations immutable-ish.

* Add test that immutable tracer settings is immutable

* Update public API

* Add unit test to confirm ImmutableTracerSettings contains all (most) of the properties on TracerSettings

* ImmutableIntegrationSettings is actually immutable!

* Delete AdoNetExcludedTypes as will be removed in #2008

* Use IReadOnlyDictionary for HeaderTags and GlobalTags

* Revert "Delete AdoNetExcludedTypes as will be removed in #2008"

This reverts commit 8c6602d1c19c55084c66a5153fe437ae1fd22257.

* Fix public API

* [2.0] Simplify IntegrationSettingsCollection (#2048)

* Replace usages of IntegrationInfo with IntegrationIds

Thanks to the version conflict resolution, we will never have integrations where the integration name is unknown. Therefore we don't need to keep the dictionary of names, and we can use IntegrationIds directly instead of IntegrationInfo

If a customer is using manual tracing, and calls TracerSettings.Integrations["SomeIntegration"] with an unknown integration name, we will now just return a "default" IntegrationSettings, and log a warning.

Replaced all the old usages of IntegrationInfo with IntegrationIds

* fix comment

* Rename IntegrationIds -> IntegrationId

* Fix obsolete API usages (#2056)

* Allow to run ITests locally on Mac (#2063)

* Restoring Couchbase instrumentation (#2064)

* Revert "Revert "Couchbase instrumentation (#1925)" (#2054)"
This reverts commit 803f27b573fc24a6cede4c3807ac502f49404744.

With the addition of a few changes around the sample app, and added one method for particular v3 versions.

Co-authored-by: Andrew Lock <andrew.lock@datadoghq.com>

* Try switch dependabot generation to use pull_request_target (#2055)

* Try switch dependabot generation to use pull_request_target

This is all a bit confusing, but I think this will resolve our permission issues, and it's safe because we're only checking out code from `master/main`, never the PR code itself

* Bump .NET SDK version used in workflows

* Remove the newline between the log message and properties (#2070)

Currently the properties (e.g. `{ MachineName: ".", Process: "[44096 iisexpress]", AppDomain: "[2 /LM/W3SVC/1/ROOT-1-132820633973714685]", TracerVersion: "1.30.0.0" }`) are written on a new line _after_ the log line. This can make it difficult to tell whether a given set of properties is related to the line above or the line below. This change removes the newline.

* [2.0] Split integration id `AdoNet` into individual ids per db type (#2008)

Co-authored-by: Zach Montoya <zach.montoya@datadoghq.com>

* remove unused class MethodBuilder and test references (#2074)

* Fix disabled integrations log message (#2078)

Was trying to serialize the whole integration setting

* [AppSec] Upgrade WAF rule set (#2086)

* Upgrade WAF rule set

* Update ruleset in unit tests too and waf tests

Co-authored-by: Anna <anna.yafi@datadoghq.com>

* [AppSec] Change waf initialization to have better logging (#2068)

* Refactoring of waf initialization, no more static constructor, split classes

* Link the ruleset instead of duplicating it in unit tests

* Remove CIApp Feature Tracking. (#2088)

* Just add a pointer to the ms doc (#2093)

* Add space in log line (#2094)

* [AppSec] Report attacks by the _dd.appsec.triggers tag (#1923)

* Report attacks by the _dd.appsec.triggers tag

It's been decided to sent the app sec event data via a tag to remove the
need for the backend to correlated two data sources.

I also used this to fix several related issue:
- report all attacks generated by the waf
- report all matches within an attack using the new rule_matches field
- add the standard logs about attack data

APPSEC-1839
APPSEC-1541
APPSEC-983

Co-authored-by: Andrew Lock <andrew.lock@datadoghq.com>
Co-authored-by: Pierre Bonet <pierre.bonet@datadoghq.com>

* Redesign package version generator to allow more granular control (#2091)

* Bump package versions with existing task

* Delete unused file

* Rename Program -> PackageVersionGenerator

* Expand PackageVersionEntry format

This allows us to be more flexible about which package versions we run. This is necessary as certain frameworks that we support cease being supported by NuGet packages (e.g. npgsl 6.0 isn't supported on netcoreapp2.1)

Takes the target frameworks from the samples themselves, so shouldn't need updating often

* Run GeneratePackageVersions with new format

The generated files are much more granular, but should be equivalent

* Downgrade tests on MSTest to 2.2.7 - there's a bug in the instrumentation of 2.2.8

* CallTarget ByRef (#2092)

* Initial work for calltarget by ref.

* Fix CallTarget ref

* Fixes

* Fix mapper.

* Fixes.

* revert config.

* Ensure compatibility when using and old native profiler by keeping an overload without the ref.

* Ensure profiler compatibility with older managed version, by a new PInvoke to enable byref instrumentation.

* Fix native test.

* Ensures arguments are by ref in the managed side.

* Add helper for collecting host metadata (#2097)

* Add helper for collecting host metadata

i.e. Hostname, Linux kernel information
Required for some new features (e.g. log shipping, telemetry)

* Use EnvironmentHelpers and add named params

* CallTarget ByRef: Fix ByRef Generics arguments (#2099)

* Fix ByRef Generic arguments

* Update tracer/src/Datadog.Trace/ClrProfiler/CallTarget/Handlers/IntegrationMapper.cs

Co-authored-by: Andrew Lock <andrew.lock@datadoghq.com>

Co-authored-by: Andrew Lock <andrew.lock@datadoghq.com>

* [Appsec] Fix duplicate key in cookies when building waf args (#2098)

* Fix duplicate key in cookies

* Make sure no duplicate key

* Uniformize order

* Fix looping through cookies, otherwise always same value

* Add lists for all headers, cookies, query strings to make sure we handle them in case of different IIS behavior

* Specializing for .net core with try add

* Fix  else query string

* [2.0] Add modified WCF CallTarget instrumentation via opt-in environment variable (#2041)

* Port changes from DataDog/dd-trace-dotnet#1992 into the release/2.0 branch.

Additional changes include:
- Move CreateScope method from ChannelHandlerIntegration.cs to WcfCommon.cs
- Instead of breaking the CallTarget native structures (causing the original PR to be reverted for 1.30.1), use the new CallTarget ByRef instrumentation to instrument WCF methods which have out parameters (thanks Tony!)
- Remove testing of CallSite codepaths in WcfTests

* [2.0] remove leftover references to `integrations.json` and `DD_INTEGRATIONS` (#2073)

* remove references to integrations.json and DD_INTEGRATIONS

* fix build step description (but don't change it's name to avoid breaking CI)

* remove DD_INTEGRATIONS from AppSec loader

* Change the CallTargetReturn struct to a ref struct (#2103)

* Update supported ADO.NET versions (#2077)

* Bump supported versions of ADO.NET providers

* Update generator definitions

Note that we have not been testing the latest Npgsql release

Also Npgsql 6.0.0 does not support netcoreapp2.1 or netcoreapp3.0

* Update package versions

* Don't use Assembly.LoadFrom() with v4.0.0 as it crashes

Looks unrelated to the profiler, prob a bug in 4.0.0

* Version 4.0.0 doesn't work on Linux (can't connect)

* Explicitly set CheckEolTargetFramework=false in linux sample builds

For some reason these aren't picking it up from elsewhere

* Microsoft.Data.Sqlite hangs on Alpine with 6.0.0 on .NET Core 3.0

So skip the test. It hangs whether we are attached or not, so skip the test.

* Fix host metadata kernel information (#2104)

* update CLSID_CorProfiler after upstream

* update old log path after rebase

* fix max System.ServiceModel version. There is no 5.0 version

* fix DD_ env variables

* env variables - ultimate ultimate-pipeline

* propagate Propagators from TracerSettings to ImmutableTracerSettings

* use TestHelper class instead of direct call to Tracer constructor

* Use real command for CreateDbCommandScope_SetsDbStatementBasedOnCommandText test
There is no longer support dor non standard IDbCommands implementations

* remove not used property ConfigurationSource

* remove setter from ImmutableTracerSettings

* fix TracerManager documentation comment

* propagate plugins from Instrumentation class

* bump donet sdk to 6.0 for CI pipeline

* fix compilation issues after next-ver merge

* remove .net 4.5.2 from integration tests and upstream sync documentation
support was dropped on upstream

* fix compilation issues

* remove leftovers for SIGNALFX_INTEGRATIONS

* install all neded dotnet versions

* fix reference to snk file

* use DataDog.Agent exporter for ReplacingGlobalTracerManagerMidTraceWritesTheTrace

* use DataDogAgent exporter in OpenTracingSendTracesToAgent

* fix internalsvisible to argument for CallTargetNativeTest

* update verified files

* fix OpenTracingSendTracesToAgent

* verified files for .NET6

* verified files for .NET5, .NET 4.6.1 .NET Core 3.1

* fix Couchbase tests

* fix HttpContextKey typos

* update WcfTests verified files

* update OwinWebApi2Tests based on next-ver branch

* fix launchSettings.json form Samples.Couchbase

* update documentation after upstream sync

* Skip Couchbase3 tests
locally, upstream code ends with the same issue:
Expecting at least 9 spans, only received 0

* update upstream sync documentation

Co-authored-by: Andrew Lock <andrew.lock@datadoghq.com>
Co-authored-by: Colin Higgins <colin-higgins@users.noreply.github.com>
Co-authored-by: Anna <anna.yafi@datadoghq.com>
Co-authored-by: Gregory LEOCADIE <gregory.leocadie@datadoghq.com>
Co-authored-by: Tony Redondo <tony.redondo@datadoghq.com>
Co-authored-by: Kevin Gosse <krix33@gmail.com>
Co-authored-by: Zach Montoya <zach.montoya@datadoghq.com>
Co-authored-by: Lucas Pimentel-Ordyna <lucas.pimentel@datadoghq.com>
Co-authored-by: Pierre Bonet <pierre.bonet@datadoghq.com>
Co-authored-by: Robert Pickering <robert.pickering@datadoghq.com>
andrewlock added a commit that referenced this pull request Nov 21, 2025
## Summary of changes

Fixes orphan traces in WCF

## Reason for change

Many customers have reported that in v3, with "delayed WCF"
instrumentation enabled, they get orphan spans. This can be easily
reproduced, so this PR fixes the issue.

## Background

There are two "modes" WCF can operate in: delayed instrumentation, and
"simple" instrumentation. Delayed instrumentation was introduced in
#1992 to avoid a
problematic error scenario. Instead of instrumenting early across the
whole "channel", we delayed instrumentation. A subsequent PR
#5206 updated the
instrumentation point to ensure the wcf span was active while
`IDispatchMessageInspector` were running.

This was somewhat complex, due to the convoluted and legacy way wcf uses
async and threading. Basically, we use a weak table to hold the current
span, and us the `MessageRpc` object as a key. Later on, we retrieve the
scope and dispose it.

## Implementation details

The missing part was that we were not "restoring" the wcf scope for when
the actual Invoker implementations executed. The process was:

- Update the tests and snapshots to:
- Record _all_ spans and traces (it makes it easier to [understand the
snapshots
using](https://andrewlock.github.io/DatadogSpanSnapshotViewer/))
  - Create a span inside the "service" calls.
- These _should_ have been children of the wcf server spans, but they
were not. They were orphaned
- Update the "Invoker" implementations to grab the wcf spans, make it
active, and then deactivate it afterwards
- Regenerate the snapshots, and confirm the custom spans are no longer
orphaned 🎉

Additionally, do a small amount of refactoring:
- "Fix" nullability in some places
- Extract the "restore wcf span" functionality to `WcfCommon` as it's
used in multiple places

## Test coverage

Covered by existing, after the tweaks above. The snapshots change like
this:

Before the fix, after adding the additional custom spans,
`WSHttpBinding` traces:

<img width="1468" height="622" alt="image"
src="https://github.com/user-attachments/assets/7973514c-660e-4871-8c4d-7495473e9944"
/>

After the fix, `WSHttpBinding` traces:

<img width="1459" height="96" alt="image"
src="https://github.com/user-attachments/assets/0698e5d3-cb06-4ea6-ab00-5a3680f7e3ee"
/>

Before the fix, after adding the additional custom spans, WebHttp
traces:

<img width="1458" height="268" alt="image"
src="https://github.com/user-attachments/assets/b0a9f4c9-8584-4fa2-9441-74268d6fd279"
/>

After the fix, Web Http traces

<img width="1465" height="126" alt="image"
src="https://github.com/user-attachments/assets/b06f0255-0818-4516-a6a1-b1807c9781ad"
/>

Unfortunately, this ends up being somewhat unstable in regards to
sorting, so I did some more refactoring to separate the root span for
each separate httpcall. This should make sure we have more stable
ordering of the spans using the built-in sort order

The result is we get separate traces for each API call instead:

<img width="1458" height="713" alt="image"
src="https://github.com/user-attachments/assets/989a0cc8-befa-429f-9a9e-ede09c2abc48"
/>



## Other details

I wonder if we really need a weak table here? 🤔 seeing as we have an
instrumentation point that explicitly disposes the span, shouldn't we
just remove the entry at the same time?
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:automatic-instrumentation Automatic instrumentation managed C# code (Datadog.Trace.ClrProfiler.Managed) type:bug

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants