Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Baggage propagation is not possible for instrumentations without depending on OpenTelemetry.Api package #5667

Open
lmolkova opened this issue Jun 1, 2024 · 13 comments
Labels
needs-runtime-change Issues which likely require changes from dotnet runtime - typically DiagnosticSource package pkg:OpenTelemetry.Api Issues related to OpenTelemetry.Api NuGet package

Comments

@lmolkova
Copy link

lmolkova commented Jun 1, 2024

Package

OpenTelemetry.Api

Package Version

Package Name Version
OpenTelemetry.Api 1.8.0
OpenTelemetry 1.8.0

Description

In Azure SDKs we want to be able to propagate baggage between messaging producer and consumer.

However there are two baggages:

  • System.Diagnostics.Acitivity.Baggage - legacy(?) API that uses Correlation-Context header (by default via default implementation of DistributedContextPropagator)
  • OpenTelemetry.Baggage.Current - OTel baggage available in OpenTelemetry.Api package.

OTel discourages usage of Activity.Baggage:

OpenTelemetry users should not use the `Activity.AddBaggage` method.

As a result:

  • It's not possible to use OTel baggage without taking dependency on OpenTelemetry.Api - this is a showstopper for Azure SDK instrumentations
  • Users have inconsistent baggages that don't work together

Expected Result

OTel and BCL should provide just one way to work with the baggage.

In the meantime, OTel can still provide consistent experience in the following way:

  1. Implement System.Diagnostics.DistributedContextPropagator that does OTel-compatible baggage propagation: W3C + baggage header. It also reads any context provided in the Activity.Baggage (and never sets it)
  2. Configure that propagator on DistributedContextPropagator.Current - at least as an opt-in mechanism

Here's a PoC (tested for general happy case)

class OTelBridgePropagator : DistributedContextPropagator
{
    private readonly TextMapPropagator _inner;
    public OTelBridgePropagator(TextMapPropagator propagator)
    {
        _inner = propagator;
        Fields = new ReadOnlyCollection<string>(_inner.Fields.ToArray());
    }

    public override IReadOnlyCollection<string> Fields { get; }

    private PropagationContext Extract(object? carrier, PropagatorGetterCallback getter)
    {
        return _inner.Extract(default, carrier, (c, n) =>
        {
            getter.Invoke(c, n, out var value, out var values);
            return values;
        });

    }
    public override IEnumerable<KeyValuePair<string, string?>>? ExtractBaggage(object? carrier, PropagatorGetterCallback? getter)
    {
        if (getter == null)
        {
            return null;
        }

        return Extract(carrier, getter).Baggage.GetBaggage();
    }

    public override void ExtractTraceIdAndState(object? carrier, PropagatorGetterCallback? getter, out string? traceParent, out string? traceState)
    {
        traceParent = null;
        traceState = null;

        if (getter != null)
        {
            var context = Extract(carrier, getter);
            var flags = (context.ActivityContext.TraceFlags == ActivityTraceFlags.Recorded) ? "01" : "00";
            traceParent = $"00-{context.ActivityContext.TraceId}-{context.ActivityContext.SpanId}-{flags}";
            traceState = context.ActivityContext.TraceState;
        }
    }

    public override void Inject(Activity? activity, object? carrier, PropagatorSetterCallback? setter)
    {
        if (setter == null)
        {
            return;
        }

        if (activity != null)
        {
            foreach (var kvp in activity.Baggage)
            {
                Baggage.Current.SetBaggage(kvp.Key, kvp.Value);
            }
        }

        var context = new PropagationContext(activity?.Context ?? default, Baggage.Current);
        _inner.Inject(context, carrier, setter.Invoke);
    }
}

Configuration

var otelPropagator = new CompositeTextMapPropagator(
    new TextMapPropagator[] {
        new TraceContextPropagator(),
        new BaggagePropagator(),
    });
Sdk.SetDefaultTextMapPropagator(otelPropagator); 
DistributedContextPropagator.Current = new OTelBridgePropagator(otelPropagator); // this should be done by OTel SDK

As a result:

  • users can add baggage using Baggage.Current or via Activity.Baggage - both will end up in baggage header (or whatever is configured withSetDefaultTextMapPropagator
  • Client/instrumentation libraries can use DistributedContextPropagator.Current to propagate context - it will be consistent with one in the Sdk.SetDefaultTextMapPropagator

Related issues:

@lmolkova lmolkova added the bug Something isn't working label Jun 1, 2024
@github-actions github-actions bot added the pkg:OpenTelemetry.Api Issues related to OpenTelemetry.Api NuGet package label Jun 1, 2024
@lmolkova
Copy link
Author

lmolkova commented Jun 1, 2024

Happy to contribute the change if there are no objections.

@cijothomas
Copy link
Member

However there are two baggages:

System.Diagnostics.Acitivity.Baggage - legacy(?) API that uses Correlation-Context header (by default via default implementation of DistributedContextPropagator)
OpenTelemetry.Baggage.Current - OTel baggage available in OpenTelemetry.Api package.

Yes. A problem we've been facing from day1, thanks for looking into this!

  1. Now that W3CBaggage spec has moved to be stable, this issue needs higher priority. We need to see if .NET Runtime is willing to offer a Baggage API (independent of Activity). Not likely to happen in .NET 9 timeframe, but .NET 10 can be targeted.

  2. Also, OTel's own propagator API should be deprecated in favor of the ones coming from runtime. There were some investigations done on how to achieve this, but nothing solid yet.

Is the above proposal meant to short-term unblock libraries (like Azure SDK for example), to benefit from Baggage, without taking a dependency on OTel.API? While I don't object to that, I think we should solve the above 2 fundamental issues at the root, so there won't be a need of any workarounds. 2 is solvable just within OTel repo, but 1 needs runtime support (i don't think there is a tracking issue, so need to create one).

@cijothomas cijothomas added needs-runtime-change Issues which likely require changes from dotnet runtime - typically DiagnosticSource package and removed bug Something isn't working labels Jun 3, 2024
@lmolkova
Copy link
Author

lmolkova commented Jun 3, 2024

Since the problem in not going to be fixed in at least 1.5 years, I think workaround that does not introduce any public API and will disappear without the trace along with OTel propagators is quite reasonable.

It's not going to solve p1, but it will:

  • make .NET native HTTP instrumentations inject the context using the propagator configured by users on OTel. I don't know how otherwise we can have native .NET instrumentations
  • support baggage in non-HTTP instrumentations without dependency on OTel.Api

The core part of the above proposal is to do DistributedContextPropagator.Current = new OTelBridgePropagator(otelPropagator); in the OTel SDK whenever someone calls Sdk.SetDefaultTextMapPropagator (or when tracer provider is built with a default one).
So users can keep using either: Baggage.Current or Activity.Baggage (if that's what they did), there is no need to unify those APIs to make the propagator bridge work.

I'm also fine not adding any support for Activity.Baggage in the bridge propagator.

@vishweshbankwar
Copy link
Member

vishweshbankwar commented Jun 3, 2024

DistributedContextPropagator.Current = new OTelBridgePropagator(otelPropagator); // this should be done by OTel SDK

@lmolkova FYI - This could run into some sequencing issues and may not work in all cases if set within Sdk. See https://github.com/open-telemetry/opentelemetry-dotnet/pull/3769/files#r1003730495.

Overall I think this intermediate step will also be needed for supporting OTel Baggage with native instrumentation coming in .NET9.0. (Assuming we don't get baggage API in .NET)

@lmolkova
Copy link
Author

lmolkova commented Jun 3, 2024

This could run into some sequencing issues and may not work in all cases if set within Sdk. See https://github.com/open-telemetry/opentelemetry-dotnet/pull/3769/files#r1003730495.

Good catch! This is not a new problem though. The Sdk.SetDefaultTextMapPropagator should also be called at the right time by the user.

Overall I think this intermediate step will also be needed for supporting OTel Baggage with native instrumentation coming in .NET9.0.

Not just baggage, but any non W3C context propagation - if users configured B3 on otel, native instrumentations need to be aware of this.

As an alternative, OTel may make OTel propagators implement DistributedContextPropagator, deprecate Sdk.SetDefaultTextMapPropagator and ask users to set DistributedContextPropagator.Current first thing after app startup (however ugly it is)

@cijothomas
Copy link
Member

So users can keep using either: Baggage.Current or Activity.Baggage (if that's what they did), there is no need to unify those APIs to make the propagator bridge work.

mm.. Not so sure given there is nothing taking care of in-proc propagation. The propagator API tweak can only fix part of the problem by propagating baggages set via both Activity.Baggage and Baggage via "baggage" header(or whatever configured header). But within process, neither Activity.Current.Baggage nor Baggage.Current is aware of each other....

@cijothomas
Copy link
Member

Overall I think this intermediate step will also be needed for supporting OTel Baggage with native instrumentation coming in .NET9.0.

Not just baggage, but any non W3C context propagation - if users configured B3 on otel, native instrumentations need to be aware of this.

This is already broken even today. The proposed native instrumentation in .NET 9 is about populating attributes (tags). Asp.Net Core/HttpClient are already natively doing the context propagation part, and they don't respect OTel Propagators. (this is compensated for by our instrumentation libraries, by sometimes doing the extreme of throwing away the Activity created from the asp.net core, and creating a fresh one following otel propagators, leading to poor perf!)

As an alternative, OTel may make OTel propagators implement DistributedContextPropagator, deprecate Sdk.SetDefaultTextMapPropagator and ask users to set DistributedContextPropagator.Current first thing after app startup (however ugly it is)

This makes sense. OTel's propagators must deprecate itself, and fully embrace DistributedContextPropagator from runtime. (just like OTel .NET decided to sacrifice its own API in favor of Activity API from runtime)

@lmolkova
Copy link
Author

lmolkova commented Jun 3, 2024

mm.. Not so sure given there is nothing taking care of in-proc propagation. The propagator API tweak can only fix part of the problem by propagating baggages set via both Activity.Baggage and Baggage via "baggage" header(or whatever configured header). But within process, neither Activity.Current.Baggage nor Baggage.Current is aware of each other....

Sure, but it seems the baggage API and propagation are two independent concepts, right?
So we can break this problem down in two separate things:

  • how to add baggage, which API to use, etc - this is runtime issue to solve. Not sure if there an existing one already, happy to create.
  • propagator implementation compatible with OTel and .NET - e.g. B3 DistributedContextPropagator should probably exist in OTel, but not necessarily .NET

@lmolkova
Copy link
Author

lmolkova commented Jun 3, 2024

It seems we're in agreement that we need to use DistributedContextPropagator and sunset OTel propagators. I can join .NET SIG tomorrow to discuss it further.

@cijothomas
Copy link
Member

It seems we're in agreement that we need to use DistributedContextPropagator and sunset OTel propagators. I can join .NET SIG tomorrow to discuss it further.

Yes💯! It was always the plan, but never had the bandwidth to make it happen. @vishweshbankwar did some exploration, and should be able to assist as well.

I don't have strong preference over keeping OTel propagators as a thin wrapper around DistributedContextPropagator, similar to how TelemetrySpan API acts as a thin wrapper around Activity, on an opt-in basis. We will need to be fully back-compatible, as we are not planning major version bump to 2.0!

Anyway, these are details, and can be addressed in implementation phase.

@martinjt
Copy link
Member

martinjt commented Jun 5, 2024

I'd be interested as to what came out of the discussion at the SIG, can someone detail that here?

@lmolkova
Copy link
Author

lmolkova commented Jun 5, 2024

sure, the summary:

  • this is the problem for any instrumentation, including native ones coming in .NET 9
  • there is a consensus on relying purely on DistributedContextPropagator and sunsetting TextMapPropagators
  • the Otel-bridge implementation of DistributedContextPropagator needs to do some tricks
    • don't break Activity.Baggage + Correlation-Context (i.e. if someone uses it, it should keep working)
    • it has to populate Baggage.Current in extract methods
  • ASP.NET Core caches the value of DistributedContextPropagator.Current
    • when user sets DistributedContextPropagator.Current, they control when it happens
    • when otel is built it might be too late and won't update the instance ASP.NET Core uses
      • I will check what can be done and reach out to ASP.NET Core to change things if necessary
      • Perhaps it's not too late to change how ASP.NET Core uses Activity.Baggage (maybe they can let users opt-out or configure baggage instead of correlation-context) - I will check

Assuming we can make things above work, there are no objections from going ahead with DistributedContextPropagator implementations, it was the plan, SIG just didn't get to it.
I should be able to work on it (but can't make time commitments beyond ASP.NET Core investigation in v9 timeframe).

@cijothomas
Copy link
Member

this is the problem for any instrumentation, including native ones coming in .NET 9

They are no longer part of .NET 9 :(
dotnet/aspnetcore#52439 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-runtime-change Issues which likely require changes from dotnet runtime - typically DiagnosticSource package pkg:OpenTelemetry.Api Issues related to OpenTelemetry.Api NuGet package
Projects
None yet
Development

No branches or pull requests

4 participants