Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Environment Variables as Carrier for Inter-Process Propagation to transport context #740

Open
mponnambalam opened this issue Jul 25, 2020 · 33 comments
Labels
release:after-ga Not required before GA release, and not going to work on before GA spec:context Related to the specification/context directory triage:accepted:needs-sponsor Ready to be implemented, but does not yet have a specification sponsor

Comments

@mponnambalam
Copy link

Passing context information while a process being spawn as part of a span using a shell script or process builders.

Propagating context information while creating a process for another process, to trace the lineage, there isn't any protocol such as Http/gRPC to transport context. A common way to pass information using Environment Variables, which shall be utilized as a carrier.

Environment variables have differences between the operating systems, such as case sensitivity in Linux and case insensitivity in windows.

Defining specification/guidance, to have a common format and method for propagation using Environment Variables, will be solving many ETL tools and batch cases.

@mponnambalam mponnambalam added the spec:context Related to the specification/context directory label Jul 25, 2020
@carlosalberto carlosalberto added the release:after-ga Not required before GA release, and not going to work on before GA label Jul 28, 2020
@cyrille-leclerc
Copy link
Member

cyrille-leclerc commented Aug 24, 2021

FYI an ecosystem of CI/CD platforms and tools is rapidly growing around propagation of Otel trace contezxt using the TRACEPARENT and TRACECONTEXT environment variables and we think we are mature enough to propose a standardisation of trace context propagation using environment variables.

Existing integrations of CI/CD tools using OpenTelemetry trace context propagation through environment variables (TRACEPARENT and TRACECONTEXT):

cc @tobert , @v1v, @kuisathaverat

Jenkins Maven pipeline

Jenkins otel-cli pipeline

Ansible pipeline

@tobert
Copy link

tobert commented Aug 25, 2021

I am the author of https://github.com/equinix-labs/otel-cli and have been using TRACEPARENT propagation over envvars for a few months now. We're doing this work in production at Equinix Metal, and I believe a few others are using otel-cli in production now including GitHub. Please let me know how I can help make this OTEP happen.

@Div95
Copy link

Div95 commented Sep 15, 2021

Hi, we have developed a propagator that helps propagate trace/baggage details across processes using environment variable as the carrier. It supports different formats for the trace details (b3/w3c/ some custom format).

I am author of https://github.com/Div95/opentelemetry-python/tree/feature/env_propagator/propagator/opentelemetry-propagator-env which has this solution in Python language.
We are making use of 'os.environ' for carrier here.

We'd like to help move this issue forward.

@aidansteele
Copy link

@cyrille-leclerc you mention TRACECONTEXT but it looks like those tools actually use TRACESTATE - can you confirm if that's just a typo?

@aidansteele
Copy link

aidansteele commented Sep 21, 2021

Thanks to @tonistiigi in rakyll/go-test-trace#2 I have learned that Buildkit uses OTEL_TRACE_PARENT and OTEL_TRACE_STATE. It looks like this was inspired by opentelemetry-swift using these in it's EnvironmentContextPropagator. It was added in open-telemetry/opentelemetry-swift#170.

@tonistiigi
Copy link

@aidansteele Indeed, these are exposed to every process running in Dockerfile RUN statement when tracing is enabled, as well as the env for otlp (forwarder) socket that is exposed to the container. Not a huge issue if we should need to switch to a different env or support both, but to my taste OTEL_ variables look less ambiguous and more in line with the env variables already defined by the spec.

@rakyll
Copy link
Contributor

rakyll commented Sep 22, 2021

Traceparent and Tracestate from W3C Trace Context specification are not a part of OTel. W3C Trace Context is a standalone spec, Otel is one of the implementors. If we choose to prefix things with OTEL_, it'd not be great for other instrumentation libraries that want to provide propagation via environmental variables. They will eventually came up with their own and it will fragment the space. It'd be better if we don't tightly couple the environment variable name with OTel, but I agree that TRACEPARENT and TRACESTATE are not specific enough.

@tonistiigi
Copy link

cc @nachoBonafonte @SergeyKanzhelev (from the EnvironmentContextPropagator implementation)

@tobert
Copy link

tobert commented Sep 22, 2021

otel-cli is using TRACEPARENT and I've put it into a lot of code and there is a lot to be said for keeping the short name in preference to more descriptive names. It doesn't matter much for otel-cli but a really common mod I've made is adding traceparent headers to curl calls in shell scripts. otel-cli might generate the span parent then we do:

curl -H "traceparent: ${TRACEPARENT:-}" $url

Maybe I'm too deep into this stuff and acclimated to the variable names, but this reads well and looks obvious in the code I've submitted to various projects. Making the variable names longer would make it a bit more awkward to inject this into existing code and the mismatched header/variable name feels wrong to me.

@yurishkuro
Copy link
Member

but I agree that TRACEPARENT and TRACESTATE are not specific enough

They seem no more or less specific than the respective HTTP headers. @rakyll do you have some specific downsides in mind from using these as is, in accordance with the W3C spec for http headers?

Traceparent and Tracestate from W3C Trace Context specification are not a part of OTel. W3C Trace Context is a standalone spec, Otel is one of the implementors.

On the other hand, I am not aware of any standards body that can bless the name of an environment variable.

My vote is to use TRACEPARENT and TRACESTATE, and optionally provide a config capability in the propagator to override the names should some users come across naming conflicts.

@cyrille-leclerc
Copy link
Member

Using TRACEPARENT and TRACESTATE felt natural when we built the Jenkins OpenTelemetry Plugin and the OpenTelemetry Maven Extension and it seamlessly integrated with @tobert's otel-cli.

The integration using TRACEPARENT and TRACESTATE was so simple and so intuitive that we realized the potential Otel has to provide observability on the whole CI/CD toolchain.

It would be great if we could just use those environment variables and, as @yurishkuro said, optionally support specifying alternate environment variables.

@rakyll
Copy link
Contributor

rakyll commented Sep 23, 2021

They seem no more or less specific than the respective HTTP headers. @rakyll do you have some specific downsides in mind from using these as is, in accordance with the W3C spec for http headers?

Headers are scoped to a request, environmental variables are scoped to the environment and has a larger radius. That's where my worry is coming from.

We are using TRACEPARENT right now in go-test-trace, rakyll/go-test-trace@2249361.

@nachoBonafonte
Copy link
Member

nachoBonafonte commented Sep 24, 2021

cc @nachoBonafonte @SergeyKanzhelev (from the EnvironmentContextPropagator implementation)

Hi, in opentelemetry-swift we just needed the functionality and as there was no approved spec for it we just followed the naming guidelines for other environment variables.
We thought that using a very generic name may clash with other tracing libraries or systems. Using something like W3C_TRACE_PARENT was also in the list of possible names, which is both specific and clarifies what it is about. Said that, we have no issues with changing the name to whatever is approved, and we will maintain compatibility for a time.

@cyrille-leclerc
Copy link
Member

FYI Concourse CI also uses the TRACEPARENT environment variable for context propagation.

I have just submitted a PR on the Concourse CI documentation to make it clear: concourse/docs#462

@yurishkuro
Copy link
Member

It seems we have a general consensus on using TRACEPARENT/TRACESTATE unadorned. I'd say it's safe to move to creating a PR to reflect this in the spec.

@v1v
Copy link

v1v commented Sep 26, 2021

The Opentelemetry callback plugin for Ansible also uses TRACEPARENT -> ansible-collections/community.general#3378

@deejgregor
Copy link

Is anyone working on a PR for this change to the spec (or planning to work on one)? If not, I'm happy to work on a PR. I'm very new to OpenTelemetry, so if there is anyone who would like to volunteer to answer some questions (if needed) and do an initial review, that would be great for this newb.

@wajika
Copy link

wajika commented Dec 24, 2021

FYI an ecosystem of CI/CD platforms and tools is rapidly growing around propagation of Otel trace contezxt using the TRACEPARENT and TRACECONTEXT environment variables and we think we are mature enough to propose a standardisation of trace context propagation using environment variables.

Existing integrations of CI/CD tools using OpenTelemetry trace context propagation through environment variables (TRACEPARENT and TRACECONTEXT):

cc @tobert , @v1v, @kuisathaverat

Jenkins Maven pipeline

Jenkins otel-cli pipeline

Ansible pipeline

Are there any relevant solutions for other compiled languages? We use dotnet and azure devops server CI pipeline.

@cyrille-leclerc
Copy link
Member

Thanks @wajika

Are there any relevant solutions for other compiled languages? We use dotnet and azure devops server CI pipeline.

We see initiatives to instrument unit tests (using the JUnit report format) and progress on BuildKite (here) but unfortunately I'm not aware of progress on the .Net / Azure ecosystem yet.

@stmlange
Copy link

stmlange commented Aug 29, 2023

Hello,
thanks for starting this discussion. Maybe it's just me, but I'm currently observing a discrepancy in the casing of how those environment variables are referenced.

For context I'm just using a W3CTraceContextPropagator (W3CTraceContext documentaion).

The Jenkins OpenTelemetry Plugin seems to provide the TRACECONTEXT / TRACESTATE variable (upper cased).

When I inject the current context into a map like so:

private static final TextMapSetter<Map<String, String>> TEXT_MAP_SETTER = Map::put;
....
final Context context = Context.current();
final Map<String, String> contextMap = new HashMap<>();
        GlobalOpenTelemetry.get()
                .getPropagators()
                .getTextMapPropagator()
                .inject(context, contextMap, TEXT_MAP_SETTER);

it results in a tracecontext / tracestate variable (lower cased).

Since environment variables are case sensitive and the W3CTraceContext documentaion refers to them as only lower cased: Should this be considered a bug in the Jenkins OpenTelemetry Plugin, or is perhaps worth to discuss in this more broader context, since it can affect any other Inter-Process Carrier Propagation?

@trask
Copy link
Member

trask commented Aug 29, 2023

hi @stmlange, my understanding is that W3CTraceContext is only about propagating over HTTP headers

my suggestion in your code above would be to have your TextMapSetter handle the upper casing when propagating over env vars

@deejgregor
Copy link

@yurishkuro said awhile back:

It seems we have a general consensus on using TRACEPARENT/TRACESTATE unadorned. I'd say it's safe to move to creating a PR to reflect this in the spec.

I'm curious if this would count as a "Yes" per this:

Follow the issue workflow and make sure the issue is accepted with a "Yes" response. If the response to the issue is not "Yes" then do not create a PR that implements the change since it will be rejected.

-- source

If not, what is the right path to get to a Yes (or No, etc.) for this issue?

@yurishkuro
Copy link
Member

@deejgregor yes, I believe we have the agreement, and judging by the number of different PRs linked to this issue there are plenty of implementations that already use this approach. Are you willing to create an OTEP PR? In it I would cover

  • motivation / use cases
  • current proposal
  • trade-offs
  • alternatives considered / reasons they were not chosen

@deejgregor
Copy link

deejgregor commented Sep 14, 2023

@yurishkuro:

Are you willing to create an OTEP PR?

Yup! Finally... after first aiming to do this in late 2021, heh. I've started researching, and I'll make sure the items you mentioned are included.

I've noticed one interesting thing so far: we aren't terribly explicit in the spec about how we propagate other than mentioning the propagation distributions, which mostly cover how to encode in HTTP (request) headers and not much else (in the case of B3, it does mention what to do for gRPC and JMS) . The trace semantic conventions for messaging have probably the most explicit mention of how to attach the context information, if only to say it isn't explicitly covered yet. ;-)

This document does not specify the exact mechanisms on how the creation context is attached/extracted to/from messages. Future versions of these conventions will give clear recommendations, following industry standards including, but not limited to Trace Context: AMQP protocol and Trace Context: MQTT protocol once those standards reach a stable state.

-- source

@krzko
Copy link

krzko commented Nov 16, 2023

I've also added TRACEPARENT injection to the shell in https://github.com/krzko/run-with-telemetry, which compliments the New component: Github Actions Event Receiver issue.

@mponnambalam
Copy link
Author

mponnambalam commented Dec 6, 2023 via email

@MSNev
Copy link
Contributor

MSNev commented Feb 5, 2024

Seems to be related: #3825, if you consider inter-process to be a "client"

@danielgblanco
Copy link
Contributor

danielgblanco commented Jun 3, 2024

This is related to open-telemetry/oteps#258. @adrielp just making you aware as you raised the OTEP PR

@danielgblanco
Copy link
Contributor

@carlosalberto @yurishkuro as I've seen you commented on the OTEP PR linked above, are any of you sponsoring this?

yangskyboxlabs added a commit to yangskyboxlabs/oteps that referenced this issue Aug 12, 2024
This OTEP attempts to address the usage of TRACEPARENT and TRACESTATE
environment variables as a means to propagate tracing context to
subprocesses.  Such usage is already implemented by various tools in the
wild, and has essentially become a de facto standard.

open-telemetry/opentelemetry-specification#740

This aims to replace an earlier OTEP 258 which appears to have stalled.
While the core idea presented is the same, this OTEP has a slightly
narrow scope, and do not derive from OTEP 258 or its unpublished
predecessor OTEP 241.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release:after-ga Not required before GA release, and not going to work on before GA spec:context Related to the specification/context directory triage:accepted:needs-sponsor Ready to be implemented, but does not yet have a specification sponsor
Projects
Status: Spec - Accepted
Development

No branches or pull requests