Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Tracer Components #10

Closed
discostu105 opened this issue Apr 29, 2019 · 29 comments
Closed

Proposal: Tracer Components #10

discostu105 opened this issue Apr 29, 2019 · 29 comments
Assignees
Milestone

Comments

@discostu105
Copy link
Member

Proposal: Instead of a global tracer singleton, every library needs to create it's own tracer, which also has a proper name that identifies the library/component that uses the tracer.

Motivation: "Configurability of library/component instrumentation"

  • Identify the library/component that created a span
  • Be able to turn the instrumentation of a certain library/component on/off. This might be necessary if the instrumentation
    • causes too much overhead
    • is buggy (e.g. does not call span.end in every case)
    • avoid "double-tracing". E.g. a tracing vendor (e.g. Dynatrace) might already have built-in auto-instrumentation for a certain library. In this case the vendor might want to decide (automatically or config-based) which of the implementations should be enabled/disabled.

Technically, this could be achieved similar to how logging frameworks deal with Loggers and LoggerFactories (log4j).

Tracer tracer = OpenConsensus.Tracing.getTracer("io.opensomething.contrib.mongo");

Every span would contain the information about the Tracer that it was created with. Based on that a span can be discarded or not.

Not sure how a naming convention could look like. Not sure if "component" is the right word, as it has already been used in the past as attribute name. At Dynatrace we call that a "Sensor". Not sure about granularity of "Tracers". Should every class have one? Should Tracing act as a registry (cache already created instances) or as a factory (just create new ones at every call)?

This is kind of related to the GlobalTracer discussion (#107).

@pavolloffay
Copy link
Member

Some more thoughts, should every tracer in the registry support full configuration? For instance different sampling, reporting mechanism. Perf-wise I would prefer a cached registry and also a single instrumentation might use multiple instrumentation points - e.g. servlet filer and jaxrs interceptors.

This could be an iteration after #107 - if we allow a globally scoped tracer.

@carlosalberto
Copy link
Contributor

Something I'd be quite interested to know, is how common an application needs such specific scenario. I remembering seeing this pattern in HTrace, but haven't seen it much in other places. Depending on that, I'd be up to putting this in out contrib/ artifact (as opposed to put it in the main API side).

@yurishkuro
Copy link
Member

Love the use case. However, I'm not sure tracer-per-component is necessary to address that. For example, if instrumentation follows the OpenTracing conventions and sets the "component" tag on span builder, then the tracer can compare it with configuration and decide to return a noop span with span context equal to the parent.

@tedsuo
Copy link
Contributor

tedsuo commented Apr 29, 2019

We currently name components in OpenTracing via a tag:

Span tag name Type Notes and examples
component string The software package, framework, library, or module that generated the associated Span. E.g., "grpc", "django", "JDBI".

I agree there should be a component scope, and it should be handled more directly – the tag approach was due to backwards compatibility, etc, and is not the best.

This concept is more related to scope management and span creation; I don't feel it is related to the global tracer, or tracer instantiation. Tracers are a mechanism for recording data and exporting it - ideally, there should be only one. We should not conflate instantiating the export mechanism with name-spacing and scoping the current operations.

Likewise, I would prefer not to conflate this kind of namespacing to a particular language feature, such as classes or functions – it should be a cross-language concept.

This still leaves the question open as to where to put this information. I suggest that this information is part of the Span or SpanContext, and should be constructed there. Spans inherit their parents component unless it is overridden during construction.

@bogdandrutu
Copy link
Member

Am I understanding correctly that we want to have a notion of component at the SpanBuilder time is ok, I think this achieves the end result of having Spans grouped by the component.

@tedsuo I don't think we should propagate this between spans, maybe require instrumentation to set this when they create new Spans.

@yurishkuro
Copy link
Member

I can see pros and cons in inheriting component from parent span, but I think the con is bigger: if another layer doesn't set component then inheriting it is unexpected behavior. So I'd rather have it set on every span, which isn't too onerous for middleware for a given library/framework.

@bogdandrutu
Copy link
Member

Also a contrib TracerWithComponent implements Tracer that simply set the component when a SpanBuilder is return is easy to offer.

@tedsuo
Copy link
Contributor

tedsuo commented May 1, 2019

@yurishkuro I'm not actually sure what the current inheritance behavior is with the component tag in OT.

@bogdandrutu that might be a good way to model the behavior @discostu105 was asking for. We could move it into core if it looks like a good pattern. Ideally, I would like "best practices" to not require contrib projects.

@carlosalberto
Copy link
Contributor

I agree with @yurishkuro on not inheriting component from the parent Span. Doing so could get tricky eventually, and I'd stick with simplicity for now.

@tedsuo
Copy link
Contributor

tedsuo commented May 1, 2019

This is a bit broader, but relevant. Please bear with me:

In general, each trace event can be modeled as if it is occurring in a nested namespace. So far, I have seen a lot of interest in the following namespaces:

Trace::Service::Component::Operation

Currently state of affairs (correct me if I am wrong):

  • The trace is modeled as a SpanContext, with an ID but no name or attributes.
  • The service is modeled as a Tracer, with attributes but no name or ID.
  • The component is proposed to be modeled as a Span property, with a name but no ID or attributes.
  • The operation is modeled as a Span, with a name, ID, and attributes.

In the long run, I suspect we want something like Name, ID, and Attributes for each namespace.

In the short run, I am hearing a lot of interest in attributes for components. Only having them on the span and the service is proving cumbersome for real world usage.

So perhaps we should think deeper about this subject.

@austinlparker
Copy link
Member

Would some sort of ComponentScope that can be passed into the span builder help here?

@yurishkuro
Copy link
Member

I am not sure Name/ID/Attributes applies to all 4 entities:

  • Name for a trace is quite vague
  • ID for service/component is counterproductive (high cardinality)
  • Attributes make sense across all four entities (realizing that operation == span), however semantically different: static for service/component, dynamic for trace/span.

service and component sound like two different levels of resource

@tedsuo
Copy link
Contributor

tedsuo commented May 2, 2019

Thanks, I agree @yurishkuro, it's not clear that this needs to be fully normalized. But we have some holes, especially around attributes and the component scope.

@pavolloffay
Copy link
Member

Instead of

Tracer tracer = OpenConsensus.Tracing.getTracer("io.opensomething.contrib.mongo"); we could simply have one global instance

Tracer tracer = OpenConsensus.Tracing.getTracer(); hen tracer could be converted something like a builder which would modify some of the configurations - tags, exporter, sampling.

@jmacd
Copy link
Contributor

jmacd commented May 7, 2019

I think developers want an API that lets them write concise instrumentation code, so that the weight of instrumentation doesn't overwhelm the code itself. The question is how individual modules of code will (concisely) declare their component in the source, at that point where spans are built.

This suggests making it possible to build spans and supply very few "component" arguments, potentially just one argument at the call site. In my own code, I've accomplished this by bundling a set of attributes, including the component concept, into a single object that's passed to the span builder. Importantly, this includes more than one attribute potentially.

The pattern generally involves some heavy-weight resource that is loaded into memory. It's part of a component, sure, but it also includes a number of other attributes that specifically describe an instance of the component being invoked at the point where a span is created. The code could read like:

class MyClass {
  TraceComponent traceComponent;
  
  MyClass() {
    traceComponent = OpenTelemetry.Tracing.getTracer().componentBuilder("this_is_my_class").
                                                withAttribute("key1", "value1").
                                                withAttribute("key2", "value2");
    // ...
  }

  void someMethod(String arg) {
     Span span = this.traceComponent.spanBuilder("someMethod").withAttribute("arg", arg).startSpan()
     // ...
  }
};

The component builder would be a subset of the span builder API, merely allowing attribute names and setting the standard component attribute, for use in building spans elsewhere in the same module of code.

@pavolloffay
Copy link
Member

It is an interesting pattern, I think the instrumentation could just store SpanBuilder and reuse it on repetitive invocations (not sure if this API allow it). Similar things people do with request builders. it makes sense to pre-bake it as the attributes do not change.

I liked OpenConsensus.Tracing.getTracer("io.opensomething.contrib.mongo");because it forces to supply the component at the beginning. The tag on a span makes it optional.

@tedsuo
Copy link
Contributor

tedsuo commented May 8, 2019

Thanks @jmacd. I discussed this with @pavolloffay, @bogdandrutu, and @carlosalberto.

An updated proposal is as follows:

class MyLibrary {
  Tracer tracer;
  
  MyLibrary() {
    tracer = OpenTelemetry.getTracer().toBuilder("this_is_my_class").
                                      setResource(resource).
                                      build();
    // ...
  }

  void someMethod(String arg) {
     Span span = this.tracer.spanBuilder("someMethod").setAttribute("arg", arg).startSpan()
     // ...
  }
};

Main Notes:

  • OpenTelemetry.getTracer() returns a Tracer. This Tracer is the base "Service Tracer" which was instantiated via a service loader, and contains the name and resources for the entire service.
  • OpenTelemetry.getTracer().toBuilder() returns a TracerBuilder which allows a component (framework, mysql client, RPC library) to add a component name, and add additional resources which are associated only with that component, and not the rest of the service. This is considered a "Component Tracer."

Additional Notes:

  • We don't think we need an intermediary OpenTelemetry.Tracing class. The OpenTelemetry class can directly contain tracing, metrics, and tag methods.
  • We'd like to remove the setResource method currently available on Tracer. Resources should only be set on tracer creation.

@tedsuo
Copy link
Contributor

tedsuo commented May 8, 2019

(BTW I don't know why Github keeps putting @pavolloffay's comment at the bottom of this thread, but it is out of order.)

@bogdandrutu
Copy link
Member

@tedsuo see the attachment :))), @pavolloffay came from the future :)))
Pavol_Is_In_The_Future

@jmacd
Copy link
Contributor

jmacd commented May 8, 2019

I like it, @tedsuo. I always focus on concision, so I wonder if a ResourceBuilder might help with readability. For example, to construct a resource in the same line of code, you might like to write:

tracer = OpenTelemetry.getTracer().toBuilder("this_is_my_class").
                                                                setResource(Resource.ofEntries(
                                                                       entry("key1", "value1"),
                                                                       entry("key2", "value2")
                                                               )).build();

@bogdandrutu bogdandrutu transferred this issue from open-telemetry/opentelemetry-java May 13, 2019
@SergeyKanzhelev SergeyKanzhelev added this to the API complete milestone Jun 3, 2019
@bogdandrutu
Copy link
Member

@jmacd that is not really a ResourceBuilder but I think offering a simpler API to create Resource (or even a real builder) should be on our list.

@alramlechner
Copy link

alramlechner commented Jun 7, 2019

From our (Dynatrace) perspective, the component should not be optional. OTel users should be enforced to provide a component name.
I would change the sample posted by @tedsuo therefore in to following way:

tracer = OpenTelemetry.getTracer("io.opensomething.contrib.mongo").toBuilder(). setResource(resource). build();

of course: the whole builder is optional from component perspective.

@SergeyKanzhelev SergeyKanzhelev modified the milestones: API proposal done, API revision: 07-2019 Jun 11, 2019
@discostu105
Copy link
Member Author

discostu105 commented Jun 14, 2019

I think there was a bit of a mixup of my intents of this "component" (sorry for late reply). Maybe the name "component" is misleading. I actually did not mean to identify "the component being traced", but rather "the tracing-implementation". That's why the existing "component" tag from OpenTracing is not really what I meant here.

So, IMHO this should not be conflated with "resources", as these are in my understanding, rather identifying something like a host/process/service/library.

Knowing the instrumentation-component for every span would allow our agent to determine which instrumentation-components are active in an application at runtime. Based on that we could, at runtime, decide which instrumentation-components we would actually use. This could be configuration based (user opt-in per instrumentation-component) or we could disable specific instrumentations (e.g. faulty/bad instrumentation) based on such component information.

Another use-case is, we could identify certain (well known) instrumentation-components that would overlap with our own already built-in auto-instrumentation and auto-blacklist them (or let the user decide which instrumentation to use) to avoid "double-tracing".

Such concerns have also been brought up recently in the Auto-Instrumentation SIG (see https://docs.google.com/document/d/1sovSQIGdxXtsauxUNp4qUMEIJZzObdukzPT52eyPCHM/edit#heading=h.dcogdcm13u12)

A way to discover and avoid potential conflicts/overlap/redundancy between explicit whitebox instrumentation and blackbox instrumentation of the same libraries/packages

  • That is, if a developer has already added the “official” OpenTelemetry plugin for, say, gRPC, then when the blackbox instrumentation effort adds gRPC support, it should not “double-instrument” it and create a mess of extra spans/etc

@AloisReitbauer AloisReitbauer self-assigned this Jul 29, 2019
@AloisReitbauer
Copy link

Trying to get the Gist of the discussion again. We need to know two additional things about each span that gets generated:

  • component representing the traced component like grpc, jdbc, ...
  • instrumentation-provider this represent the actual instrumentation or contrib library and is needed to turn off instrumentation in case there are problems with it.

In order to get this information into a span we have three options:

  • pass it to a Span at every call. This is error-prone and chatty
  • pass it to the span builder. This would help. However, a SpanBuilder has the same lifetime as a span so this would not really help
  • pass it to the component that creates a SpanBuilder. This would mean passing it to the Tracer. this sounds like the most reasonable approach. However, it would not work today because tracers are references to a global tracer object.

We could overcome this problem by creating a TracerFactory or alternatively a TracerBuilder as mentioned above. This component would be configured with the required parameters and ensure they are passed to the SpanBuilder and eventually the Span.

Does this capture the problem and proposed solutions properly?

@tedsuo
Copy link
Contributor

tedsuo commented Aug 6, 2019

@AloisReitbauer this does capture the issue well. Modeling a service as a set of resources, and then each component as a nested set of resources feels correct.

The only complication is how the Tracer object relates context propagation. Both traces and metrics will want to be labelled with this information. So there's still some work to be done, in general, to cleanly separate the context propagation layer from the observability layer.

@lizthegrey
Copy link
Member

+1 that this would be extremely useful to us over at Honeycomb - cc @maplebed

We sometimes like to write to more than one distinct dataset or even telemetry processing environment from some of our internal tools.

@discostu105
Copy link
Member Author

@z1c0 wrote an RFC that covers this issue: open-telemetry/oteps#16

@Oberon00
Copy link
Member

Oberon00 commented Oct 1, 2019

I think this can be closed once #264 is merged.

@tedsuo
Copy link
Contributor

tedsuo commented Oct 4, 2019

Closing this now. Thank you all, for taking this over the finish line.

@tedsuo tedsuo closed this as completed Oct 4, 2019
TuckTuckFloof pushed a commit to TuckTuckFloof/opentelemetry-specification that referenced this issue Oct 15, 2020
rockb1017 pushed a commit to rockb1017/opentelemetry-specification that referenced this issue Nov 18, 2021
- Move CoC and contributing to templates directory
- Create security template
- Modify repository to address feedback

Addresses open-telemetry#9 open-telemetry#10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests