Overriding Decisions of the Built-in Trace Samplers via a "sample.priority" Like Span Attribute #3725
Replies: 13 comments 3 replies
-
Both this and #530 seem like good ideas. There are going to be varied use cases that require varied APIs:"
|
Beta Was this translation helpful? Give feedback.
-
Thank you. Turn tracing on/off or do no nothing : could be "sample.priority" 1/0/not specified Really those are the only two things that make sense for the default tracers. |
Beta Was this translation helpful? Give feedback.
-
The Sampling SIG meets weekly on Thursdays at 8am PST. There is already a precedent for the thing you're asking for and it would be fairly easy to see adding this to the spec. |
Beta Was this translation helpful? Give feedback.
-
I do not think that this proposal, as written, is a good idea. The issue with it is that is would allow the application to force tracing on or off when starting a Span, overriding any decisions made by the (default) samplers. However, sampling priority (specified by the application) makes sense as a hint. For example, TraceIdRatioBased sampler can use different thresholds depending on sampling priority. |
Beta Was this translation helpful? Give feedback.
-
I see your point about allowing to negatively influence head sampling decisions being a problem. These features could be gated behind another agent system property/environment variable, like "allow sampling rate/decision override" or added in new Samplers. Alternatively/Additionally, would it be acceptable if these hints were only used to turn ON sampling ? For my (and I imagine most) use case, it's perfectly to fine to use the default policies for the negative case, what I really need is selectively enabling sampling. I can imagine a lot of cases where an application/service would want to make sure to get traces of some operations, but handling that in tail samplers would cause too high a load on the tracing infrastructure (as this would require the AlwaysOn head sampler). |
Beta Was this translation helpful? Give feedback.
-
Even if we limit the hints to encourage/force sampling, such decisions can conflict with the sampling policy used by the actually configured sampler. Would you expect that AllwaysOffSampler makes exception and samples spans with the sampling.priority attribute set to 1? What about any rate limiting sampler whose mandate is to sample no more than N spans per time unit? I think it is better to develop a special sampler which looks for the sampling.priority attribute and makes final sampling decision if it is present, and delegates the decision to a fallback sampler if the attribute is missing. |
Beta Was this translation helpful? Give feedback.
-
Basically yes. That behaviour could be could be gated behind a sampler config parameter.
Custom samplers would be free to ignore either hint, or re-interpret it as needed.
That is totally fine by me. This modified proposal would be (just the logic, it would not necessarily be implemented in separate classes): This could be chained similarly to the parentBased logic, sou you could have these new combinations: Hinted > parentBased > Ratio/alwaysOn/AlwaysOn: ParentBased > Hinted > Ratio/alwaysOn/AlwaysOn: Hinted > Ratio/alwaysOn/AlwaysOn A slight problem is configuring it. |
Beta Was this translation helpful? Give feedback.
-
Yes, I think putting the new functionality in a single sampler (HintedSampler) is the cleanest solution. |
Beta Was this translation helpful? Give feedback.
-
@stoty can you take a look at #3205, this sounds like a very similar discussion (but I might miss the nuances between your proposal and what has been discussed there) |
Beta Was this translation helpful? Give feedback.
-
For reference: This works fine, though having to build, ship and specify a separate OTEL agent extension jar with Phoenix is not ideal. |
Beta Was this translation helpful? Give feedback.
-
Based on this dicussion, I feel that most people agree that something like this would be useful, but a desire for a very generic solution to solve all span pruning and customization needs gets in the way of simple fix for this specific feature gap. Would it help if I made a PR for the java SDK that adds the hinting feature based on the discussion above ? |
Beta Was this translation helpful? Give feedback.
-
Not in itself. open-telemetry/oteps#240 is about configuring and composing existing samplers, so it would help with defining how the hintable sampler is used for the sampling decision (first, last, only for root spans, only if matches somehting, etc), but but it does not seem to include the hintable sampler itself, neither does it define a standard hint/attribute for applications to configure it. AFAICT the hintable sampler itself would still have to be provided by the application separately, and the hint attribute would still be application specific. Going through the links in that document, now I think that RuleBasedRoutingSampler could ALMOST work for my use case, and could be configured to use for example "Parent Based Always" for hinted spans, and "Parent based TraceIdRatioBasedSampler" for non-hinted ones. The problem is that the only way to configure that is from Java code. However, I still feel that defining a (some) standard hint(s), and including a mechanism to enable it (them) in the main (non-contrib) agent would be a useful addition. |
Beta Was this translation helpful? Give feedback.
-
Short term, I could add a few SPI provider classes like https://github.com/open-telemetry/opentelemetry-java-contrib/blob/main/samplers/src/main/java/io/opentelemetry/contrib/sampler/LinksParentAlwaysOnSamplerProvider.java to implement some of the hinted sampler combinations I have mentioned above. |
Beta Was this translation helpful? Give feedback.
-
I propose adding a feature to the built in Sampler classes (AlwaysOn, AlwaysOff, TraceIdRatioBased) which would allow the sampling decision can be overriden by setting a Span attribute.
This would let users force tracing on or off when starting a Span, whitout writing and registering a custom Sampler.
span = tracer.spanBuilder("configurable").setAtttribute("sample.priority", 1).startSpan().
This is somehwat similar to #530 , but the objective here is not
pausing/terminating traces inside the chain, but overriding the sampling flag from code when creating the root span.
Background:
I am trying to migrate Apache Phoenix from HTrace to OpenTelemetry tracing.
Apache Phoenix is an SQL Layer over HBase. HBase has built-in support for OpenTelemetry.
It depends on opentelemetry-api, and by default expects the implementation to be provided by the OpenTelemetry java agent, and configured via its environment variables.
One of my goals is making the tracing configurable from the JDBC API and the statement text.
Either on the connection level by setting a propery, or adding a JDBC URL parameter to
https://docs.oracle.com/javase/8/docs/api/java/sql/DriverManager.html#getConnection-java.lang.String-java.util.Properties-
Or on a query by query basis with a query hint like
SELECT /*+ TRACE=ON */ * FROM TABLE
Doing this is harder than I expected.
The Trace API doesn't seem to allow setting sampling explicitly, only via the Sampler object, which is not available in opentelemetry-api, only in opentelemetry-sdk.
I have looked at using the "parentbased_always_on" built-in sampler, and simply not starting a Span from Phoenix if don't want to enable tracing. However, this wouldn't work, because the underlaying HBase client library would start sampled spans for each call into it anyway, which is not what I need.
So I need a way to start either sampling on non-sampling Spans based on information in Phoenix.
The way to do it seems to be writing a custom sampler, setting some attribute on the Span, and then checking this in the Sampler to make the decision.
I believe that this a viable soultion, but I'd need to add a compile-time dependency on opentelemetry-sdk for this, and I am afraid that adding my custom Sampler to an agent based setup may cause unforeseen complications.
Searching for solutions, I came across the sample.priority semantic attribute for Opentracing, which seems to do what I want, but I could not find a similar feature in OpenTelemetry
Looking at TraceIdRatioBasedSampler, it would be super easy to add an override based on a Span attribute, all it takes is a hash lookup and a comparison.
The only hard part is that ideally this should go into the specification, and be added for each language SDK, which is a lot of changes in a lot of languages.
Beta Was this translation helpful? Give feedback.
All reactions