Support decide sampling in all span lifecycle #4171

Open
hmleo opened this issue Aug 1, 2024 · 1 comment
Labels
spec:trace Related to the specification/trace directory triage:deciding:community-feedback Open to community discussion. If the community can provide sufficient reasoning, it may be accepted

Comments

@hmleo

hmleo commented Aug 1, 2024

In the current head sampling mechanism, the samplingResult is determined in Instrumenter.start(), for example by a parent-based or traceIdRatio sampler.
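
To make the point concrete, here is a minimal Python sketch of a head sampling decision taken at span start. It is an illustration only, not the code of any SDK: `ParentContext`, `trace_id_ratio_decision` and `should_sample` are hypothetical names, and the comparison of the low 64 bits of the trace ID against `ratio * 2^64` is an assumption about how a ratio-based sampler might be implemented.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: only the low 64 bits of the trace ID take part in the comparison.
TRACE_ID_LIMIT = (1 << 64) - 1


@dataclass(frozen=True)
class ParentContext:
    """Hypothetical stand-in for the parent span context."""
    trace_id: int
    sampled: bool


def trace_id_ratio_decision(trace_id: int, ratio: float) -> bool:
    """Sample when the low 64 bits of the trace ID fall below ratio * 2^64."""
    bound = round(ratio * (TRACE_ID_LIMIT + 1))
    return (trace_id & TRACE_ID_LIMIT) < bound


def should_sample(parent: Optional[ParentContext], trace_id: int, ratio: float) -> bool:
    """Parent-based decision: follow the parent if there is one, else use the ratio."""
    if parent is not None:
        return parent.sampled
    return trace_id_ratio_decision(trace_id, ratio)
```

Every input to `should_sample` is known when the span starts, so nothing that happens later in the span (an exception, an attribute) can influence the result.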

Once the samplingResult has been created, the instrumented target method may still throw an exception, and exceptions are usually the information we care about most. So I think it would be ideal if the sampling decision could be made at any point in the span lifecycle (e.g. create / set attributes / end).

For example, consider the two scenarios below:
1. Tail sampling: we get 100% of the spans in a trace, but if we want to sample all error spans, it requires 100% sampling.
2. New head sampling: the agent can decide sampling during the span lifecycle. This does not require 100% sampling. It may lose spans from other services in the trace, so we might get only 50% of the spans in a trace, but that 50% is the important part. I think that is acceptable.

So, could sampling decisions be supported throughout the span lifecycle?

@hmleo hmleo added the spec:trace Related to the specification/trace directory label Aug 1, 2024
@danielgblanco
Contributor

danielgblanco commented Aug 5, 2024

Thanks @hmleo. Trace sampling aims to result in complete traces (or at least complete subtraces). If the sampling decision is postponed to any point after span creation (e.g. when the instrumented method raises an exception) then there would be no guarantees for trace completeness. This would not only result in missing spans at the root of a sub-trace, but also within it. For instance, consider this case where we have parent-based samplers configured:

Span_A (not sampled)
|____Span_AA (originally not sampled as parent not sampled, then sampled after exception)
|         |____Span_AAA (not sampled as parent not sampled)
|         |________________Span_AAB (sampled after the parent's sampling decision changed)
|________________Span_AB (not sampled as parent not sampled)

If we allowed the sampling decision to be changed after span creation, this would be represented as:

Span_AA (orphan span)
|________________Span_AAB

As you can see, this would not only result in Span_AA missing its parent (which may be acceptable) but also in some information loss under Span_AA which may be critical to build a series of events, especially if the output from the operation represented as Span_AAA is used within the operation represented as Span_AAB. This scenario (missing leaf spans) would also be challenging to identify, as we wouldn't know what spans are missing.
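
The scenario above can be replayed with a small Python simulation. This is only a sketch under stated assumptions: `Span`, `start_span` and `simulate` are hypothetical helpers, each child copies its parent's sampling flag at creation time (parent-based sampling), and flipping `sampled` on `Span_AA` stands in for the proposed late decision.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Span:
    name: str
    parent: Optional[str]
    sampled: bool


def start_span(spans: Dict[str, Span], name: str, parent: Optional[str]) -> None:
    """Parent-based sampling: a child copies the parent's flag as it is right now."""
    sampled = spans[parent].sampled if parent is not None else False
    spans[name] = Span(name, parent, sampled)


def simulate() -> Tuple[List[str], List[str], List[str]]:
    spans: Dict[str, Span] = {}
    start_span(spans, "Span_A", None)          # root, not sampled
    start_span(spans, "Span_AA", "Span_A")     # not sampled, parent not sampled
    start_span(spans, "Span_AAA", "Span_AA")   # not sampled, parent not sampled
    spans["Span_AA"].sampled = True            # hypothetical late decision after an exception
    start_span(spans, "Span_AAB", "Span_AA")   # sampled, parent is now sampled
    start_span(spans, "Span_AB", "Span_A")     # not sampled, parent not sampled

    exported = [n for n, s in spans.items() if s.sampled]
    # An orphan is an exported span whose parent exists but was not exported.
    orphans = [
        n for n in exported
        if spans[n].parent is not None and not spans[spans[n].parent].sampled
    ]
    dropped = [n for n, s in spans.items() if not s.sampled]
    return exported, orphans, dropped
```

Calling `simulate()` yields `Span_AA` and `Span_AAB` as the only exported spans, with `Span_AA` as an orphan. Nothing in the exported data indicates that `Span_AAA` ever existed.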

Having completeness guarantees is important, and something that Consistent probability sampling aims to solve for head-based samplers (for a different use case). For your use case, the recommendation is to use out-of-process sampling mechanisms, like the tailsampling processor in the Collector. This would allow complete traces to be generated.
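
As a rough illustration of the idea behind out-of-process tail sampling, the Python sketch below buffers finished spans per trace and keeps a whole trace if any of its spans is an error. It is not the Collector's implementation: `FinishedSpan` and `tail_sample` are hypothetical names, and the sketch assumes every span of a trace has already arrived before the decision is made.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class FinishedSpan:
    trace_id: str
    name: str
    is_error: bool


def tail_sample(spans: Iterable[FinishedSpan]) -> List[FinishedSpan]:
    """Keep every span of any trace that contains at least one error span."""
    by_trace: Dict[str, List[FinishedSpan]] = defaultdict(list)
    for span in spans:
        by_trace[span.trace_id].append(span)

    kept: List[FinishedSpan] = []
    for trace_spans in by_trace.values():
        # The decision is per trace, so kept traces stay complete.
        if any(s.is_error for s in trace_spans):
            kept.extend(trace_spans)
    return kept
```

Because the decision is made per trace after all spans are known, an error in one span keeps its parents and siblings too. The trade-off is that every span has to be recorded and sent to the sampling component first.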

Let me know if this solves your needs. In any case, I'll leave this up for the community to give more feedback to be considered as part of this issue.

@danielgblanco danielgblanco added the triage:deciding:community-feedback Open to community discussion. If the community can provide sufficient reasoning, it may be accepted label Aug 5, 2024