Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[connector/spanmetrics] [processor/tailsampling] Tail sampling phases #30320

Closed

Conversation

tiithansen
Copy link

@tiithansen tiithansen commented Jan 6, 2024

Description: Tail sampling phases

Link to tracking Issue: Issue 30319

Testing: Adjusted spanmetricsconnector tests

Documentation: Documented new properties and their defaults in README.md

@tiithansen tiithansen force-pushed the tail_sampling_phases branch from ff304d0 to c5e6828 Compare January 13, 2024 13:44
component: connector/spanmetrics, processor/tail_sampling

# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
note: "Feature to support pre and post tail sampling phases, so other processors or connectors could see which traces will be exported."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need an update to reflect the latest changes?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes senses to split the PR into two, or provide two updates into the change log.

@@ -231,6 +245,14 @@ type Config struct {
// ExpectedNewTracesPerSec sets the expected number of new traces sending to the tail sampling processor
// per second. This helps with allocating data structures with closer to actual usage size.
ExpectedNewTracesPerSec uint64 `mapstructure:"expected_new_traces_per_sec"`
// Processor mode to configure tailsampling phase
// during presample phase traces are marked as sampled but not dropped
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer the words you used above, "pre-sampling stage".

Copy link
Member

@jpkrohling jpkrohling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this idea in general, but I have questions about the implementation.

  • Marking all spans with the sampling decision doesn't look right to me,
  • Using the filter processor to drop spans that are not sampled sounds like a misuse of that processor: we are marking all spans to be dropped, when we really mean that a trace should be dropped. While the implications are the same, I feel like the solution is more like a hack than pieces working correctly together.
  • There has to be a processor after this to remove the attribute from the spans

In any case, I would appreciate a review from the SIG Sampling folks, especially @jmacd and @kentquirk.

cumulative = "AGGREGATION_TEMPORALITY_CUMULATIVE"
delta = "AGGREGATION_TEMPORALITY_DELTA"
cumulative = "AGGREGATION_TEMPORALITY_CUMULATIVE"
dynamicExemplarAttributeKeyName = "tail_sampling.sampled"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the folks from SIG Sampling should have a say on this feature as a whole, but especially on this attribute, as it has the potential for becoming a semconv to be used by other components.

cc @jmacd, @kentquirk

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This attribute is configurable so I don't see a reason why to introduce it as a convention.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This attribute is configurable so I don't see a reason why to introduce it as a convention.

It's more if we introduce the notion of a sampling key, does it align with what other groups have envisioned or can it be used across more projects.

error_mode: ignore
traces:
span:
- attributes["tail_sampling.sampled"] == nil # Drop spans that are not sampled
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A sampling decision is trace-based, not span-based. Either a whole trace is dropped, or nothing from the trace is dropped. I understand this specific component (filter) does span filtering, but I believe adding this as the main example will confuse users.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say all samplings are span based. Tail-sampling itself does not guarantee that whole trace will be sampled or not. We can never put a strong guarantee that specific trace wont be sampled but we can guarantee that a specific span wont be sampled. Also there is no way to attach decision on a trace as all processors operate on a span level anyway.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we have a collection of spans in memory, all of the spans belonging to the same trace will either be stored or discarded. Sure, late arriving spans might get a different sampling decision by the tail-sampling processor, but given the right conditions, the decision is consistent.

Also there is no way to attach decision on a trace as all processors operate on a span level anyway.

If you are storing traces (collections of spans belonging to the same trace ID) in memory in the first phase, you can certainly add the sampling decision to the root span only, instead of all spans in that tree.

But I'll defer to other folks with opinions related to sampling (@jmacd and @kentquirk).

Copy link
Contributor

github-actions bot commented Feb 9, 2024

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added Stale and removed Stale labels Feb 9, 2024
Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 14 days.

Copy link
Contributor

@MovieStoreGuy MovieStoreGuy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes to the code are solid, however, this need more consideration on how this fits into the wider process of including sampling decisions across the project(s).

I believe @jpkrohling has already pinged the appropriate people.

cumulative = "AGGREGATION_TEMPORALITY_CUMULATIVE"
delta = "AGGREGATION_TEMPORALITY_DELTA"
cumulative = "AGGREGATION_TEMPORALITY_CUMULATIVE"
dynamicExemplarAttributeKeyName = "tail_sampling.sampled"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This attribute is configurable so I don't see a reason why to introduce it as a convention.

It's more if we introduce the notion of a sampling key, does it align with what other groups have envisioned or can it be used across more projects.

MaxPerDataPoint *int `mapstructure:"max_per_data_point"`
Enabled bool `mapstructure:"enabled"`
MaxPerDataPoint *int `mapstructure:"max_per_data_point"`
DynamicConfig *ExemplarDynamicExportConfig `mapstructure:"dynamic"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having this as a pointer doesn't provide much value due to the fact that ExamplarDynamicExportConfig has the field Enabled which will be false as the zero value.

component: connector/spanmetrics, processor/tail_sampling

# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
note: "Feature to support pre and post tail sampling phases, so other processors or connectors could see which traces will be exported."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes senses to split the PR into two, or provide two updates into the change log.

Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Mar 18, 2024
Copy link
Contributor

github-actions bot commented Apr 2, 2024

Closed as inactive. Feel free to reopen if this PR is still being worked on.

@github-actions github-actions bot closed this Apr 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants