[connector/spanmetrics] [processor/tailsampling] Tail sampling phases #30320

tiithansen · 2024-01-06T16:19:06Z

Description: Tail sampling phases

Link to tracking Issue: Issue 30319

Testing: Adjusted spanmetricsconnector tests

Documentation: Documented new properties and their defaults in README.md

portertech · 2024-01-22T19:36:53Z

.chloggen/tail_sampling_phases.yaml

+component: connector/spanmetrics, processor/tail_sampling
+
+# A brief description of the change.  Surround your text with quotes ("") if it needs to start with a backtick (`).
+note: "Feature to support pre and post tail sampling phases, so other processors or connectors could see which traces will be exported."


Does this need an update to reflect the latest changes?

I think it makes sense to split the PR into two, or to provide two separate entries in the changelog.

portertech · 2024-01-22T19:42:07Z

processor/tailsamplingprocessor/config.go

@@ -231,6 +245,14 @@ type Config struct {
 	// ExpectedNewTracesPerSec sets the expected number of new traces sending to the tail sampling processor
 	// per second. This helps with allocating data structures with closer to actual usage size.
 	ExpectedNewTracesPerSec uint64 `mapstructure:"expected_new_traces_per_sec"`
+	// Processor mode to configure tailsampling phase
+	// during presample phase traces are marked as sampled but not dropped


I prefer the words you used above, "pre-sampling stage".

jpkrohling

I like this idea in general, but I have questions about the implementation.

- Marking all spans with the sampling decision doesn't look right to me.
- Using the filter processor to drop spans that are not sampled sounds like a misuse of that processor: we are marking all spans to be dropped, when we really mean that a trace should be dropped. While the implications are the same, I feel like the solution is more like a hack than pieces working correctly together.
- There has to be a processor after this to remove the attribute from the spans.

In any case, I would appreciate a review from the SIG Sampling folks, especially @jmacd and @kentquirk.
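For readers following the objections above, here is a minimal sketch of what the second ("post") phase amounts to under the proposed design: spans that lack the marker attribute are dropped, and the marker is stripped from the spans that are kept. The `Span` type and the `postPhase` function are hypothetical simplifications for illustration; they are not the collector's pdata types or the PR's actual code. Only the attribute key `tail_sampling.sampled` comes from the diff.

```go
package main

import "fmt"

// Span is a hypothetical, simplified span used only for this sketch.
type Span struct {
	Name       string
	Attributes map[string]any
}

// sampledKey is the attribute key shown in the PR's diff.
const sampledKey = "tail_sampling.sampled"

// postPhase keeps only spans that carry the marker attribute and removes
// the marker from the spans it keeps, so it does not leak to exporters.
func postPhase(spans []Span) []Span {
	kept := make([]Span, 0, len(spans))
	for _, s := range spans {
		if _, ok := s.Attributes[sampledKey]; !ok {
			continue // not marked in the first phase: drop
		}
		delete(s.Attributes, sampledKey)
		kept = append(kept, s)
	}
	return kept
}

func main() {
	spans := []Span{
		{Name: "checkout", Attributes: map[string]any{sampledKey: true}},
		{Name: "healthcheck"}, // never marked, nil attribute map
	}
	for _, s := range postPhase(spans) {
		fmt.Println(s.Name, len(s.Attributes))
	}
}
```

The sketch makes the review concern concrete: the drop decision is evaluated once per span, even though the intent is to drop or keep a whole trace.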

jpkrohling · 2024-01-25T12:32:59Z

connector/spanmetricsconnector/config.go

-	cumulative = "AGGREGATION_TEMPORALITY_CUMULATIVE"
+	delta                           = "AGGREGATION_TEMPORALITY_DELTA"
+	cumulative                      = "AGGREGATION_TEMPORALITY_CUMULATIVE"
+	dynamicExemplarAttributeKeyName = "tail_sampling.sampled"


I think the folks from SIG Sampling should have a say on this feature as a whole, but especially on this attribute, as it has the potential for becoming a semconv to be used by other components.

cc @jmacd, @kentquirk

This attribute is configurable so I don't see a reason why to introduce it as a convention.

The question is more whether, if we introduce the notion of a sampling key, it aligns with what other groups have envisioned and whether it can be used across more projects.

jpkrohling · 2024-01-25T12:34:31Z

processor/tailsamplingprocessor/README.md

+    error_mode: ignore
+    traces:
+      span:
+      - attributes["tail_sampling.sampled"] == nil # Drop spans that are not sampled


A sampling decision is trace-based, not span-based. Either a whole trace is dropped, or nothing from the trace is dropped. I understand this specific component (filter) does span filtering, but I believe adding this as the main example will confuse users.

I would say all sampling is span-based. Tail sampling itself does not guarantee that a whole trace will or will not be sampled. We can never give a strong guarantee that a specific trace won't be sampled, but we can guarantee that a specific span won't be sampled. Also there is no way to attach decision on a trace as all processors operate on a span level anyway.

If we have a collection of spans in memory, all of the spans belonging to the same trace will either be stored or discarded. Sure, late arriving spans might get a different sampling decision by the tail-sampling processor, but given the right conditions, the decision is consistent.

Also there is no way to attach decision on a trace as all processors operate on a span level anyway.

If you are storing traces (collections of spans belonging to the same trace ID) in memory in the first phase, you can certainly add the sampling decision to the root span only, instead of all spans in that tree.

But I'll defer to other folks with opinions related to sampling (@jmacd and @kentquirk).
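The alternative suggested above (recording the decision on the root span only) could look roughly like the following sketch. It assumes the first phase already holds each trace's spans in memory and knows the per-trace decision. The `Span` type, the `markRoot` function, and the `sampled` map are hypothetical names introduced for illustration, not part of the PR.

```go
package main

import "fmt"

// Span is a hypothetical, simplified span; an empty ParentSpanID marks a root span.
type Span struct {
	TraceID      string
	SpanID       string
	ParentSpanID string
	Attributes   map[string]any
}

// sampledKey is the attribute key shown in the PR's diff.
const sampledKey = "tail_sampling.sampled"

// markRoot writes the sampling decision onto the root span of every sampled
// trace and leaves all other spans untouched. It returns the number of spans marked.
func markRoot(spans []*Span, sampled map[string]bool) int {
	marked := 0
	for _, s := range spans {
		if s.ParentSpanID != "" || !sampled[s.TraceID] {
			continue // not a root span, or the trace was not sampled
		}
		if s.Attributes == nil {
			s.Attributes = map[string]any{}
		}
		s.Attributes[sampledKey] = true
		marked++
	}
	return marked
}

func main() {
	spans := []*Span{
		{TraceID: "t1", SpanID: "a"},
		{TraceID: "t1", SpanID: "b", ParentSpanID: "a"},
		{TraceID: "t2", SpanID: "c"},
	}
	fmt.Println(markRoot(spans, map[string]bool{"t1": true}))
}
```

A trade-off of this variant is that any downstream component has to locate the root span before it can read the decision, which is harder when a trace's spans arrive in separate batches.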

github-actions · 2024-02-09T05:19:08Z

This PR was marked stale due to lack of activity. It will be closed in 14 days.

github-actions · 2024-02-29T05:19:44Z

This PR was marked stale due to lack of activity. It will be closed in 14 days.

MovieStoreGuy

The changes to the code are solid; however, this needs more consideration of how it fits into the wider process of including sampling decisions across the project(s).

I believe @jpkrohling has already pinged the appropriate people.

MovieStoreGuy · 2024-03-04T01:08:07Z

connector/spanmetricsconnector/config.go

-	cumulative = "AGGREGATION_TEMPORALITY_CUMULATIVE"
+	delta                           = "AGGREGATION_TEMPORALITY_DELTA"
+	cumulative                      = "AGGREGATION_TEMPORALITY_CUMULATIVE"
+	dynamicExemplarAttributeKeyName = "tail_sampling.sampled"


This attribute is configurable so I don't see a reason why to introduce it as a convention.

The question is more whether, if we introduce the notion of a sampling key, it aligns with what other groups have envisioned and whether it can be used across more projects.

MovieStoreGuy · 2024-03-04T01:10:04Z

connector/spanmetricsconnector/config.go

-	MaxPerDataPoint *int `mapstructure:"max_per_data_point"`
+	Enabled         bool                         `mapstructure:"enabled"`
+	MaxPerDataPoint *int                         `mapstructure:"max_per_data_point"`
+	DynamicConfig   *ExemplarDynamicExportConfig `mapstructure:"dynamic"`


Having this as a pointer doesn't provide much value, because ExemplarDynamicExportConfig has the field Enabled, which is false as the zero value.
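A small sketch of the point being made, with `ExemplarDynamicExportConfig` reduced to the single `Enabled` field mentioned in the comment. The wrapper types `withPointer` and `withValue` are hypothetical and exist only to compare the two shapes; the real config struct has more fields.

```go
package main

import "fmt"

// ExemplarDynamicExportConfig is reduced to the one field the review mentions.
type ExemplarDynamicExportConfig struct {
	Enabled bool
}

// withPointer mirrors the shape in the diff: the sub-config is optional via nil.
type withPointer struct {
	DynamicConfig *ExemplarDynamicExportConfig
}

// withValue holds the sub-config by value; its zero value already means "disabled".
type withValue struct {
	DynamicConfig ExemplarDynamicExportConfig
}

// dynamicEnabled needs a nil check in addition to the flag check.
func (c withPointer) dynamicEnabled() bool {
	return c.DynamicConfig != nil && c.DynamicConfig.Enabled
}

// dynamicEnabled needs only the flag check.
func (c withValue) dynamicEnabled() bool {
	return c.DynamicConfig.Enabled
}

func main() {
	// Both unset configurations report the feature as disabled.
	fmt.Println(withPointer{}.dynamicEnabled(), withValue{}.dynamicEnabled())
}
```

A pointer would still be useful if the code needed to tell "section omitted" apart from "section present but disabled"; the comment's argument is that the `Enabled` flag already covers the only case that matters here.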

MovieStoreGuy · 2024-03-04T01:30:24Z

.chloggen/tail_sampling_phases.yaml

+component: connector/spanmetrics, processor/tail_sampling
+
+# A brief description of the change.  Surround your text with quotes ("") if it needs to start with a backtick (`).
+note: "Feature to support pre and post tail sampling phases, so other processors or connectors could see which traces will be exported."


I think it makes sense to split the PR into two, or to provide two separate entries in the changelog.

github-actions · 2024-03-18T05:19:59Z

This PR was marked stale due to lack of activity. It will be closed in 14 days.

github-actions · 2024-04-02T05:18:58Z

Closed as inactive. Feel free to reopen if this PR is still being worked on.

tiithansen requested a review from jpkrohling as a code owner January 6, 2024 16:19

tiithansen requested a review from a team January 6, 2024 16:19

github-actions bot assigned MovieStoreGuy Jan 6, 2024

github-actions bot added connector/spanmetrics processor/tailsampling Tail sampling processor labels Jan 6, 2024

tiithansen added 4 commits January 13, 2024 15:28

feat: Implement pre & post processing modes so proccesors between the…

29075a7

…m could leverage the information about whether the trace is going to be sampled or not

feat: Implement feature to detect if exemplar should be stored based …

06bc35d

…on attribute value

Document new configuration properties and add changelog

23e1c92

Change feature according to discussion from the linked issue

c5e6828

tiithansen force-pushed the tail_sampling_phases branch from ff304d0 to c5e6828 Compare January 13, 2024 13:44

portertech reviewed Jan 22, 2024

jpkrohling reviewed Jan 25, 2024

github-actions bot added Stale and removed Stale labels Feb 9, 2024

github-actions bot added the Stale label Feb 29, 2024

MovieStoreGuy removed the Stale label Mar 4, 2024

MovieStoreGuy reviewed Mar 4, 2024

github-actions bot added the Stale label Mar 18, 2024

github-actions bot closed this Apr 2, 2024
