From e2c3ef69d20c58bdab0d374f194dfdadba573917 Mon Sep 17 00:00:00 2001 From: "J. Kalyana Sundaram" Date: Tue, 4 Jun 2024 17:57:34 -0700 Subject: [PATCH 1/8] Update 'r' value generation based on randomness flag + editorial changes --- .../0235-sampling-threshold-in-trace-state.md | 74 +++++++++++-------- 1 file changed, 42 insertions(+), 32 deletions(-) diff --git a/text/trace/0235-sampling-threshold-in-trace-state.md b/text/trace/0235-sampling-threshold-in-trace-state.md index 014744a73..776d849c7 100644 --- a/text/trace/0235-sampling-threshold-in-trace-state.md +++ b/text/trace/0235-sampling-threshold-in-trace-state.md @@ -1,45 +1,53 @@ # Sampling Threshold Propagation in TraceState +## Abstract + +Sampling is an important lever to reduce the costs associated with collecting and processing telemetry data. It enables you to choose a representative set of items from an overall population. + +There are two key aspects for sampling of tracing data. The first is that sampling decisions can be made independently for *each* span in a trace. The second is that sampling decisions can be made at multiple points in the telemetry pipeline. For example, the sampling decision for a span at span creation time could have been to **keep** that span, while the downstream sampling decision for the *same* span at a later stage (say in an external process in the data collection pipeline) could be to **drop** it. + +For both the above aspects, we want sampling decisions to be made in a **consistent** manner so that we can effectively reason about a trace. This OTEP describes a mechanism to achieve such consistent sampling decisions using a mechanism called **Consistent Probability Sampling**. As part of this, it proposes a mechanism for a common random value (R) and a rejection threshold (T). R is used to achieve consistent decisions, while T corresponds to a participant's sampling rate. This proposal describes how these values should be propagated and how participants should use them to make sampling decisions. + +This mechanism will enable creating a new set of samplers (Consistent Probability Samplers) that will enable trace participants to choose their own sampling rates, while still achieving consistent sampling decisions. This OTEP ensures that such samplers will interoperate with existing (non consistent probability) samplers. + ## Motivation -Sampling is a broad topic; here it refers to the independent decisions made at points in a distributed tracing system of whether to collect a span or not. Multiple sampling decisions can be made before a span is finally consumed. When sampling is to be performed at multiple points in the process, the only way to reason about it effectively is to make sure that the sampling decisions are **consistent**. -In this context, consistency means that a positive sampling decision made for a particular span with probability p1 implies a positive sampling decision for any span belonging to the same trace, if it is made with probability p2 >= p1. +Customers want to express arbitrary sampling probabilities such as 1%, 10%, and 75%. However, the existing experimental [specification for probability sampling using TraceState](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/tracestate-probability-sampling.md) optimizes for powers of two probabilities. It supports non power of two sampling only using interpolation between powers of two. This approach is unnecessarily restrictive. 
Hence, we need an updated mechanism to support this capability. + +Further, there is a need for consistent sampling in the collection path (outside of the head-based sampling paths). To achieve consistent sampling decisions, the previous experimental spec required using a custom source of randomness (`r-value`). However, in such downstream sampling decisions, it can be expensive to reference this custom value from the tracestate attribute in every span. To improve this, this proposal makes use of the inherent randomness in the traceID as a less-expensive solution to achieve the same result. However, the new randomness flag introduced in the W3C TraceContext Level 2 specification can potentially be reset by trace participants on the Level 1 W3C TraceContext specification. Hence, there is need to still reference tracestate to check for the non existence of this custom random value before relying on traceid as the source of randomness. ## Explanation +Let's start with the definition for a consistent sampling decision. Consistency means that a positive sampling decision made for a particular span with probability p1 implies a positive sampling decision for any span belonging to the same trace if it is made with probability p2 >= p1. + +This proposal introduces a new value with the key `th` as an alternative to the `p` value in the previous specification. The `p` value is limited to powers of two, while the `th` value in this proposal supports a large range of values. -The existing, experimental [specification for probability sampling using TraceState](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/tracestate-probability-sampling.md) is limited to powers-of-two probabilities, and is designed to work without making assumptions about TraceID randomness. -This system can only achieve non-power-of-two sampling using interpolation between powers of two, which is unnecessarily restrictive. -In existing sampling systems, sampling probabilities like 1%, 10%, and 75% are common, and it should be possible to express these without interpolation. -There is also a need for consistent sampling in the collection path (outside of the head-sampling paths) and using inherent randomness in the traceID is a less-expensive solution than referencing a custom `r-value` from the tracestate in every span. -This proposal introduces a new value with the key `th` as an alternative to the `p` value in the previous specification. -The `p` value is limited to powers of two, while the `th` value in this proposal supports a large range of values. -This proposal allows for the continued expression of randomness using `r-value` as specified there using the key `r`. -To distinguish the cases, this proposal uses the key `rv`. +This proposal allows for the continued expression of randomness using `r-value` as specified there using the key `r`. To distinguish the cases, this proposal uses the key `rv`. In the general case, in order to make consistent sampling decisions across the entire path of the trace, two values MUST be present in the `SpanContext`: 1. A _random_ (or pseudo-random) 56-bit value, called `R` below. -2. A 56-bit _rejection threshold_ (or just "threshold") as expressed in the TraceState, called `T` below. `T` represents the maximum threshold that was applied in all previous consistent sampling stages. If the current sampling stage applies a greater-valued threshold than any stage before, it MUST update (increase) the threshold correspondingly. +2. 
A 56-bit _rejection threshold_ (or just "threshold") as expressed in the TraceState, called `T` below. `T` represents the maximum threshold that was applied in all previous consistent sampling stages. If the current sampling stage applies a greater threshold value than any stage before, it MUST update (increase) the threshold correspondingly. -One way to think about _rejection threshold_ is that is the number of spans that would be discarded out of 2^56 considered spans. This means that spans where `R >= T` will be sampled. +One way to think about _rejection threshold_ is that it is the number of spans that would be discarded out of 2^56 considered spans. This means that spans where `R >= T` will be kept. -Here is an example involving three participants `A`, `B`, and `C`: +Here is an example involving three participating operations `A`, `B`, and `C`: `A` -> `B` -> `C` -where -> indicates a parent -> child relationship. +where -> indicates a parent to child relationship. `A` uses consistent probability sampling with a sampling probability of 0.25 (this corresponds to a rejection probability of .75). `B` uses consistent probability sampling with a sampling probability of 0.5. `C` uses a parent-based sampler. -When `A` samples a span, its outgoing traceparent will have the 'sampled' flag SET and the 'th' in its outgoing tracestate will be set to `0xc0_0000_0000_0000`. -When `A` does not sample a span, its outgoing traceparent will have the 'sampled' flag UNSET but the 'th' in its outgoing tracestate will still be set to `0xc0_0000_0000_0000`. -When B samples a span, its outgoing traceparent will have the 'sampled' flag SET and the 'th' in its outgoing tracestate will be set to `0x80_0000_0000_0000`. +When the sampling decision for `A` is to `keep` the span, its outgoing traceparent will have the 'sampled' flag SET and the 'th' in its outgoing tracestate will be set to `0xc0_0000_0000_0000`. +When the sampling decision for `A` is to `drop` the span, its outgoing traceparent will have the 'sampled' flag UNSET but the 'th' in its outgoing tracestate will still be set to `0xc0_0000_0000_0000`. +When the sampling decision for `A` is to `keep` the span, its outgoing traceparent will have the 'sampled' flag SET and the 'th' in its outgoing tracestate will be set to `0x80_0000_0000_0000`. C (being a parent based sampler) samples a span purely based on its parent (B in this case), it will use the sampled flag to make the decision. Its outgoing 'th' value will continue to reflect what it got from B (`0x80_0000_0000_0000`), and this is useful to understand its adjusted count. This design requires that as a given span progresses along its collection path, `th` is non-decreasing (and, in particular, must be increased at stages that apply lower sampling probabilities). -It does not, however, restrict a span's initial `th` in any way (e.g., relating it to that of its parent, if it has one). +It does not, however, restrict a span's initial `th` in any way. If a parent based consistent sampler is used, a span's initial `th` would be the same as its parent's `th` value, else it would be a new value based on the chosen sampling rate. + It is acceptable for B to have a lesser initial `th` than A has. It would not be ok if some later-stage sampler decreased A's `th`. The system has the following invariant: @@ -51,17 +59,19 @@ The sampling decision is propagated with the following algorithm: * If the `th` key is not specified, this implies that non-probabilistic sampling may be taking place. 
* Else derive `T` by parsing the `th` key as a hex value as described below. * If `T` is 0, Always Sample. -* Compare the 56 bits of `T` with the 56 bits of `R`. If `T > R`, then do not sample. +* Compare the 56 bits of `T` with the 56 bits of `R`. If `T > R`, then set the sampling decision to `drop`. The `R` value MUST be derived as follows: * If the key `rv` is present in the Tracestate header, then `R = rv`. -* Else if the Random Trace ID Flag is `true` in the traceparent header, then `R` is the lowest-order 56 bits of the trace-id. -* Else `R` MUST be generated as a random value in the range `[0, (2**56)-1]` and added to the Tracestate header with key `rv`. +* Else `R` is the lowest-order 56 bits of the trace-id. + +At the root span, the `R` value must be generated as follows: + +* If the new random flag in the `traceparent` is set, then there is no action required. In this case, the tracestate header will not have the `rv` key. For more info on this new flag, see [the W3C trace context specification](https://w3c.github.io/trace-context/#trace-id). +* If not, `R` MUST be generated as a random value in the range `[0, (2**56)-1]` and added to the Tracestate header with key `rv`. -The preferred way to propagate the `R` value is as the lowest 56 bits of the trace-id. -If these bits are in fact random, the `random` trace-flag SHOULD be set as specified in [the W3C trace context specification](https://w3c.github.io/trace-context/#trace-id). -There are circumstances where trace-id randomness is inadequate (for example, sampling a group of traces together); in these cases, an `rv` value is required. +Although less common, there are circumstances where trace-id randomness is inadequate (for example, when sampling a group of traces together); in these cases, an `rv` value is required. The value of the `rv` and `th` keys MUST be expressed as up to 14 hexadecimal digits from the set `[0-9a-f]`. For `th` keys only, trailing zeros (but not leading zeros) may be omitted. `rv` keys MUST always be exactly 14 hex digits. @@ -82,21 +92,21 @@ Sampling Decisions MUST be propagated by setting the value of the `th` key in th ## Initializing and updating T and R values -There are two categories of sampler: +There are two categories of samplers: - **Head samplers:** Implementations of [`Sampler`](https://github.com/open-telemetry/opentelemetry-specification/blob/v1.29.0/specification/trace/sdk.md#sampler), called by a `Tracer` during span creation. -- **Downstream samplers:** Any component that, given an ended Span, decides whether to drop or forward ("sample") it on to the next component in the system. Also known as "collection-path samplers" or "sampling processors". _Tail samplers_ are a special class of downstream samplers that buffer the spans in a trace and select a sampling probability for the trace as a whole using data from any span in the buffered trace. +- **Downstream samplers:** Any component that, given an ended Span, decides whether to *drop* or *keep* it by forwarding it to the next component in the system. This category is also known as "collection path samplers" or "sampling processors". _Tail samplers_ are a special class of downstream samplers that buffer spans of a trace and make a sampling decision for the trace as a whole using data from any span in the buffered trace. This section defines behavior for each kind of sampler. 
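To make the shared mechanics concrete before describing each category, here is a minimal Go sketch (illustrative only; the helper names are assumptions, not an API defined by this proposal) of the decision rule both kinds of sampler apply: parse the `th` value into `T`, then keep the span exactly when `R >= T` (equivalently, drop it when `T > R`). The `R` input is obtained from `rv` or the low 56 bits of the trace-id as described above.

```go
package sampling

import "strconv"

// parseTh converts a th value of 1 to 14 hex digits (trailing zeros
// omitted) into the 56-bit rejection threshold T.
func parseTh(th string) (uint64, error) {
	for len(th) < 14 {
		th += "0" // restore omitted trailing zeros
	}
	return strconv.ParseUint(th, 16, 64)
}

// shouldSample applies the shared rule: a span is kept exactly when
// R >= T, which also covers T == 0 (Always Sample).
func shouldSample(r, t uint64) bool {
	return r >= t
}
```

With this rule, `T = 0` trivially keeps every span, matching the Always Sample case above.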
### Head samplers -A head sampler is responsible for computing the `rv` and `th` values in a new span's initial [`TraceState`](https://github.com/open-telemetry/opentelemetry-specification/blob/v1.29.0/specification/trace/api.md#tracestate). Notable inputs to that computation include the parent span's trace state (if a parent span exists) and the new span's trace ID. +A head sampler is responsible for computing the `rv` and `th` values in a new span's initial [`TraceState`](https://github.com/open-telemetry/opentelemetry-specification/blob/v1.29.0/specification/trace/api.md#tracestate). The main inputs to that computation include the parent span's trace state (if a parent span exists), the new span's trace ID, and possibly the trace flags (to know if the trace ID has been generated in a random manner). -First, a consistent `Sampler` decides which sampling probability to use. The sampler MAY select any value of T. If a valid `SpanContext` is provided in the call to `ShouldSample` (indicating that the span being created will be a child span), +First, a consistent probability `Sampler` may choose its own sampling rate. The higher the chosen sampling rate, the lower the rejection threshold (T). It MAY select any value of T. If a valid `SpanContext` is provided in the call to `ShouldSample` (indicating that the span being created will be a child span), -- Choosing a T greater than the parent span's is expected to result in partial traces (the parent may be sampled but its child, the current span, dropped). -- Choosing a T less than or equal to the parent span is expected to result in complete traces (this is definition of consistent probability sampling). +- Choosing a T greater than the parent span's T can result in partial traces. The parent span may be `kept` but it is possible that its child, the current span, may be dropped because of the lower sampling rate. At the same time, in case where the child span is `kept`, the parent span would have been to `keep` as well (meeting our consistent sampling goals) since the parent's sampling rate is greater than the child's sampling rate. +- Similarly, choosing a T less than or equal to the parent span can also result in partial traces. The parent span might have been `dropped` but it is possible that its child, the current span, may be `kept` because of the higher sampling rate. At the same time, in case where the parent span is `kept`, the child span would be `kept` as well (meeting our consistent sampling goals) since the child's sampling rate is greater than the parent's sampling rate. For the output TraceState, @@ -169,7 +179,7 @@ This proposal is the result of long negotiations on the Sampling SIG over what i ## Prior art and alternatives -The existing specification for `r-value` and `p-value` attempted to solve this problem, but were limited to powers of 2, which is inadequate. +The existing specification for `r-value` and `p-value` attempted to solve this problem, but was limited to powers of 2, which is inadequate. ## Open questions @@ -180,5 +190,5 @@ We also know that some implementations prefer to use a sampling probability (in ## Future possibilities This permits sampling systems to propagate consistent sampling information downstream where it can be compensated for. -For example, this will enable the tail-sampling processor in the OTel Collector to propagate its sampling decisions to backends in a standard way. 
+For example, this will enable the tail-sampling processor in the OTel Collector to propagate its sampling decisions to backend systems in a standard way. This permits backend systems to use the effective sampling probability in data presentations. From e1fa42fd786a998f5b8cee23a98a11865e208ee0 Mon Sep 17 00:00:00 2001 From: "J. Kalyana Sundaram" Date: Tue, 4 Jun 2024 22:36:24 -0700 Subject: [PATCH 2/8] Editorial changes --- .../0235-sampling-threshold-in-trace-state.md | 32 +++++++++---------- 1 file changed, 15 insertions(+), 17 deletions(-) diff --git a/text/trace/0235-sampling-threshold-in-trace-state.md b/text/trace/0235-sampling-threshold-in-trace-state.md index 776d849c7..66dbad122 100644 --- a/text/trace/0235-sampling-threshold-in-trace-state.md +++ b/text/trace/0235-sampling-threshold-in-trace-state.md @@ -6,15 +6,15 @@ Sampling is an important lever to reduce the costs associated with collecting an There are two key aspects for sampling of tracing data. The first is that sampling decisions can be made independently for *each* span in a trace. The second is that sampling decisions can be made at multiple points in the telemetry pipeline. For example, the sampling decision for a span at span creation time could have been to **keep** that span, while the downstream sampling decision for the *same* span at a later stage (say in an external process in the data collection pipeline) could be to **drop** it. -For both the above aspects, we want sampling decisions to be made in a **consistent** manner so that we can effectively reason about a trace. This OTEP describes a mechanism to achieve such consistent sampling decisions using a mechanism called **Consistent Probability Sampling**. As part of this, it proposes a mechanism for a common random value (R) and a rejection threshold (T). R is used to achieve consistent decisions, while T corresponds to a participant's sampling rate. This proposal describes how these values should be propagated and how participants should use them to make sampling decisions. +For each of the above aspects, we want sampling decisions to be made in a **consistent** manner so that we can effectively reason about a trace. This OTEP describes a mechanism to achieve such consistent sampling decisions using a mechanism called **Consistent Probability Sampling**. To achieve this, it proposes a mechanism for a common random value (R) and a rejection threshold (T) that is based on a participant's sampling rate. This proposal describes how these values should be propagated and how participants should use them to make sampling decisions. -This mechanism will enable creating a new set of samplers (Consistent Probability Samplers) that will enable trace participants to choose their own sampling rates, while still achieving consistent sampling decisions. This OTEP ensures that such samplers will interoperate with existing (non consistent probability) samplers. +This mechanism will enable creating a new set of samplers (known as Consistent Probability Samplers) that will enable trace participants to choose their own sampling rates, while still achieving consistent sampling decisions. This OTEP ensures that such samplers will interoperate with existing (non consistent probability) samplers. ## Motivation -Customers want to express arbitrary sampling probabilities such as 1%, 10%, and 75%. 
However, the existing experimental [specification for probability sampling using TraceState](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/tracestate-probability-sampling.md) optimizes for powers of two probabilities. It supports non power of two sampling only using interpolation between powers of two. This approach is unnecessarily restrictive. Hence, we need an updated mechanism to support this capability. +Customers want to express arbitrary sampling probabilities such as 1%, 10%, and 75%. However, the existing experimental [specification for probability sampling using TraceState](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/tracestate-probability-sampling.md) optimizes for powers of two probabilities. It supports non power of two sampling only using interpolation between powers of two. This approach is unnecessarily restrictive. Hence, we need an updated mechanism to support specifying any sampling probability. -Further, there is a need for consistent sampling in the collection path (outside of the head-based sampling paths). To achieve consistent sampling decisions, the previous experimental spec required using a custom source of randomness (`r-value`). However, in such downstream sampling decisions, it can be expensive to reference this custom value from the tracestate attribute in every span. To improve this, this proposal makes use of the inherent randomness in the traceID as a less-expensive solution to achieve the same result. However, the new randomness flag introduced in the W3C TraceContext Level 2 specification can potentially be reset by trace participants on the Level 1 W3C TraceContext specification. Hence, there is need to still reference tracestate to check for the non existence of this custom random value before relying on traceid as the source of randomness. +Further, there is a need for consistent sampling in the collection path (outside of the head-based sampling paths). To achieve consistent sampling decisions, the previous experimental spec required using a custom source of randomness (`r-value`). However, in such downstream sampling decisions, it can be expensive to reference this custom value from the tracestate attribute in every span. To improve this, this proposal makes use of the inherent randomness in the traceID as a less expensive solution. However, one caveat is that the new randomness flag introduced in the W3C TraceContext Level 2 specification can potentially be reset by trace participants until they move to that Level 2 specification. Hence, there is need to still reference tracestate to check for the non-existence of this custom random value before relying on the traceid as the source of randomness. ## Explanation Let's start with the definition for a consistent sampling decision. Consistency means that a positive sampling decision made for a particular span with probability p1 implies a positive sampling decision for any span belonging to the same trace if it is made with probability p2 >= p1. @@ -23,7 +23,7 @@ This proposal introduces a new value with the key `th` as an alternative to the This proposal allows for the continued expression of randomness using `r-value` as specified there using the key `r`. To distinguish the cases, this proposal uses the key `rv`. 
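As an illustration of that wider range, the following Go sketch (not part of this proposal; the helper names are assumptions) converts a sampling probability into a rejection threshold and its `th` encoding, following the hexadecimal encoding rules described below:

```go
package sampling

import (
	"fmt"
	"strings"
)

// thresholdFor converts a sampling probability in (0, 1] into the
// corresponding 56-bit rejection threshold.
func thresholdFor(probability float64) uint64 {
	return uint64((1 - probability) * (1 << 56))
}

// encodeTh renders a threshold as up to 14 hex digits with trailing
// zeros omitted.
func encodeTh(t uint64) string {
	s := strings.TrimRight(fmt.Sprintf("%014x", t), "0")
	if s == "" {
		return "0" // 100% sampling probability: th=0 means Always Sample
	}
	return s
}

func demo() {
	fmt.Println(encodeTh(thresholdFor(0.75))) // "4"  (25% of spans rejected)
	fmt.Println(encodeTh(thresholdFor(0.25))) // "c"  (75% of spans rejected)
	fmt.Println(encodeTh(thresholdFor(0.01))) // a non power of two rate, ~99% rejected
}
```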
-In the general case, in order to make consistent sampling decisions across the entire path of the trace, two values MUST be present in the `SpanContext`: +In the general case, in order to make consistent sampling decisions for the two aspects described above, two values MUST be present in the `SpanContext`: 1. A _random_ (or pseudo-random) 56-bit value, called `R` below. 2. A 56-bit _rejection threshold_ (or just "threshold") as expressed in the TraceState, called `T` below. `T` represents the maximum threshold that was applied in all previous consistent sampling stages. If the current sampling stage applies a greater threshold value than any stage before, it MUST update (increase) the threshold correspondingly. @@ -40,15 +40,13 @@ where -> indicates a parent to child relationship. `B` uses consistent probability sampling with a sampling probability of 0.5. `C` uses a parent-based sampler. -When the sampling decision for `A` is to `keep` the span, its outgoing traceparent will have the 'sampled' flag SET and the 'th' in its outgoing tracestate will be set to `0xc0_0000_0000_0000`. -When the sampling decision for `A` is to `drop` the span, its outgoing traceparent will have the 'sampled' flag UNSET but the 'th' in its outgoing tracestate will still be set to `0xc0_0000_0000_0000`. -When the sampling decision for `A` is to `keep` the span, its outgoing traceparent will have the 'sampled' flag SET and the 'th' in its outgoing tracestate will be set to `0x80_0000_0000_0000`. +When the sampling decision for `A` is to *keep* the span, its outgoing traceparent will have the 'sampled' flag SET and the 'th' in its outgoing tracestate will be set to `0xc0_0000_0000_0000`. +When the sampling decision for `A` is to *drop* the span, its outgoing traceparent will have the 'sampled' flag UNSET but the 'th' in its outgoing tracestate will still be set to `0xc0_0000_0000_0000`. +When the sampling decision for `B` is to *keep* the span, its outgoing traceparent will have the 'sampled' flag SET and the 'th' in its outgoing tracestate will be set to `0x80_0000_0000_0000`. C (being a parent based sampler) samples a span purely based on its parent (B in this case), it will use the sampled flag to make the decision. Its outgoing 'th' value will continue to reflect what it got from B (`0x80_0000_0000_0000`), and this is useful to understand its adjusted count. This design requires that as a given span progresses along its collection path, `th` is non-decreasing (and, in particular, must be increased at stages that apply lower sampling probabilities). -It does not, however, restrict a span's initial `th` in any way. If a parent based consistent sampler is used, a span's initial `th` would be the same as its parent's `th` value, else it would be a new value based on the chosen sampling rate. - -It is acceptable for B to have a lesser initial `th` than A has. It would not be ok if some later-stage sampler decreased A's `th`. +It does not, however, restrict a span's initial `th` in any way. If a parent-based consistent sampler is used, a span's initial `th` would be the same as its parent's `th` value, else it would be a new value based on the sampling rate chosen for that span. In other words, the sampling rate for each operation can be chosen independently, and this would map to having different `th` values for different spans. But for any particular span, it is not acceptable for a downstream sampler to *decrease* the `th` value in its context. 
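A minimal sketch of this non-decreasing update, with hypothetical helper names (not an interface defined by this proposal), is shown below: a downstream stage merges the `th` already recorded for the span with its own threshold, takes the maximum, and only ever writes back an equal or greater value.

```go
package sampling

// applyDownstreamStage merges the threshold already carried in the span's
// th value (incomingT) with the current stage's threshold (stageT).
// The recorded threshold may only increase; the span is kept when R is
// at or above the effective threshold.
func applyDownstreamStage(r, incomingT, stageT uint64) (keep bool, updatedT uint64) {
	updatedT = incomingT
	if stageT > updatedT {
		updatedT = stageT // a lower-probability stage raises th
	}
	return r >= updatedT, updatedT // never decrease th for this span
}
```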
The system has the following invariant: @@ -59,7 +57,7 @@ The sampling decision is propagated with the following algorithm: * If the `th` key is not specified, this implies that non-probabilistic sampling may be taking place. * Else derive `T` by parsing the `th` key as a hex value as described below. * If `T` is 0, Always Sample. -* Compare the 56 bits of `T` with the 56 bits of `R`. If `T > R`, then set the sampling decision to `drop`. +* Compare the 56 bits of `T` with the 56 bits of `R`. If `R >= T`, then set the sampling decision to *keep* else make the decision to *drop*. The `R` value MUST be derived as follows: @@ -68,7 +66,7 @@ The `R` value MUST be derived as follows: At the root span, the `R` value must be generated as follows: -* If the new random flag in the `traceparent` is set, then there is no action required. In this case, the tracestate header will not have the `rv` key. For more info on this new flag, see [the W3C trace context specification](https://w3c.github.io/trace-context/#trace-id). +* If the new random flag in the `traceparent` is set, then there is no action required. In this case, the tracestate header will not have the `rv` key, and the last 56 bits of the traceid will be used as the source of randomness. For more info on this new flag, see [the W3C trace context specification](https://w3c.github.io/trace-context/#trace-id). * If not, `R` MUST be generated as a random value in the range `[0, (2**56)-1]` and added to the Tracestate header with key `rv`. Although less common, there are circumstances where trace-id randomness is inadequate (for example, when sampling a group of traces together); in these cases, an `rv` value is required. @@ -78,10 +76,10 @@ The value of the `rv` and `th` keys MUST be expressed as up to 14 hexadecimal di Examples: - `th` value is missing: non-probabalistic sampling may be taking place. -- `th=4` -- equivalent to `th=40000000000000`, which is a 25% rejection threshold, corresponding to a 75% sampling probability. -- `th=c` -- equivalent to `th=c0000000000000`, which is a rejection threshold of 75%, corresponding to a sampling probability of 25%. -- `th=08` -- equivalent to `th=08000000000000`, which is a rejection threshold of 3.125%, corresponding to a sampling probability of 96.875%. -- `th=0` -- equivalent to `th=00000000000000`, which is a 0% rejection threshold, which means Always Sample. +- `th=0` -- equivalent to `th=00000000000000`, which is a 0% rejection threshold, corresponding to 100% sampling probability (Always Sample). +- `th=08` -- equivalent to `th=08000000000000`, which is a rejection threshold of 3.125%, corresponding to 96.875% sampling probability. +- `th=4` -- equivalent to `th=40000000000000`, which is a 25% rejection threshold, corresponding to 75% sampling probability. +- `th=c` -- equivalent to `th=c0000000000000`, which is a rejection threshold of 75%, corresponding to 25% sampling probability. The `T` value MUST be derived as follows: From 5f50a412ecfe191da1653a56a123b103337628b3 Mon Sep 17 00:00:00 2001 From: "J. Kalyana Sundaram" Date: Wed, 5 Jun 2024 09:07:12 -0700 Subject: [PATCH 3/8] Added definitions section. 
--- .../0235-sampling-threshold-in-trace-state.md | 135 +++++++++++------- 1 file changed, 86 insertions(+), 49 deletions(-) diff --git a/text/trace/0235-sampling-threshold-in-trace-state.md b/text/trace/0235-sampling-threshold-in-trace-state.md index 66dbad122..4deda7656 100644 --- a/text/trace/0235-sampling-threshold-in-trace-state.md +++ b/text/trace/0235-sampling-threshold-in-trace-state.md @@ -1,4 +1,4 @@ -# Sampling Threshold Propagation in TraceState +# Consistent Probability Sampling with Support for Arbitrary Sampling Rates ## Abstract @@ -6,80 +6,93 @@ Sampling is an important lever to reduce the costs associated with collecting an There are two key aspects for sampling of tracing data. The first is that sampling decisions can be made independently for *each* span in a trace. The second is that sampling decisions can be made at multiple points in the telemetry pipeline. For example, the sampling decision for a span at span creation time could have been to **keep** that span, while the downstream sampling decision for the *same* span at a later stage (say in an external process in the data collection pipeline) could be to **drop** it. -For each of the above aspects, we want sampling decisions to be made in a **consistent** manner so that we can effectively reason about a trace. This OTEP describes a mechanism to achieve such consistent sampling decisions using a mechanism called **Consistent Probability Sampling**. To achieve this, it proposes a mechanism for a common random value (R) and a rejection threshold (T) that is based on a participant's sampling rate. This proposal describes how these values should be propagated and how participants should use them to make sampling decisions. +For each of the above aspects, if we don't make *consistent* sampling decisions, we will end up with traces that are unusable and don't contain a coherent set of spans because of the completely independent sampling decisions. Instead, we want sampling decisions to be made in a **consistent** manner so that we can effectively reason about a trace. This OTEP describes a mechanism to achieve such consistent sampling decisions using a mechanism called **Consistent Probability Sampling**. To achieve this, it proposes two key building blocks. The first is a common source of randomness (R) that is available to all participants. This can either be a custom value (called `rv`) or can be based on the last 7 bytes of the traceid. The second is a concept of a rejection threshold (T). This is derived directly from a participant's sampling rate. This proposal describes how these two values should be propagated and how participants should use them to make sampling decisions. This mechanism will enable creating a new set of samplers (known as Consistent Probability Samplers) that will enable trace participants to choose their own sampling rates, while still achieving consistent sampling decisions. This OTEP ensures that such samplers will interoperate with existing (non consistent probability) samplers. ## Motivation -Customers want to express arbitrary sampling probabilities such as 1%, 10%, and 75%. However, the existing experimental [specification for probability sampling using TraceState](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/tracestate-probability-sampling.md) optimizes for powers of two probabilities. It supports non power of two sampling only using interpolation between powers of two. This approach is unnecessarily restrictive. 
Hence, we need an updated mechanism to support specifying any sampling probability. +Customers want to express arbitrary sampling probabilities such as 1%, 10%, and 75%. However, the existing experimental [specification for probability sampling using TraceState](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/tracestate-probability-sampling.md) optimizes for powers of two probabilities. It supports non power of two sampling only using interpolation between powers of two. Since this approach is unnecessarily restrictive, we need an updated mechanism to support specifying any sampling probability. -Further, there is a need for consistent sampling in the collection path (outside of the head-based sampling paths). To achieve consistent sampling decisions, the previous experimental spec required using a custom source of randomness (`r-value`). However, in such downstream sampling decisions, it can be expensive to reference this custom value from the tracestate attribute in every span. To improve this, this proposal makes use of the inherent randomness in the traceID as a less expensive solution. However, one caveat is that the new randomness flag introduced in the W3C TraceContext Level 2 specification can potentially be reset by trace participants until they move to that Level 2 specification. Hence, there is need to still reference tracestate to check for the non-existence of this custom random value before relying on the traceid as the source of randomness. +Further, there is a need for consistent sampling in the collection path, outside of the head-based sampling done at span creation time. To achieve consistent sampling decisions, the previous experimental spec always required using a custom source of randomness (`r-value`). However, in such downstream sampling decisions, it can be expensive to reference this custom value from the tracestate attribute in every span. To improve it, this proposal makes use of the inherent randomness in the traceID as a less expensive solution. However, since we continue to support overriding this using a custom random value, there is still a need to reference `tracestate` to check if a custom random value is present before using traceid as the source of randomness. -## Explanation -Let's start with the definition for a consistent sampling decision. Consistency means that a positive sampling decision made for a particular span with probability p1 implies a positive sampling decision for any span belonging to the same trace if it is made with probability p2 >= p1. +TODO: Since we support custom `rv` values, if downstream samplers have to anyway parse the tracestate to see if `rv` is present, then is it really less expensive as compared to always using the `rv` from tracestate? -This proposal introduces a new value with the key `th` as an alternative to the `p` value in the previous specification. The `p` value is limited to powers of two, while the `th` value in this proposal supports a large range of values. +## Definitions -This proposal allows for the continued expression of randomness using `r-value` as specified there using the key `r`. To distinguish the cases, this proposal uses the key `rv`. +Let's start with the definitions for the key building blocks of this proposal. -In the general case, in order to make consistent sampling decisions for the two aspects described above, two values MUST be present in the `SpanContext`: +### Sampling Probability -1. 
A _random_ (or pseudo-random) 56-bit value, called `R` below. -2. A 56-bit _rejection threshold_ (or just "threshold") as expressed in the TraceState, called `T` below. `T` represents the maximum threshold that was applied in all previous consistent sampling stages. If the current sampling stage applies a greater threshold value than any stage before, it MUST update (increase) the threshold correspondingly. +This decides the percentage of spans that will be *kept*. Each participant can choose a different sampling probability. For example, if the sampling probability is 0.25, around 25% of the spans will be kept. -One way to think about _rejection threshold_ is that it is the number of spans that would be discarded out of 2^56 considered spans. This means that spans where `R >= T` will be kept. +### Consistent Sampling Decision -Here is an example involving three participating operations `A`, `B`, and `C`: +A consistent sampling decision means that a positive sampling decision made for a particular span with probability p1 necessarily implies a positive sampling decision for any span belonging to the same trace if it is made with probability p2 >= p1. -`A` -> `B` -> `C` +### Rejection Threshold -where -> indicates a parent to child relationship. +This is a 56-bit value directly derived from the sampling probability. One way to think about this is that this is the number of spans that would be *dropped* out of 2^56 considered spans. This is an alternative to the `p` value in the previous specification. The `p` value is limited to powers of two, while this supports a large range of values. -`A` uses consistent probability sampling with a sampling probability of 0.25 (this corresponds to a rejection probability of .75). -`B` uses consistent probability sampling with a sampling probability of 0.5. -`C` uses a parent-based sampler. +You can derive the rejection threshold from the sampling probability as follows: -When the sampling decision for `A` is to *keep* the span, its outgoing traceparent will have the 'sampled' flag SET and the 'th' in its outgoing tracestate will be set to `0xc0_0000_0000_0000`. -When the sampling decision for `A` is to *drop* the span, its outgoing traceparent will have the 'sampled' flag UNSET but the 'th' in its outgoing tracestate will still be set to `0xc0_0000_0000_0000`. -When the sampling decision for `B` is to *keep* the span, its outgoing traceparent will have the 'sampled' flag SET and the 'th' in its outgoing tracestate will be set to `0x80_0000_0000_0000`. -C (being a parent based sampler) samples a span purely based on its parent (B in this case), it will use the sampled flag to make the decision. Its outgoing 'th' value will continue to reflect what it got from B (`0x80_0000_0000_0000`), and this is useful to understand its adjusted count. +Rejection Threshold = (1 - sampling probability) * 2^56. -This design requires that as a given span progresses along its collection path, `th` is non-decreasing (and, in particular, must be increased at stages that apply lower sampling probabilities). -It does not, however, restrict a span's initial `th` in any way. If a parent-based consistent sampler is used, a span's initial `th` would be the same as its parent's `th` value, else it would be a new value based on the sampling rate chosen for that span. In other words, the sampling rate for each operation can be chosen independently, and this would map to having different `th` values for different spans. 
But for any particular span, it is not acceptable for a downstream sampler to *decrease* the `th` value in its context. +For example, if the sampling probability is 100% (keep all spans), the rejection threshold would be 0. -The system has the following invariant: +Similarly, if the sampling probability is 0% (drop all spans), the rejection threshold would be (1-0) * 2^56 = 72057594037927936 = 0x100000000000000. -`(R >= T) = sampled flag` +We refer to this rejection threshold conceptually as `T`. We represent it using the key `th`. This must be propagated in both the `tracestate` header and in the TraceState attribute of each span. -The sampling decision is propagated with the following algorithm: +### Random Value (R) -* If the `th` key is not specified, this implies that non-probabilistic sampling may be taking place. -* Else derive `T` by parsing the `th` key as a hex value as described below. -* If `T` is 0, Always Sample. -* Compare the 56 bits of `T` with the 56 bits of `R`. If `R >= T`, then set the sampling decision to *keep* else make the decision to *drop*. +A common random value (that is known or propagated to all participants) is the main ingredient that enables consistent probability sampling. Each participant can compare this value (R) with their rejection threshold (T) to make a consistent sampling decision across an entire trace (or even across a group of traces). -The `R` value MUST be derived as follows: +This proposal supports two sources of randomness: -* If the key `rv` is present in the Tracestate header, then `R = rv`. -* Else `R` is the lowest-order 56 bits of the trace-id. +- **A custom source of randomness**: This proposal allows for a _random_ (or pseudo-random) 56-bit value. We refer to this as `rv` (this was previously known as `r-value`). This can be generated and propagated through the `tracestate` header and the tracestate attribute in each span. +- **Using TraceID as a source of randomness**: This proposal introduces using the last 56 bits of the `traceid` as the source of randomness. This can be done if the root participant knows that the `traceid` has been generated in a random or pseudo-random manner. -At the root span, the `R` value must be generated as follows: +### Consistent Sampling Decision Approach +Given the above building blocks, let's look at how a participant can make consistent sampling decisions. For this, two values MUST be present in the `SpanContext`: + +1. The common source of randomness: the 56-bit `R` value. +2. The rejection threshold: the 56-bit `T` value. + +If `R` >= `T`, *keep* the span, else *drop* the span. + +`T` represents the maximum threshold that was applied in all previous consistent sampling stages. If the current sampling stage applies a greater threshold value than any stage before, it MUST update (increase) the threshold correspondingly. + +## Explanation + +### Example + +Here is an example involving three participating operations `A`, `B`, and `C`: + +`A` -> `B` -> `C` + +where -> indicates a parent to child relationship. -* If the new random flag in the `traceparent` is set, then there is no action required. In this case, the tracestate header will not have the `rv` key, and the last 56 bits of the traceid will be used as the source of randomness. For more info on this new flag, see [the W3C trace context specification](https://w3c.github.io/trace-context/#trace-id). -* If not, `R` MUST be generated as a random value in the range `[0, (2**56)-1]` and added to the Tracestate header with key `rv`. 
+- `A` uses consistent probability sampling with a sampling probability of 0.25. This corresponds to a rejection threshold of (1-0.25)*2^56 = 0xc0000000000000. +- `B` uses consistent probability sampling with a sampling probability of 0.5. This corresponds to a rejection threshold of (1-0.5)*2^56 = 0x80000000000000. +- `C` uses a parent-based sampler. This means that C indirectly follows the same rejection threshold as its parent B. -Although less common, there are circumstances where trace-id randomness is inadequate (for example, when sampling a group of traces together); in these cases, an `rv` value is required. +When the sampling decision for `A` is to *keep* the span, its outgoing traceparent will have the `sampled` flag SET and the `th` in its outgoing tracestate will be set to `0xc0_0000_0000_0000`. +When the sampling decision for `A` is to *drop* the span, its outgoing traceparent will have the `sampled` flag UNSET but the `th` in its outgoing tracestate will still be set to `0xc0_0000_0000_0000`. +When the sampling decision for `B` is to *keep* the span, its outgoing traceparent will have the `sampled` flag SET and the `th` in its outgoing tracestate will be set to `0x80_0000_0000_0000`. +C (being a parent based sampler) samples a span purely based on its parent (B in this case), it will use the sampled flag to make the decision. Its outgoing `th` value will continue to reflect what it got from B (`0x80_0000_0000_0000`), and this is useful to understand its adjusted count. -The value of the `rv` and `th` keys MUST be expressed as up to 14 hexadecimal digits from the set `[0-9a-f]`. For `th` keys only, trailing zeros (but not leading zeros) may be omitted. `rv` keys MUST always be exactly 14 hex digits. +### Rules for `th` creation and propagation + +The value of the `th` key MUST be expressed as up to 14 hexadecimal digits from the set `[0-9a-f]`. Trailing zeros (but not leading zeros) may be omitted for `th` values. Examples: - `th` value is missing: non-probabalistic sampling may be taking place. - `th=0` -- equivalent to `th=00000000000000`, which is a 0% rejection threshold, corresponding to 100% sampling probability (Always Sample). -- `th=08` -- equivalent to `th=08000000000000`, which is a rejection threshold of 3.125%, corresponding to 96.875% sampling probability. +- `th=08` -- equivalent to `th=08000000000000`, which is a 3.125% rejection threshold, corresponding to 96.875% sampling probability. - `th=4` -- equivalent to `th=40000000000000`, which is a 25% rejection threshold, corresponding to 75% sampling probability. -- `th=c` -- equivalent to `th=c0000000000000`, which is a rejection threshold of 75%, corresponding to 25% sampling probability. +- `th=c` -- equivalent to `th=c0000000000000`, which is a 75% rejection threshold, corresponding to 25% sampling probability. The `T` value MUST be derived as follows: @@ -88,6 +101,35 @@ The `T` value MUST be derived as follows: Sampling Decisions MUST be propagated by setting the value of the `th` key in the Tracestate header according to the above. +This design requires that as a given span progresses along its collection path, `th` is non decreasing (and, in particular, must be increased at stages that apply lower sampling probabilities). + +It does not, however, restrict a span's initial `th` in any way. If a parent-based consistent sampler is used, a span's initial `th` would be the same as its parent's `th` value, else it would be a new value based on the sampling rate chosen for that span. 
In other words, the sampling rate for each operation can be chosen independently, and this would map to having different `th` values for different spans. But for any particular span, it is not acceptable for a downstream sampler to *decrease* the `th` value in its context. + +The system has the following invariant: + +`(R >= T) = sampled flag` + +The sampling decision is propagated with the following algorithm: + +* If the `th` key is not specified, this implies that non-probabilistic sampling may be taking place. +* Else derive `T` by parsing the `th` key as a hex value as described below. +* If `T` is 0, Always Sample. +* Compare the 56 bits of `T` with the 56 bits of `R`. If `R >= T`, then set the sampling decision to *keep* else make the decision to *drop*. + +### Rules for R value creation and propagation + +At the root span, the `R` value must be generated as follows: + +* If the new random flag in the `traceparent` is set, there is no need to set the `rv` key in the tracestate header, and the last 56 bits of the traceid can be used as the source of randomness. For more info on this new flag, see [the W3C trace context specification](https://w3c.github.io/trace-context/#trace-id). +* If not, `R` MUST be generated as a random value in the range `[0, (2**56)-1]` and added to the Tracestate header with key `rv`. The value of the `rv` key MUST be expressed as up to 14 hexadecimal digits from the set `[0-9a-f]`. `rv` keys MUST always be exactly 14 hex digits. + +Although less common, there are circumstances where trace-id randomness is not sufficient. One example of this is when you want to achieve consistent sampling across a group of traces in a session. For such cases, a common `rv` value must be generated and added to the tracestate header. + +In all other places, the `R` value MUST be derived as follows: + +* If the key `rv` is present in the Tracestate header, then `R = rv`. +* Else `R` is the lowest-order 56 bits of the trace-id. + ## Initializing and updating T and R values There are two categories of samplers: @@ -103,8 +145,8 @@ A head sampler is responsible for computing the `rv` and `th` values in a new sp First, a consistent probability `Sampler` may choose its own sampling rate. The higher the chosen sampling rate, the lower the rejection threshold (T). It MAY select any value of T. If a valid `SpanContext` is provided in the call to `ShouldSample` (indicating that the span being created will be a child span), -- Choosing a T greater than the parent span's T can result in partial traces. The parent span may be `kept` but it is possible that its child, the current span, may be dropped because of the lower sampling rate. At the same time, in case where the child span is `kept`, the parent span would have been to `keep` as well (meeting our consistent sampling goals) since the parent's sampling rate is greater than the child's sampling rate. -- Similarly, choosing a T less than or equal to the parent span can also result in partial traces. The parent span might have been `dropped` but it is possible that its child, the current span, may be `kept` because of the higher sampling rate. At the same time, in case where the parent span is `kept`, the child span would be `kept` as well (meeting our consistent sampling goals) since the child's sampling rate is greater than the parent's sampling rate. +- Choosing a T greater than the parent span's T can result in partial traces. 
The parent span may be *kept* but it is possible that its child, the current span, may be dropped because of the lower sampling rate. At the same time, in case where the child span is *kept*, the parent span would have been to *keep* as well (meeting our consistent sampling goals) since the parent's sampling rate is greater than the child's sampling rate. +- Similarly, choosing a T less than or equal to the parent span can also result in partial traces. The parent span might have been *dropped* but it is possible that its child, the current span, may be *kept* because of the higher sampling rate. At the same time, in case where the parent span is *kept*, the child span would be *kept* as well (meeting our consistent sampling goals) since the child's sampling rate is greater than the parent's sampling rate. For the output TraceState, @@ -175,10 +217,6 @@ The adjusted count indicates the approximate quantity of items from the populati This proposal is the result of long negotiations on the Sampling SIG over what is required and various alternative forms of expressing it. [This issue](https://github.com/open-telemetry/opentelemetry-specification/issues/3602) exhaustively covers the various formats that were discussed and their pros and cons. This proposal is the result of that decision. -## Prior art and alternatives - -The existing specification for `r-value` and `p-value` attempted to solve this problem, but was limited to powers of 2, which is inadequate. - ## Open questions This specification leaves room for different implementation options. For example, comparing hex strings or converting them to numeric format are both viable alternatives for handling the threshold. @@ -188,5 +226,4 @@ We also know that some implementations prefer to use a sampling probability (in ## Future possibilities This permits sampling systems to propagate consistent sampling information downstream where it can be compensated for. -For example, this will enable the tail-sampling processor in the OTel Collector to propagate its sampling decisions to backend systems in a standard way. -This permits backend systems to use the effective sampling probability in data presentations. +For example, this will enable the tail-sampling processor in the OTel Collector to propagate its sampling decisions to backend systems in a standard way. This permits backend systems to use the effective sampling probability in data presentations. From d9f65766397fcd09fb2bc99d27915e51c5b35dd3 Mon Sep 17 00:00:00 2001 From: "J. Kalyana Sundaram" Date: Wed, 5 Jun 2024 10:51:37 -0700 Subject: [PATCH 4/8] Updated wording to improve clarity --- .../0235-sampling-threshold-in-trace-state.md | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/text/trace/0235-sampling-threshold-in-trace-state.md b/text/trace/0235-sampling-threshold-in-trace-state.md index 4deda7656..9c92e5873 100644 --- a/text/trace/0235-sampling-threshold-in-trace-state.md +++ b/text/trace/0235-sampling-threshold-in-trace-state.md @@ -130,23 +130,25 @@ In all other places, the `R` value MUST be derived as follows: * If the key `rv` is present in the Tracestate header, then `R = rv`. * Else `R` is the lowest-order 56 bits of the trace-id. 
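A sketch of that derivation in Go (illustrative only; the function name and the trace-id representation are assumptions) is:

```go
package sampling

import (
	"encoding/binary"
	"strconv"
)

// deriveR returns the 56-bit R value for a span: the rv value when one is
// present in the tracestate, otherwise the lowest-order 56 bits of the
// 16-byte trace-id.
func deriveR(rv string, traceID [16]byte) (uint64, error) {
	if rv != "" {
		return strconv.ParseUint(rv, 16, 64) // rv is exactly 14 hex digits
	}
	low := binary.BigEndian.Uint64(traceID[8:]) // last 8 bytes of the trace-id
	return low & ((1 << 56) - 1), nil           // keep only the low 56 bits
}
```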
-## Initializing and updating T and R values +## Sampler behavior for initializing and updating T and R values There are two categories of samplers: - **Head samplers:** Implementations of [`Sampler`](https://github.com/open-telemetry/opentelemetry-specification/blob/v1.29.0/specification/trace/sdk.md#sampler), called by a `Tracer` during span creation. -- **Downstream samplers:** Any component that, given an ended Span, decides whether to *drop* or *keep* it by forwarding it to the next component in the system. This category is also known as "collection path samplers" or "sampling processors". _Tail samplers_ are a special class of downstream samplers that buffer spans of a trace and make a sampling decision for the trace as a whole using data from any span in the buffered trace. +- **Downstream samplers:** Any component that, given an ended Span, decides whether to *drop* it or *keep* it (by forwarding it to the next component in the pipeline). This category is also known as "collection path samplers" or "sampling processors". Note that _Tail samplers_ are a special class of downstream samplers that buffer spans of a trace and make a sampling decision for the trace as a whole using data from any span in the buffered trace. -This section defines behavior for each kind of sampler. +This section defines the behavior for these two categories of samplers. ### Head samplers A head sampler is responsible for computing the `rv` and `th` values in a new span's initial [`TraceState`](https://github.com/open-telemetry/opentelemetry-specification/blob/v1.29.0/specification/trace/api.md#tracestate). The main inputs to that computation include the parent span's trace state (if a parent span exists), the new span's trace ID, and possibly the trace flags (to know if the trace ID has been generated in a random manner). -First, a consistent probability `Sampler` may choose its own sampling rate. The higher the chosen sampling rate, the lower the rejection threshold (T). It MAY select any value of T. If a valid `SpanContext` is provided in the call to `ShouldSample` (indicating that the span being created will be a child span), +First, a consistent probability `Sampler` may choose its own sampling rate. The higher the chosen sampling rate, the lower the rejection threshold (T). It MAY select any value of T. If a valid `SpanContext` is provided in the call to `ShouldSample` (indicating that the span being created will be a child span), there are two possibilities: -- Choosing a T greater than the parent span's T can result in partial traces. The parent span may be *kept* but it is possible that its child, the current span, may be dropped because of the lower sampling rate. At the same time, in case where the child span is *kept*, the parent span would have been to *keep* as well (meeting our consistent sampling goals) since the parent's sampling rate is greater than the child's sampling rate. -- Similarly, choosing a T less than or equal to the parent span can also result in partial traces. The parent span might have been *dropped* but it is possible that its child, the current span, may be *kept* because of the higher sampling rate. At the same time, in case where the parent span is *kept*, the child span would be *kept* as well (meeting our consistent sampling goals) since the child's sampling rate is greater than the parent's sampling rate. 
+
+- **The child span chooses a T greater than the parent span's T**: The parent span may be *kept* but it is possible that its child, the current span, may be *dropped* because of the lower sampling rate. At the same time, in the case where the decision for the child span is to *keep* it, the decision for the parent span would also have been to *keep* it (due to our consistent sampling approach) since the parent's sampling rate is greater than the child's sampling rate.
+- **The child span chooses a T less than or equal to the parent span's T**: The parent span might have been *dropped* but it is possible that its child, the current span, may be *kept* because of the higher sampling rate. At the same time, in the case where the parent span is *kept*, the child span would be *kept* as well (due to our consistent sampling approach) since the child's sampling rate is greater than or equal to the parent's sampling rate.
+
+Note that while both of the above cases can result in partial traces, they still meet the consistent sampling goals.
 
 For the output TraceState,

From b366309390229129c4172a61e46730ba1cf355e7 Mon Sep 17 00:00:00 2001
From: "J. Kalyana Sundaram"
Date: Wed, 19 Jun 2024 20:10:53 -0700
Subject: [PATCH 5/8] Simplified wording in the motivation section.

---
 text/trace/0235-sampling-threshold-in-trace-state.md | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/text/trace/0235-sampling-threshold-in-trace-state.md b/text/trace/0235-sampling-threshold-in-trace-state.md
index 9c92e5873..175dc6c1d 100644
--- a/text/trace/0235-sampling-threshold-in-trace-state.md
+++ b/text/trace/0235-sampling-threshold-in-trace-state.md
@@ -12,11 +12,9 @@ This mechanism will enable creating a new set of samplers (known as Consistent P
 
 ## Motivation
 
-Customers want to express arbitrary sampling probabilities such as 1%, 10%, and 75%. However, the existing experimental [specification for probability sampling using TraceState](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/tracestate-probability-sampling.md) optimizes for powers of two probabilities. It supports non power of two sampling only using interpolation between powers of two. Since this approach is unnecessarily restrictive, we need an updated mechanism to support specifying any sampling probability.
+Customers want to express arbitrary sampling probabilities such as 1%, 10%, and 75%. However, the previous experimental [specification for probability sampling using TraceState](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/tracestate-probability-sampling.md) optimizes for powers of two probabilities. It supports non power of two sampling only using interpolation between powers of two. Since this approach is unnecessarily restrictive, we need an updated mechanism to support specifying any sampling probability.
 
-Further, there is a need for consistent sampling in the collection path, outside of the head-based sampling done at span creation time. To achieve consistent sampling decisions, the previous experimental spec always required using a custom source of randomness (`r-value`). However, in such downstream sampling decisions, it can be expensive to reference this custom value from the tracestate attribute in every span. To improve it, this proposal makes use of the inherent randomness in the traceID as a less expensive solution.
However, since we continue to support overriding this using a custom random value, there is still a need to reference `tracestate` to check if a custom random value is present before using traceid as the source of randomness.
-
-TODO: Since we support custom `rv` values, if downstream samplers have to anyway parse the tracestate to see if `rv` is present, then is it really less expensive as compared to always using the `rv` from tracestate?
+Further, to achieve consistent sampling decisions, the previous experimental spec always required using a custom source of randomness (`r-value`). However, there is a performance overhead in propagating this value in the tracestate header and the tracestate span attribute. To address this overhead, this OTEP proposes using the inherent randomness in the traceID as a more performant approach. At the same time, it continues to support overriding that mechanism with a custom random value, and hence there is still a need to reference `tracestate` to check whether a custom random value is present before using the traceid as the source of randomness.
 
 ## Definitions
 

From 735cf5e756a28f5442e9c323e0596affda39b849 Mon Sep 17 00:00:00 2001
From: Joshua MacDonald
Date: Fri, 26 Jul 2024 15:18:15 -0700
Subject: [PATCH 6/8] whitespace

---
 text/trace/0235-sampling-threshold-in-trace-state.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/text/trace/0235-sampling-threshold-in-trace-state.md b/text/trace/0235-sampling-threshold-in-trace-state.md
index 175dc6c1d..d68b00a5b 100644
--- a/text/trace/0235-sampling-threshold-in-trace-state.md
+++ b/text/trace/0235-sampling-threshold-in-trace-state.md
@@ -52,6 +52,7 @@ This proposal supports two sources of randomness:
 - **Using TraceID as a source of randomness**: This proposal introduces using the last 56 bits of the `traceid` as the source of randomness. This can be done if the root participant knows that the `traceid` has been generated in a random or pseudo-random manner.
 
 ### Consistent Sampling Decision Approach
+
 Given the above building blocks, let's look at how a participant can make consistent sampling decisions. For this, two values MUST be present in the `SpanContext`:
 
 1. The common source of randomness: the 56-bit `R` value.

From 0b9e6773f49ee82a22784ec7533c109e56ad158c Mon Sep 17 00:00:00 2001
From: Joshua MacDonald
Date: Fri, 26 Jul 2024 15:23:38 -0700
Subject: [PATCH 7/8] top-level note about the edit

---
 text/trace/0235-sampling-threshold-in-trace-state.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/text/trace/0235-sampling-threshold-in-trace-state.md b/text/trace/0235-sampling-threshold-in-trace-state.md
index d68b00a5b..863e1293e 100644
--- a/text/trace/0235-sampling-threshold-in-trace-state.md
+++ b/text/trace/0235-sampling-threshold-in-trace-state.md
@@ -1,5 +1,7 @@
 # Consistent Probability Sampling with Support for Arbitrary Sampling Rates
 
+**Note this document was revised in [OTEP 261](https://github.com/open-telemetry/oteps/pull/261)**.
+
 ## Abstract
 
 Sampling is an important lever to reduce the costs associated with collecting and processing telemetry data. It enables you to choose a representative set of items from an overall population.
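The hunks above work in terms of the 56-bit rejection threshold `T` and the common random value `R`. To make that arithmetic concrete before the final example below, here is a minimal Python sketch (illustrative only) that maps arbitrary sampling probabilities such as 1%, 10%, and 75% to a threshold and an adjusted count. It assumes a keep rule of the form `R >= T`, under which the effective sampling probability is `(2^56 - T) / 2^56`; the exact rounding and the `th` encoding in TraceState are governed by the specification text, not by this sketch.

```python
# Illustrative sketch: converting between a sampling probability, the 56-bit
# rejection threshold T, and the adjusted count. Assumption: a span is kept
# when R >= T, so the effective sampling probability is (2**56 - T) / 2**56.

MAX_THRESHOLD = 1 << 56  # T lies in [0, 2**56]; T = 0 means "keep everything"


def probability_to_threshold(probability: float) -> int:
    """Map a sampling probability in (0, 1] to a rejection threshold T."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    # The rounding choice here is illustrative, not normative.
    return MAX_THRESHOLD - round(probability * MAX_THRESHOLD)


def threshold_to_probability(t: int) -> float:
    return (MAX_THRESHOLD - t) / MAX_THRESHOLD


def adjusted_count(t: int) -> float:
    """Approximate number of population spans represented by one kept span."""
    return 1.0 / threshold_to_probability(t)


for p in (0.01, 0.10, 0.75):
    t = probability_to_threshold(p)
    print(f"p={p:.2f}  T=0x{t:014x}  adjusted_count={adjusted_count(t):.2f}")
```

Under these assumptions, a 50% sampler corresponds to `T = 0x80_0000_0000_0000` and a 25% sampler to `T = 0xc0_0000_0000_0000`, which are the threshold values used in the example touched by the final patch below.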

From 6fd6d0c166a548eafc8d3b834b73a3de387aa69a Mon Sep 17 00:00:00 2001
From: Joshua MacDonald
Date: Mon, 21 Oct 2024 14:45:52 -0700
Subject: [PATCH 8/8] Update text/trace/0235-sampling-threshold-in-trace-state.md

Co-authored-by: Otmar Ertl
---
 text/trace/0235-sampling-threshold-in-trace-state.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/text/trace/0235-sampling-threshold-in-trace-state.md b/text/trace/0235-sampling-threshold-in-trace-state.md
index 863e1293e..3d9d82eaf 100644
--- a/text/trace/0235-sampling-threshold-in-trace-state.md
+++ b/text/trace/0235-sampling-threshold-in-trace-state.md
@@ -79,7 +79,7 @@ where -> indicates a parent to child relationship.
 - `C` uses a parent-based sampler. This means that C indirectly follows the same rejection threshold as its parent B.
 
 When the sampling decision for `A` is to *keep* the span, its outgoing traceparent will have the `sampled` flag SET and the `th` in its outgoing tracestate will be set to `0xc0_0000_0000_0000`.
-When the sampling decision for `A` is to *drop* the span, its outgoing traceparent will have the `sampled` flag UNSET but the `th` in its outgoing tracestate will still be set to `0xc0_0000_0000_0000`.
+When the sampling decision for `A` is to *drop* the span, its outgoing traceparent will have the `sampled` flag UNSET and the `th` in its outgoing tracestate will be removed.
 When the sampling decision for `B` is to *keep* the span, its outgoing traceparent will have the `sampled` flag SET and the `th` in its outgoing tracestate will be set to `0x80_0000_0000_0000`.
 Since C (being a parent-based sampler) samples a span purely based on its parent (B in this case), it will use the sampled flag to make the decision. Its outgoing `th` value will continue to reflect what it got from B (`0x80_0000_0000_0000`), and this is useful for understanding its adjusted count.
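To close, here is a small Python sketch that simulates the `A -> B -> C` example above under the same assumptions: a span is kept when `R >= T`, with `T = 0xc0_0000_0000_0000` for A, `T = 0x80_0000_0000_0000` for B, and C simply following its parent B. The random values and the keep rule are assumptions of this sketch, not normative behavior. The assertion demonstrates the consistency property discussed under Head samplers: whenever A keeps a span, B keeps the corresponding span as well, while the reverse need not hold, which is exactly the partial-trace case.

```python
# Illustrative simulation of the A -> B -> C example. Assumption: a span is
# kept when the shared 56-bit random value R satisfies R >= T.
import random

T_A = 0xC0_0000_0000_0000  # under the assumed rule, A keeps 25% of traces
T_B = 0x80_0000_0000_0000  # under the assumed rule, B keeps 50% of traces


def keep(r: int, t: int) -> bool:
    """Consistent sampling decision for one participant."""
    return r >= t


random.seed(7)  # fixed seed so the example output is reproducible
for _ in range(5):
    r = random.getrandbits(56)  # shared randomness for the whole trace
    kept_a = keep(r, T_A)
    kept_b = keep(r, T_B)
    kept_c = kept_b  # C is parent-based: it follows B's decision and B's th
    # Consistency: a keep at the lower sampling rate (A) implies a keep at
    # the higher sampling rate (B); the reverse need not hold (partial trace).
    assert kept_b or not kept_a
    print(f"R=0x{r:014x}  A={kept_a}  B={kept_b}  C={kept_c}")
```

Under the assumed rule, the adjusted counts implied by these thresholds are 4 for A and 2 for B and C, which is the kind of information a backend can use to compensate for sampling when presenting the data.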