feat(spans): Extract transaction from segment span (#3375)

ref: #3278 The current state of Performance is that 1. There are SDKs that emit transactions, and there are SDKs that emit standalone spans. 2. There are parts of the product that require transactions, and there are parts of the product that require spans. We already extract spans from transactions to make span-dependent product features work for "transaction"-SDKs. Now we need to also extract transactions from spans to make transaction-dependent product features work for "span"-SDKs: ```mermaid flowchart LR SDK -->|transaction| Relay SDK -->|span| Relay Relay -->|span| spc[Span Consumer] Relay -->|transaction| txc[Transaction Consumer] Relay -->|span from transaction| spc Relay -->|transaction from span| txc linkStyle 5 color:green; ``` ## Point of conversion When to actually convert the segment span to transaction? I've listed two options below. At the moment, I believe only Option 1 is feasible because the span processing pipeline does not have all features implemented yet. ### Option 1 - At the start of envelope processing [Selected (see Option 1a)] ```mermaid flowchart TD SDK -->|span| extract extract-->|span| process_standalone_span extract-->|transaction| process_transaction ``` Pros: - No need to duplicate code for e.g. transaction metrics extraction. An extracted transaction can pass through the same processing pipeline as an "organic" transaction sent from the SDK, with the same normalization, PII scrubbing, etc. - The transaction processing pipeline is more mature and better tested than the span processing pipeline. Cons: - Some duplicate work: Normalization and PII scrubbing will run for both the original segment span and the transaction extracted from it. - Inconsistent with how we extract spans from transactions (this is done after processing). - Extraction will occur in edge Relays (not just processing Relays), so any updates to the conversion would take months to propagate to external Relays. ### Option 1a - At the start of span processing in processing Relays [Selected] Like Option 1, but only done in processing relays: Pros: - Gets rid of one of the Cons of Option 1. - Spans are currently only parsed in processing Relays. No need to refactor that. Cons: - Need to make sure that transactions are normalized, even if normalization is disabled in processing Relays. ### Option 2 - At the end of envelope processing (in processing Relays) [Discarded] ```mermaid flowchart TD process_span["process span (normalize, filter, metrics, sample)"] SDK -->|span| process_span process_span -->|span| extract_transaction extract_transaction -->|span| enforce_quotas extract_transaction -->|transaction| extract_metrics_tx extract_metrics_tx -->|transaction| enforce_quotas ``` Pros: - Assuming that span processing already filters, normalizes, samples and scrubs spans correctly, there would be no duplicate work done for the extracted transaction. All that's left would be transaction metrics extraction and rate limiting - Consistent with how we extract spans from transactions. Cons: - Cannot leverage the fully mature transaction processing pipeline. - **BLOCKING:** Inbound filters and dynamic sampling for spans are not ready yet. - Needs some duplicate code to extract transaction metrics from extracted transactions. ## Prevent duplicate data We already cross the spans/transactions in two places: 1. For every transaction, Relay extracts one standalone span for the transaction's child spans, _and_ a standalone segment span for the transaction itself. 2. For compatibility of performance scores, there is one transaction metric that is also extracted from standalone spans: `"d:transactions/measurements.score.total@ratio"`. To prevent circular conversion of data, I suggest to introduce two new item headers: 1. `"transaction_extracted"` for span items, which will be checked before converting a span to a transaction. For segment spans extracted from transactions, this flag will be `true` from the start. 2. `"spans_extracted"` for transaction items, which will be checked before extracting spans or span metrics from a transaction. For transactions extracted from spans, this flag will be `true` from the start. In addition, we will stop extracting `"d:transactions/measurements.score.total@ratio"` from spans. --------- Co-authored-by: David Herberth <david.herberth@sentry.io>
getsentry · Apr 11, 2024 · f71e136 · f71e136
1 parent 7104021
commit f71e136
Show file tree

Hide file tree

Showing 18 changed files with 518 additions and 98 deletions.
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -21,6 +21,7 @@
 - Stop extracting count_per_segment and count_per_op metrics. ([#3380](https://github.com/getsentry/relay/pull/3380))
 - Add `cardinality_limited` outcome with id `6`. ([#3389](https://github.com/getsentry/relay/pull/3389))
 - Extract `cache.item_size` and `cache.hit` metrics. ([#3371]https://github.com/getsentry/relay/pull/3371)
+- Optionally convert segment spans to transactions for compatibility. ([#3375](https://github.com/getsentry/relay/pull/3375))
 
 **Internal**:
 

diff --git a/relay-dynamic-config/src/defaults.rs b/relay-dynamic-config/src/defaults.rs
@@ -58,7 +58,11 @@ pub fn add_span_metrics(project_config: &mut ProjectConfig) {
         return;
     }
 
-    config.metrics.extend(span_metrics());
+    config.metrics.extend(span_metrics(
+        project_config
+            .features
+            .has(Feature::ExtractTransactionFromSegmentSpan),
+    ));
 
     config._span_metrics_extended = true;
     if config.version == 0 {
@@ -67,7 +71,13 @@ pub fn add_span_metrics(project_config: &mut ProjectConfig) {
 }
 
 /// Metrics with tags applied as required.
-fn span_metrics() -> impl IntoIterator<Item = MetricSpec> {
+fn span_metrics(transaction_extraction_enabled: bool) -> impl IntoIterator<Item = MetricSpec> {
+    let score_total_transaction_metric = if transaction_extraction_enabled {
+        RuleCondition::never()
+    } else {
+        RuleCondition::all()
+    };
+
     let is_db = RuleCondition::eq("span.sentry_tags.category", "db")
         & !(RuleCondition::eq("span.system", "mongodb")
             | RuleCondition::glob("span.op", DISABLED_DATABASES)
@@ -471,7 +481,11 @@ fn span_metrics() -> impl IntoIterator<Item = MetricSpec> {
             mri: "d:transactions/measurements.score.total@ratio".into(),
             field: Some("span.measurements.score.total.value".into()),
             condition: Some(
-                is_allowed_browser.clone() & RuleCondition::eq("span.was_transaction", false),
+                // If transactions are extracted from spans, the transaction processing pipeline
+                // will take care of this metric.
+                score_total_transaction_metric
+                    & is_allowed_browser.clone()
+                    & RuleCondition::eq("span.was_transaction", false),
             ),
             tags: vec![
                 Tag::with_key("span.op")

diff --git a/relay-dynamic-config/src/feature.rs b/relay-dynamic-config/src/feature.rs
@@ -81,6 +81,15 @@ pub enum Feature {
     #[serde(rename = "organizations:continuous-profiling")]
     ContinuousProfiling,
 
+    /// When enabled, every standalone segment span will be duplicated as a transaction.
+    ///
+    /// This allows support of product features that rely on transactions for SDKs that only
+    /// send spans.
+    ///
+    /// Serialized as `projects:extract-transaction-from-segment-span`.
+    #[serde(rename = "projects:extract-transaction-from-segment-span")]
+    ExtractTransactionFromSegmentSpan,
+
     /// Deprecated, still forwarded for older downstream Relays.
     #[doc(hidden)]
     #[serde(rename = "organizations:transaction-name-mark-scrubbed-as-sanitized")]

diff --git a/relay-event-normalization/src/normalize/span/exclusive_time.rs b/relay-event-normalization/src/normalize/span/exclusive_time.rs
@@ -42,6 +42,11 @@ fn set_event_exclusive_time(
         return;
     };
 
+    if trace_context.exclusive_time.value().is_some() {
+        // Exclusive time already set, respect it.
+        return;
+    }
+
     let Some(span_id) = trace_context.span_id.value() else {
         return;
     };

diff --git a/relay-event-schema/src/protocol/span/convert.rs b/relay-event-schema/src/protocol/span/convert.rs
@@ -3,6 +3,7 @@ use crate::protocol::{
     ContextInner, Contexts, DefaultContext, Event, ProfileContext, Span, TraceContext,
 };
 
+use relay_base_schema::events::EventType;
 use relay_protocol::Annotated;
 use std::collections::BTreeMap;
 
@@ -47,7 +48,7 @@ macro_rules! map_fields {
             $(span.$fixed_span_field:ident <= $fixed_span_value:expr), *
         }
         fixed_for_event {
-            $($fixed_event_value:expr => span.$fixed_event_field:ident), *
+            $($fixed_event_value:expr => event.$fixed_event_field:ident), *
         }
     ) => {
         #[allow(clippy::needless_update)]
@@ -78,31 +79,48 @@ macro_rules! map_fields {
         }
 
         #[allow(clippy::needless_update)]
-        impl From<&Span> for Event {
-            fn from(span: &Span) -> Self {
-                Self {
+        impl TryFrom<&Span> for Event {
+            type Error = ();
+
+            fn try_from(span: &Span) -> Result<Self, ()> {
+                use relay_protocol::Empty;
+
+                if !span.is_segment.value().unwrap_or(&false) {
+                    // Only segment spans can become transactions.
+                    return Err(());
+                }
+                let event = Self {
                     $(
                         $event_field: span.$span_field.clone().map_value(Into::into),
                     )*
                     $(
                         $fixed_event_field: $fixed_event_value.into(),
                     )*
                     contexts: Annotated::new(
-                        Contexts(
-                            BTreeMap::from([
+                        Contexts({
+                            let mut contexts = BTreeMap::new();
+                            $(
+                                let mut context = $ContextType::default();
+                                let mut has_fields = false;
                                 $(
-                                    (<$ContextType as DefaultContext>::default_key().into(), ContextInner($ContextType {
-                                        $(
-                                            $context_field: span.$primary_span_field.clone(),
-                                        )*
-                                        ..Default::default()
-                                    }.into_context()).into()),
+                                    if !span.$primary_span_field.is_empty() {
+                                        context.$context_field = span.$primary_span_field.clone();
+                                        has_fields = true;
+                                    }
                                 )*
-                            ]),
-                        )
+                                if has_fields {
+                                    let context_key = <$ContextType as DefaultContext>::default_key().into();
+                                    contexts.insert(context_key, ContextInner(context.into_context()).into());
+                                }
+                            )*
+                            contexts
+                        })
                     ),
                     ..Default::default()
-                }
+                };
+
+
+                Ok(event)
             }
         }
     };
@@ -141,7 +159,7 @@ map_fields!(
         span.was_transaction <= true
     }
     fixed_for_event {
-        // nothing yet
+        EventType::Transaction => event.ty
     }
 );
 
@@ -155,6 +173,7 @@ mod tests {
     fn roundtrip() {
         let event = Annotated::<Event>::from_json(
             r#"{
+                "type": "transaction",
                 "contexts": {
                     "profile": {"profile_id": "a0aaaaaaaaaaaaaaaaaaaaaaaaaaaaab"},
                     "trace": {
@@ -253,7 +272,20 @@ mod tests {
         }
         "###);
 
-        let roundtripped = Event::from(&span_from_event);
+        let roundtripped = Event::try_from(&span_from_event).unwrap();
         assert_eq!(event, roundtripped);
     }
+
+    #[test]
+    fn no_empty_profile_context() {
+        let span = Span {
+            is_segment: true.into(),
+            ..Default::default()
+        };
+        let event = Event::try_from(&span).unwrap();
+
+        // No profile context is set.
+        // profile_id is required on ProfileContext so we should not create an empty one.
+        assert!(event.context::<ProfileContext>().is_none());
+    }
 }
diff --git a/relay-sampling/src/evaluation.rs b/relay-sampling/src/evaluation.rs
@@ -65,6 +65,11 @@ impl<'a> ReservoirEvaluator<'a> {
         }
     }
 
+    /// Gets shared ownership of the reservoir counters.
+    pub fn counters(&self) -> ReservoirCounters {
+        Arc::clone(&self.counters)
+    }
+
     /// Sets the Redis pool and organiation ID for the [`ReservoirEvaluator`].
     ///
     /// These values are needed to synchronize with Redis.

diff --git a/relay-server/src/envelope.rs b/relay-server/src/envelope.rs
@@ -42,7 +42,7 @@ use bytes::Bytes;
 use chrono::{DateTime, Utc};
 use relay_dynamic_config::ErrorBoundary;
 use relay_event_normalization::{normalize_transaction_name, TransactionNameRule};
-use relay_event_schema::protocol::{EventId, EventType};
+use relay_event_schema::protocol::{Event, EventId, EventType};
 use relay_protocol::{Annotated, Value};
 use relay_quotas::DataCategory;
 use relay_sampling::DynamicSamplingContext;
@@ -554,6 +554,20 @@ pub struct ItemHeaders {
     #[serde(default, skip_serializing_if = "is_false")]
     metrics_extracted: bool,
 
+    /// Whether or not a transaction has been extracted from a segment span.
+    #[serde(default, skip_serializing_if = "is_false")]
+    transaction_extracted: bool,
+
+    /// Whether or not spans and span metrics have been extracted from a transaction.
+    ///
+    /// This header is set to `true` after both span extraction and span metrics extraction,
+    /// and can be used to skip extraction.
+    ///
+    /// NOTE: This header is also set to `true` for transactions that are themselves extracted
+    /// from spans (the opposite direction), to prevent going in circles.
+    #[serde(default, skip_serializing_if = "is_false")]
+    spans_extracted: bool,
+
     /// `false` if the sampling decision is "drop".
     ///
     /// In the most common use case, the item is dropped when the sampling decision is "drop".
@@ -628,6 +642,8 @@ impl Item {
                 sample_rates: None,
                 other: BTreeMap::new(),
                 metrics_extracted: false,
+                transaction_extracted: false,
+                spans_extracted: false,
                 sampled: true,
             },
             payload: Bytes::new(),
@@ -833,6 +849,26 @@ impl Item {
         self.headers.metrics_extracted = metrics_extracted;
     }
 
+    /// Returns the transaction extracted flag.
+    pub fn transaction_extracted(&self) -> bool {
+        self.headers.transaction_extracted
+    }
+
+    /// Sets the transaction extracted flag.
+    pub fn set_transaction_extracted(&mut self, transaction_extracted: bool) {
+        self.headers.transaction_extracted = transaction_extracted;
+    }
+
+    /// Returns the spans extracted flag.
+    pub fn spans_extracted(&self) -> bool {
+        self.headers.spans_extracted
+    }
+
+    /// Sets the spans extracted flag.
+    pub fn set_spans_extracted(&mut self, spans_extracted: bool) {
+        self.headers.spans_extracted = spans_extracted;
+    }
+
     /// Gets the `sampled` flag.
     pub fn sampled(&self) -> bool {
         self.headers.sampled
@@ -1060,6 +1096,21 @@ impl Envelope {
         Box::new(Self { items, headers })
     }
 
+    /// Creates an envelope from headers and an envelope.
+    pub fn try_from_event(
+        mut headers: EnvelopeHeaders,
+        event: Event,
+    ) -> Result<Box<Self>, serde_json::Error> {
+        headers.event_id = event.id.value().copied();
+        let event_type = event.ty.value().copied().unwrap_or_default();
+
+        let serialized = Annotated::new(event).to_json()?;
+        let mut item = Item::new(ItemType::from_event_type(event_type));
+        item.set_payload(ContentType::Json, serialized);
+
+        Ok(Self::from_parts(headers, smallvec::smallvec![item]))
+    }
+
     /// Creates an envelope from request information.
     pub fn from_request(event_id: Option<EventId>, meta: RequestMeta) -> Box<Self> {
         Box::new(Self {

diff --git a/relay-server/src/extractors/request_meta.rs b/relay-server/src/extractors/request_meta.rs
@@ -316,6 +316,11 @@ impl<D> RequestMeta<D> {
     pub fn is_from_internal_relay(&self) -> bool {
         self.from_internal_relay
     }
+
+    /// Overwrite internal property.
+    pub fn set_from_internal_relay(&mut self, value: bool) {
+        self.from_internal_relay = value;
+    }
 }
 
 impl RequestMeta {
-Original file line number
+Diff line change
@@ Expand Up / @@ -42,6 +42,11 @@ fn set_event_exclusive_time( @@
             return;
         };
+        if trace_context.exclusive_time.value().is_some() {
+            // Exclusive time already set, respect it.
+            return;
+        }
         let Some(span_id) = trace_context.span_id.value() else {
             return;
         };
@@ Expand Down @@