feat(spans): Extract transaction from segment span #3375

jjbayer · 2024-04-04T09:30:56Z

The current state of Performance is that:

There are SDKs that emit transactions, and there are SDKs that emit standalone spans.
There are parts of the product that require transactions, and there are parts of the product that require spans.

We already extract spans from transactions to make span-dependent product features work for "transaction"-SDKs. Now we need to also extract transactions from spans to make transaction-dependent product features work for "span"-SDKs:

flowchart LR
    
    SDK -->|transaction| Relay
    SDK -->|span| Relay
    Relay -->|span| spc[Span Consumer]
    Relay -->|transaction| txc[Transaction Consumer] 

    Relay -->|span from transaction| spc
    Relay -->|transaction from span| txc
    
    linkStyle 5 color:green;

Point of conversion

When should we actually convert the segment span to a transaction? I've listed two options below. At the moment, I believe only Option 1 is feasible because the span processing pipeline does not have all features implemented yet.

Option 1 - At the start of envelope processing [Selected (see Option 1a)]

flowchart TD
SDK -->|span| extract
extract-->|span| process_standalone_span
extract-->|transaction| process_transaction

Pros:

No need to duplicate code for e.g. transaction metrics extraction. An extracted transaction can pass through the same processing pipeline as an "organic" transaction sent from the SDK, with the same normalization, PII scrubbing, etc.
The transaction processing pipeline is more mature and better tested than the span processing pipeline.

Cons:

Some duplicate work: Normalization and PII scrubbing will run for both the original segment span and the transaction extracted from it.
Inconsistent with how we extract spans from transactions (this is done after processing).
Extraction will occur in edge Relays (not just processing Relays), so any updates to the conversion would take months to propagate to external Relays.

Option 1a - At the start of span processing in processing Relays [Selected]

Like Option 1, but only done in processing Relays.

Pros:

Gets rid of one of the Cons of Option 1.
Spans are currently only parsed in processing Relays. No need to refactor that.

Cons:

Need to make sure that transactions are normalized, even if normalization is disabled in processing Relays.

Option 2 - At the end of envelope processing (in processing Relays) [Discarded]

flowchart TD

process_span["process span (normalize, filter, metrics, sample)"]

SDK -->|span| process_span
process_span -->|span| extract_transaction
extract_transaction -->|span| enforce_quotas
extract_transaction -->|transaction| extract_metrics_tx
extract_metrics_tx -->|transaction| enforce_quotas

Pros:

Assuming that span processing already filters, normalizes, samples and scrubs spans correctly, there would be no duplicate work done for the extracted transaction. All that's left would be transaction metrics extraction and rate limiting.
Consistent with how we extract spans from transactions.

Cons:

Cannot leverage the fully mature transaction processing pipeline.
- BLOCKING: Inbound filters and dynamic sampling for spans are not ready yet.
Needs some duplicate code to extract transaction metrics from extracted transactions.

Prevent duplicate data

We already cross the spans/transactions in two places:

For every transaction, Relay extracts one standalone span for each of the transaction's child spans, and a standalone segment span for the transaction itself.
For compatibility of performance scores, there is one transaction metric that is also extracted from standalone spans: "d:transactions/measurements.score.total@ratio".

To prevent circular conversion of data, I suggest introducing two new item headers:

"transaction_extracted" for span items, which will be checked before converting a span to a transaction. For segment spans extracted from transactions, this flag will be true from the start.
"spans_extracted" for transaction items, which will be checked before extracting spans or span metrics from a transaction. For transactions extracted from spans, this flag will be true from the start.

In addition, we will stop extracting "d:transactions/measurements.score.total@ratio" from spans.
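The two headers described above can be sketched roughly as follows. This is a minimal illustration, not Relay's actual code: `ItemHeaders`, the two constructor functions and the two `may_extract_*` helpers are hypothetical names; only the flag names `transaction_extracted` and `spans_extracted` come from the description.

```rust
/// Hypothetical, simplified stand-in for the headers of an envelope item.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct ItemHeaders {
    /// For span items: checked before converting a span to a transaction.
    pub transaction_extracted: bool,
    /// For transaction items: checked before extracting spans or span metrics.
    pub spans_extracted: bool,
}

/// Headers for a segment span that was extracted from a transaction.
/// The flag is true from the start, so no transaction is derived from it again.
pub fn headers_for_span_from_transaction() -> ItemHeaders {
    ItemHeaders {
        transaction_extracted: true,
        ..Default::default()
    }
}

/// Headers for a transaction that was extracted from a segment span.
/// The flag is true from the start, so no spans are derived from it again.
pub fn headers_for_transaction_from_span() -> ItemHeaders {
    ItemHeaders {
        spans_extracted: true,
        ..Default::default()
    }
}

/// A span may be converted to a transaction only if that has not happened yet.
pub fn may_extract_transaction(span_headers: &ItemHeaders) -> bool {
    !span_headers.transaction_extracted
}

/// Spans may be extracted from a transaction only if that has not happened yet.
pub fn may_extract_spans(transaction_headers: &ItemHeaders) -> bool {
    !transaction_headers.spans_extracted
}
```

With this shape, an item sent by the SDK starts with both flags false and is eligible for conversion once, while anything produced by a conversion is marked so that it cannot be converted back.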

TODO

Ensure transaction gets normalized even if normalization is disabled
Test with a local dev setup to make sure transactions appear in the product without breaking consumers.
Modify test input to make is_segment false for some.
Add a test with score.total to ensure that it is never extracted more than once.


jjbayer · 2024-04-09T12:54:49Z

relay-event-schema/src/protocol/span/convert.rs

+                                if has_fields {
+                                    let context_key = <$ContextType as DefaultContext>::default_key().into();
+                                    contexts.insert(context_key, ContextInner(context.into_context()).into());
+                                }


This prevents an empty ProfileContext from appearing in the event.
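As a rough illustration of the guard in the hunk above (not the macro code itself), the idea is to insert a context only when at least one of its fields is set. The `ProfileContext` shape, the `has_fields` method and the `"profile"` key below are simplifying assumptions.

```rust
use std::collections::BTreeMap;

/// Hypothetical, reduced context with a single optional field.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct ProfileContext {
    pub profile_id: Option<String>,
}

impl ProfileContext {
    /// True if at least one field carries a value.
    pub fn has_fields(&self) -> bool {
        self.profile_id.is_some()
    }
}

/// Inserts the context only if it is non-empty, so that an all-empty
/// context never shows up in the resulting event.
pub fn insert_profile_context(
    contexts: &mut BTreeMap<String, ProfileContext>,
    context: ProfileContext,
) {
    if context.has_fields() {
        // Assumed key name, for illustration only.
        contexts.insert("profile".to_owned(), context);
    }
}
```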

jjbayer · 2024-04-09T12:56:27Z

relay-event-normalization/src/normalize/span/exclusive_time.rs

+    if trace_context.exclusive_time.value().is_some() {
+        // Exclusive time already set, respect.
+        return;
+    }


This is a necessary change in behavior: A transaction derived from a standalone span does not have child spans, so we need to keep the exclusive_time set on the transaction, otherwise exclusive_time will always equal the full duration.
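To make the effect concrete, here is a simplified sketch of the behavior, assuming exclusive time is the transaction duration minus the time covered by its child spans. The `Span` and `Transaction` types and `compute_exclusive_time` are hypothetical stand-ins, not the real normalization code.

```rust
/// Hypothetical child span with millisecond timestamps.
#[derive(Debug, Clone)]
pub struct Span {
    pub start_ms: f64,
    pub end_ms: f64,
}

/// Hypothetical transaction with an optional, pre-set exclusive time.
#[derive(Debug, Clone)]
pub struct Transaction {
    pub start_ms: f64,
    pub end_ms: f64,
    pub exclusive_time_ms: Option<f64>,
    pub children: Vec<Span>,
}

pub fn compute_exclusive_time(tx: &mut Transaction) {
    // Exclusive time already set (e.g. carried over from a segment span): respect it.
    if tx.exclusive_time_ms.is_some() {
        return;
    }

    // Clip child intervals to the transaction and drop empty ones.
    let mut intervals: Vec<(f64, f64)> = tx
        .children
        .iter()
        .map(|c| (c.start_ms.max(tx.start_ms), c.end_ms.min(tx.end_ms)))
        .filter(|(start, end)| end > start)
        .collect();
    intervals.sort_by(|a, b| a.0.total_cmp(&b.0));

    // Sum the time covered by children, counting overlaps only once.
    let mut covered = 0.0;
    let mut cursor = tx.start_ms;
    for (start, end) in intervals {
        let start = start.max(cursor);
        if end > start {
            covered += end - start;
            cursor = end;
        }
    }

    tx.exclusive_time_ms = Some((tx.end_ms - tx.start_ms) - covered);
}
```

Without the early return, a transaction derived from a standalone span has an empty `children` list, so the computed value would always be the full duration.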

jjbayer · 2024-04-09T12:59:10Z

relay-sampling/src/evaluation.rs

+    /// Returns a shared reference to the reservoir counters.
+    pub fn counters(&self) -> ReservoirCounters {
+        self.counters.clone()
+    }


Needed to split off a ProcessEnvelope.

jjbayer · 2024-04-09T13:00:06Z

relay-server/src/envelope.rs

+
+    /// Whether or not spans have been extracted from a transaction.
+    #[serde(default, skip_serializing_if = "is_false")]
+    spans_extracted: bool,


With these two flags, we can prevent conversion between events and spans from going in circles.

jjbayer · 2024-04-09T13:00:53Z

relay-server/src/metrics_extraction/event.rs

    config: &MetricExtractionConfig,
    max_tag_value_size: usize,
 ) -> Vec<Bucket> {
    let mut metrics = generic::extract_metrics(event, config);

+    // If spans were already extracted for an event,
+    // we rely on span processing to extract metrics.
+    if !spans_extracted {


Note to self: Double-check if semantics make sense.

Non-functional change to get rid of `#[allow(clippy::too_many_arguments)]` on `EnvelopeProcessorService::new`. This PR was originally part of #3375, which adds yet another `Addr`, but I decided to make a separate PR for reviewability.

jjbayer · 2024-04-09T15:54:29Z

relay-server/src/services/processor/event.rs

@@ -383,6 +384,9 @@ pub fn serialize<G: EventProcessing>(
    // If transaction metrics were extracted, set the corresponding item header
    event_item.set_metrics_extracted(state.event_metrics_extracted);

+    // TODO: The state should simply maintain & update an `ItemHeaders` object.


I will follow up on this in a different PR.

jjbayer · 2024-04-09T15:58:24Z

Still requires some more test coverage, but opening for review to get feedback.


iker-barriocanal

Overall lgtm. Leaving a few comments and questions:

How are we planning to measure COGS for transactions coming from segments?
A single message with an envelope with N spans will result in N new messages in the processor, and we'd drop messages if the processor's queue is full. Solving that problem is out of scope of this PR, but what do you think about adding observability for that? This queue can grow quite fast without accepting new requests.

iker-barriocanal · 2024-04-10T14:13:23Z

relay-server/src/services/processor/span/processing.rs

@@ -91,10 +98,18 @@ pub fn process(
            return ItemAction::Drop(Outcome::Invalid(DiscardReason::Internal));
        };

+        if should_extract_transactions && !item.transaction_extracted() {
+            if let Some(transaction) = convert_to_transaction(&annotated_span) {


Span and transaction normalization may have different requirements, so I'd extract transactions before normalizing spans (line 91 above).


iker-barriocanal · 2024-04-10T14:33:17Z

tests/integration/test_spans.py

@@ -556,20 +580,46 @@ def test_span_ingestion(
            "description": "my 3rd protobuf OTel span",
            "duration_ms": 500,
            "exclusive_time_ms": 500.0,
-            "is_segment": True,
+            "is_segment": False,


Why is this span no longer a segment?


jjbayer

@iker-barriocanal

How are we planning to measure COGS for transactions coming from segments?

The extraction part will be accounted as AppFeature::Spans, the processing as AppFeature::Transactions. I think this is fine for now, as both are part of Performance cost.

A single message with an envelope with N spans will result in N new messages in the processor, and we'd drop messages if the processor's queue is full. Solving that problem is out of scope of this PR, but what do you think about adding observability for that? This queue can grow quite fast without accepting new requests.

Good point, I added two metrics now so we can observe the number of spin-off transactions per envelope.
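For illustration only, counting spin-off transactions per envelope could look roughly like this. The in-memory `Histogram`, the `SpanItem` type and `process_envelope` are hypothetical; the real change reports through Relay's statsd metrics, which are not modeled here.

```rust
/// Hypothetical in-memory recorder standing in for a statsd histogram.
#[derive(Debug, Default)]
pub struct Histogram {
    samples: Vec<u64>,
}

impl Histogram {
    pub fn record(&mut self, value: u64) {
        self.samples.push(value);
    }

    pub fn samples(&self) -> &[u64] {
        &self.samples
    }
}

/// Hypothetical span item, reduced to the two properties that matter here.
pub struct SpanItem {
    pub is_segment: bool,
    pub transaction_extracted: bool,
}

/// Counts how many transactions one envelope would spin off and records
/// exactly one sample per envelope.
pub fn process_envelope(items: &[SpanItem], histogram: &mut Histogram) -> u64 {
    let count = items
        .iter()
        .filter(|item| item.is_segment && !item.transaction_extracted)
        .count() as u64;
    histogram.record(count);
    count
}
```

Recording one sample per envelope makes the distribution visible: a handful of envelopes with very large counts is what would put pressure on the processor queue.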


jjbayer · 2024-04-11T09:15:42Z

relay-server/src/services/processor/span/processing.rs

@@ -91,10 +98,18 @@ pub fn process(
            return ItemAction::Drop(Outcome::Invalid(DiscardReason::Internal));
        };

+        if should_extract_transactions && !item.transaction_extracted() {
+            if let Some(transaction) = convert_to_transaction(&annotated_span) {


Good point, refactored now. convert_to_transaction relies on is_segment normalization, so I moved that part out of normalization to still run before.


jjbayer · 2024-04-11T09:19:36Z

tests/integration/test_spans.py

@@ -556,20 +580,46 @@ def test_span_ingestion(
            "description": "my 3rd protobuf OTel span",
            "duration_ms": 500,
            "exclusive_time_ms": 500.0,
-            "is_segment": True,
+            "is_segment": False,


I deliberately gave it a parent_span_id so I could verify that transactions are not extracted from regular spans, only from segment spans.
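The behavior this test checks can be sketched as below. The types are hypothetical and heavily reduced. The rule that a span without an explicit `is_segment` counts as a segment exactly when it has no `parent_span_id` is an assumption made for the sketch, as is the fallback transaction name.

```rust
/// Hypothetical, simplified span; not Relay's actual `Span` type.
#[derive(Debug, Clone)]
pub struct Span {
    pub span_id: String,
    pub parent_span_id: Option<String>,
    pub is_segment: Option<bool>,
    pub description: Option<String>,
}

/// Hypothetical, simplified transaction.
#[derive(Debug, Clone, PartialEq)]
pub struct Transaction {
    pub name: String,
    pub span_id: String,
}

/// Assumed rule: without an explicit value, a span is a segment
/// exactly when it has no parent.
pub fn normalize_is_segment(span: &mut Span) {
    if span.is_segment.is_none() {
        span.is_segment = Some(span.parent_span_id.is_none());
    }
}

/// Only segment spans yield a transaction; regular spans yield `None`.
pub fn convert_to_transaction(span: &Span) -> Option<Transaction> {
    if span.is_segment != Some(true) {
        return None;
    }
    Some(Transaction {
        // Placeholder fallback name, chosen for this sketch only.
        name: span
            .description
            .clone()
            .unwrap_or_else(|| "<unlabeled transaction>".to_owned()),
        span_id: span.span_id.clone(),
    })
}

/// Runs `is_segment` normalization first, then applies the same guard as the
/// hunk in `processing.rs`: feature enabled and no transaction extracted yet.
pub fn maybe_extract(
    span: &mut Span,
    should_extract_transactions: bool,
    transaction_extracted: bool,
) -> Option<Transaction> {
    normalize_is_segment(span);
    if should_extract_transactions && !transaction_extracted {
        convert_to_transaction(span)
    } else {
        None
    }
}
```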


iker-barriocanal · 2024-04-11T10:09:24Z

relay-server/src/statsd.rs

@@ -382,6 +382,9 @@ pub enum RelayTimers {
    /// This metric is tagged with:
    ///  - `type`: The type of the health check, `liveness` or `readiness`.
    HealthCheckDuration,
+
+    /// Measurees how many transactions were created from segment spans in a single envelope.


Suggested change

-    /// Measurees how many transactions were created from segment spans in a single envelope.
+    /// Measures how many transactions were created from segment spans in a single envelope.

iker-barriocanal · 2024-04-11T10:12:47Z

relay-server/src/statsd.rs

@@ -382,6 +382,9 @@ pub enum RelayTimers {
    /// This metric is tagged with:
    ///  - `type`: The type of the health check, `liveness` or `readiness`.
    HealthCheckDuration,
+
+    /// Measurees how many transactions were created from segment spans in a single envelope.
+    TransactionsFromSpansPerEnvelope,


Should we move the variant to RelayHistograms?


jjbayer · 2024-04-11T14:35:37Z

Update: verified that a transaction extracted from spans ends up in the product without errors.


jjbayer added 5 commits April 4, 2024 11:18

ref: Drive-by refactor

d907bfd

wip

aa8df48

Compiling envelope copy

e16ad3a

test: See test fail because of duplicate spans

0f1526d

fix: Prevent duplicate span ingestion

b8177bc

Dav1dde assigned jjbayer Apr 5, 2024

jjbayer added 7 commits April 8, 2024 14:40

fix: extract after normalize

3896772

fix: No empty profile context

39f7005

Merge remote-tracking branch 'origin/master' into feat/spans-to-trans…

30d1bc1

…action

fix: span metrics

dd2af1e

test

bb6467b

fix: test & lint

5220d1d

Merge branch 'master' into feat/spans-to-transaction

9461d2a

jjbayer commented Apr 9, 2024


cleanup

800ff44

jjbayer force-pushed the feat/spans-to-transaction branch from e6c71c7 to 800ff44 Compare April 9, 2024 13:04

jjbayer mentioned this pull request Apr 9, 2024

ref(server): Put processor Addrs in separate struct #3399

Merged

jjbayer added 2 commits April 9, 2024 15:59

Merge branch 'master' into feat/spans-to-transaction

3b5189f

Merge branch 'master' into feat/spans-to-transaction

d13abf8

jjbayer commented Apr 9, 2024


jjbayer marked this pull request as ready for review April 9, 2024 15:58

jjbayer requested a review from a team as a code owner April 9, 2024 15:58

jjbayer added 4 commits April 10, 2024 12:08

test

3247cd6

Adapt test

ac29ffe

Merge branch 'master' into feat/spans-to-transaction

a6b15d3

fix: no duplicate transaction metric

d31d662

jjbayer mentioned this pull request Apr 10, 2024

feat(spans): Register feature for transaction extraction getsentry/sentry#68609

Merged

jjbayer added a commit to getsentry/sentry that referenced this pull request Apr 10, 2024

feat(spans): Register feature for transaction extraction (#68609)

3bb41bc

See getsentry/relay#3375. ref: getsentry/relay#3278

iker-barriocanal reviewed Apr 10, 2024


jjbayer added 3 commits April 11, 2024 08:51

Merge remote-tracking branch 'origin/master' into feat/spans-to-trans…

fbdffe7

…action

ref: review comments

bac3566

more instr

7a9fcef

jjbayer commented Apr 11, 2024


Merge remote-tracking branch 'origin/master' into feat/spans-to-trans…

8d55e8d

…action

iker-barriocanal approved these changes Apr 11, 2024


Dav1dde reviewed Apr 11, 2024


jjbayer and others added 6 commits April 11, 2024 13:21

Update relay-event-normalization/src/normalize/span/exclusive_time.rs

ef0c15c

Co-authored-by: David Herberth <david.herberth@sentry.io>

fix: metric in wrong place

34c8012

fix: always normalize

469e96c

clean

0cf16a2

fix: Get permit for spin-off envelope

e0fae23

more fixes

6580ad8

Dav1dde approved these changes Apr 11, 2024


jjbayer merged commit f71e136 into master Apr 11, 2024
21 checks passed

jjbayer deleted the feat/spans-to-transaction branch April 11, 2024 13:58

c298lee pushed a commit to getsentry/sentry that referenced this pull request Apr 12, 2024

feat(spans): Register feature for transaction extraction (#68609)

9ed2304

See getsentry/relay#3375. ref: getsentry/relay#3278

jjbayer mentioned this pull request Apr 16, 2024

ref(server): Maintain copy of event item headers #3414

Closed
