Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(normalization): Remove stale spans from transactions #2627

Merged
merged 8 commits into from
Oct 20, 2023
Merged
4 changes: 4 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,10 @@

## Unreleased

**Features**:

- Remove from events spans starting or ending before January 1, 1970 UTC. ([#2627](https://github.com/getsentry/relay/pull/2627))

**Internal**:

- Restrict resource spans to script and css only. ([#2623](https://github.com/getsentry/relay/pull/2623))
Expand Down
1 change: 1 addition & 0 deletions py/CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@

- Add `scraping_attempts` field to the event schema. ([#2575](https://github.com/getsentry/relay/pull/2575))
- Drop events starting or ending before January 1, 1970 UTC. ([#2613](https://github.com/getsentry/relay/pull/2613))
- Remove from events spans starting or ending before January 1, 1970 UTC. ([#2627](https://github.com/getsentry/relay/pull/2627))

## 0.8.31

Expand Down
138 changes: 132 additions & 6 deletions relay-event-normalization/src/timestamp.rs
Original file line number Diff line number Diff line change
@@ -1,8 +1,8 @@
use relay_event_schema::processor::{
ProcessingAction, ProcessingResult, ProcessingState, Processor,
ProcessValue, ProcessingAction, ProcessingResult, ProcessingState, Processor,
};
use relay_event_schema::protocol::Event;
use relay_protocol::Meta;
use relay_event_schema::protocol::{Event, Span};
use relay_protocol::{Error, Meta};

/// Ensures an event's timestamps are not stale.
///
Expand All @@ -22,7 +22,7 @@ impl Processor for TimestampProcessor {
&mut self,
event: &mut Event,
_: &mut Meta,
_: &ProcessingState,
state: &ProcessingState,
) -> ProcessingResult {
if let Some(end_timestamp) = event.timestamp.value() {
if end_timestamp.into_inner().timestamp_millis() < 0 {
Expand All @@ -39,15 +39,45 @@ impl Processor for TimestampProcessor {
}
}

event.process_child_values(self, state)?;

Ok(())
}

fn process_span(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docs for the processor are up-to-date, I accidentally updated them in the previous PR.

&mut self,
span: &mut Span,
meta: &mut Meta,
_: &ProcessingState<'_>,
) -> ProcessingResult {
if let Some(start_timestamp) = span.start_timestamp.value() {
if start_timestamp.into_inner().timestamp_millis() < 0 {
meta.add_error(Error::invalid(format!(
"timestamp is too stale: {}",
start_timestamp
)));
return Err(ProcessingAction::DeleteValueHard);
}
}
if let Some(end_timestamp) = span.timestamp.value() {
if end_timestamp.into_inner().timestamp_millis() < 0 {
meta.add_error(Error::invalid(format!(
"timestamp is too stale: {}",
end_timestamp
)));
return Err(ProcessingAction::DeleteValueHard);
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we put this logic in process_timestamp instead of process_span? That should automatically cover all timestamps, plus it would attach the meta error to the correct field.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That will deal with only the timestamp itself, but we want to take action on the container of the timestamp. For spans, this means dropping the span; for transactions, this means dropping the entire event. We can't have this level of control in process_timestamp unfortunately.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I explored introducing a new type of error to remove the parent container. This approach can work for calls on top-level containers like events, but it requires major refactors and decreases development experience a lot for child components like distributions. For that reason, I'm going to stick to the current approach.


Ok(())
}
}

#[cfg(test)]
mod tests {
use relay_event_schema::processor::{process_value, ProcessingState};
use relay_event_schema::protocol::Event;
use relay_protocol::Annotated;
use relay_event_schema::protocol::{Event, Span};
use relay_protocol::{assert_annotated_snapshot, Annotated};

use crate::timestamp::TimestampProcessor;

Expand All @@ -61,6 +91,17 @@ mod tests {
assert!(
process_value(&mut error, &mut TimestampProcessor, ProcessingState::root()).is_ok()
);
assert_eq!(
error
.value()
.unwrap()
.timestamp
.value()
.unwrap()
.into_inner()
.timestamp(),
1
);
}

#[test]
Expand Down Expand Up @@ -133,4 +174,89 @@ mod tests {
"invalid transaction event: timestamp is too stale"
);
}

#[test]
fn test_accept_recent_span() {
let json = r#"{
"span_id": "52df9022835246eeb317dbd739ccd050",
"start_timestamp": 1,
"timestamp": 2
}"#;
let mut span = Annotated::<Span>::from_json(json).unwrap();
assert!(process_value(&mut span, &mut TimestampProcessor, ProcessingState::root()).is_ok());
assert_eq!(
span.value()
.unwrap()
.start_timestamp
.value()
.unwrap()
.into_inner()
.timestamp(),
1
);
assert_eq!(
span.value()
.unwrap()
.timestamp
.value()
.unwrap()
.into_inner()
.timestamp(),
2
);
}

#[test]
fn test_reject_stale_span() {
let json = r#"{
"span_id": "52df9022835246eeb317dbd739ccd050",
"start_timestamp": -2,
"timestamp": -1
}"#;
let mut span = Annotated::<Span>::from_json(json).unwrap();
assert!(process_value(&mut span, &mut TimestampProcessor, ProcessingState::root()).is_ok());
assert_annotated_snapshot!(&span, @r###"
{
"_meta": {
"": {
"err": [
[
"invalid_data",
{
"reason": "timestamp is too stale: 1969-12-31 23:59:58 UTC"
}
]
]
}
}
}
"###);
}

#[test]
fn test_reject_long_running_span() {
let json = r#"{
"span_id": "52df9022835246eeb317dbd739ccd050",
"start_timestamp": -1,
"timestamp": 1
}"#;
let mut span = Annotated::<Span>::from_json(json).unwrap();
assert!(process_value(&mut span, &mut TimestampProcessor, ProcessingState::root()).is_ok());
assert_annotated_snapshot!(&span, @r###"
{
"_meta": {
"": {
"err": [
[
"invalid_data",
{
"reason": "timestamp is too stale: 1969-12-31 23:59:59 UTC"
}
]
]
}
}
}
"###);
}
}