feat(metrics): Add received_at timestamp in bucket metadata #3488
relay-config/src/config.rs
Outdated
}

impl Default for SentryMetrics {
    fn default() -> Self {
        Self {
            meta_locations_expiry: 15 * 24 * 60 * 60,
            meta_locations_max: 5,
            override_received_at_metadata: false,
By default we do not want to override the metadata to avoid unexpected behaviors due to the missing config param.
relay-config/src/config.rs
Outdated
@@ -524,13 +524,17 @@ struct SentryMetrics {
    ///
    /// Defaults to 5.
    pub meta_locations_max: usize,
    /// Whether to override the [`received_at`] field in the [`BucketMetadata`] with the current
    /// receive time of the instance.
    pub override_received_at_metadata: bool,
In production, we would like to set this option to `true` only for PoP Relays, since we want the `received_at` metadata field to hold the timestamp of the outermost Relay that receives a given bucket.
We already have that handling in the processor as `keep_metadata`, for example in `handle_process_metrics`, which should already do the right thing.
@@ -56,13 +56,20 @@ where
            continue;
        };

        // For extracted metrics we assume the `received_at` timestamp is equivalent to the time
        // in which the metric is extracted.
        #[cfg(not(test))]
Instead of mocking time, we generate it differently if we are in test, in order to make the unit tests deterministic.
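A minimal sketch of the `cfg`-based approach described in this comment, assuming a hypothetical helper named `received_at_now` that returns seconds since the Unix epoch (the name and the fixed test value `0` are illustrative, not taken from the PR):

```rust
/// Hypothetical helper: returns the current receive time in seconds since the
/// Unix epoch when compiled normally.
#[cfg(not(test))]
pub fn received_at_now() -> u64 {
    std::time::SystemTime::now()
        .duration_since(std::time::UNIX_EPOCH)
        .map(|elapsed| elapsed.as_secs())
        // A clock set before the epoch is treated as 0 in this sketch.
        .unwrap_or(0)
}

/// Under `cargo test`, the same helper returns a fixed value (an assumption of
/// this sketch), so assertions on bucket metadata stay deterministic.
#[cfg(test)]
pub fn received_at_now() -> u64 {
    0
}
```

The trade-off is that the production code path for reading the clock is not exercised by unit tests, which is one reason a follow-up integration test is useful.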
            bucket.metadata = BucketMetadata::default();
        }

        if override_received_at_metadata {
We purposefully override the metadata after the decision to keep it has been made.
This piece of code could be optimized to avoid overwriting a field, but I felt it is more comprehensible like this.
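The ordering discussed in these two comments can be sketched roughly as follows. The types are simplified stand-ins rather than Relay's actual definitions, the function name `apply_metadata_policy` is hypothetical, and the `!keep_metadata` condition is an assumption, since the diff excerpt does not show the condition guarding the reset:

```rust
/// Simplified stand-in for Relay's `BucketMetadata` (not the real definition).
#[derive(Clone, Debug, Default, PartialEq)]
pub struct BucketMetadata {
    /// Illustrative field representing any other metadata carried by a bucket.
    pub merges: u32,
    /// Receive time in seconds since the Unix epoch, if known.
    pub received_at: Option<u64>,
}

/// Simplified stand-in for a metrics bucket.
pub struct Bucket {
    pub metadata: BucketMetadata,
}

/// Hypothetical helper showing the two steps in order.
pub fn apply_metadata_policy(
    bucket: &mut Bucket,
    keep_metadata: bool,
    override_received_at_metadata: bool,
    now_secs: u64,
) {
    // Step 1 (assumed condition): drop incoming metadata unless it should be kept.
    if !keep_metadata {
        bucket.metadata = BucketMetadata::default();
    }

    // Step 2: override `received_at` after the keep/reset decision, so the
    // override takes effect in both cases.
    if override_received_at_metadata {
        bucket.metadata.received_at = Some(now_secs);
    }
}
```

When both steps fire, `received_at` is written twice (once by the reset, once by the override), which is the redundant overwrite the comment mentions.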
    }

    #[tokio::test]
    async fn test_process_batched_metrics_bucket_metadata() {
Will add integration tests in a follow-up PR, where we are also going to write the data to Kafka.
relay-metrics/src/bucket.rs
Outdated
        let mut buckets = serde_json::from_str::<Vec<Bucket>>(json).unwrap();
        buckets[0].metadata = BucketMetadata::new(UnixTimestamp::from_secs(1615889440));
You can just put this into the JSON, right?
tests/integration/test_metrics.py
Outdated
        span_time = 9.910107
    else:
        span_time = 9.910106
    span_time = 9.910106
Tests seem to succeed with the actual value `9.910106`.
tests/integration/test_metrics.py
Outdated
        metrics[metric["name"]] = metric
        metrics["headers"][metric["name"]] = metric_headers

    metrics_consumer.assert_empty()
    return metrics


def metrics_without_keys(received_metrics, keys):
Decided to opt for a generic solution that can remove any key.
@@ -155,6 +155,34 @@ pub fn create_test_processor(config: Config) -> EnvelopeProcessorService {
    )
}

pub fn create_test_processor_with_addrs(
`create_test_processor` should forward to this.
This PR adds a new `received_at` field to the `BucketMetadata` struct. The goal of this field is to eventually propagate it to Kafka to measure the end-to-end latency of our metrics pipeline. For this reason, the `received_at` measurement must happen in the outermost internal Relay.

The field `received_at` will be set/overridden by the outermost Relay via the option `keep_metadata`, which, when set to `false`, tells Relay to override the incoming metadata.

The merge operation on two `received_at` timestamps is defined as `min(r1, r2)` if both timestamps are non-null; otherwise, the first non-null value is taken. If both values are null, the merge result is still null.

Closes: https://github.com/getsentry/team-ingest/issues/265
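For illustration, the merge rule for `received_at` described above could be sketched as below. The `UnixTimestamp` type here is a simplified stand-in for Relay's type of the same name, and `merge_received_at` is a hypothetical function name, not necessarily the one used in the PR:

```rust
/// Simplified stand-in for Relay's `UnixTimestamp`: seconds since the epoch.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct UnixTimestamp(u64);

impl UnixTimestamp {
    pub fn from_secs(secs: u64) -> Self {
        Self(secs)
    }
}

/// Hypothetical helper implementing the merge rule for two optional
/// `received_at` values.
pub fn merge_received_at(
    a: Option<UnixTimestamp>,
    b: Option<UnixTimestamp>,
) -> Option<UnixTimestamp> {
    match (a, b) {
        // Both present: keep the earlier timestamp, i.e. min(r1, r2).
        (Some(r1), Some(r2)) => Some(r1.min(r2)),
        // Exactly one present: take the non-null value.
        (Some(r), None) | (None, Some(r)) => Some(r),
        // Neither present: the result stays null.
        (None, None) => None,
    }
}
```

Taking the minimum keeps the earliest receive time when buckets are merged, so a latency measured from it covers the oldest data point in the merged bucket.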