Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(store): Limit custom measurements [INGEST-1621] #1483

Merged
merged 12 commits into from
Sep 22, 2022

Conversation

jjbayer
Copy link
Member

@jjbayer jjbayer commented Sep 19, 2022

If measurementsConfig is provided, limit the number of custom measurements that an event can contain. The distinction between built-in and custom measurements is made through an explicit allowlist in the project config.

Once we have a global relay config propagation mechanism, the allowlist should be moved there, but we should not hard-code it in Relay, because (light) normalization also happens in customer relays, which means that if we added a new built-in measurement, an outdated external relay would count it towards custom measurements.

To be done in follow-up PRs:

@jjbayer jjbayer marked this pull request as ready for review September 19, 2022 13:39
@jjbayer jjbayer requested a review from a team September 19, 2022 13:39
@jjbayer jjbayer assigned jjbayer and iker-barriocanal and unassigned jjbayer Sep 20, 2022
Copy link
Contributor

@iker-barriocanal iker-barriocanal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been thinking about the approach for a while, and I don't really like it, although I think it's fine for now (especially considering the scope of the work).

I don't think using measurement names is a strong approach in the state of the art. We still don't have any support for type/unit values, so users may be smart and try to add their custom measurements with other types and/or values, generating a name conflict. Another possibility is someone not knowing about the existence of a built-in metric and using its name for what they think is a custom metric in a term specific to their business.

In such cases of name conflict, relay will extract the built-in measurement with matching name, and completely ignore the measurement they want to extract. I rather an alternative approach in which we match the full MRI and not just the name. Again, I believe the scope of the approach I suggest is too big for now, but I'll still leave this written down for future reference.

relay-server/src/actors/project.rs Show resolved Hide resolved
pub struct MeasurementsConfig {
/// A list of measurements that are built-in and are not subject to custom measurement limits.
#[serde(default, skip_serializing_if = "BTreeSet::<String>::is_empty")]
known_measurements: BTreeSet<String>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same suggestion from sentry's PR: I'd rename this to something similar to builtin_measurements in order to have one name per concept (and we use "built-in" quite extensively in general).

relay-general/src/store/normalize.rs Show resolved Hide resolved
relay-general/src/store/normalize.rs Outdated Show resolved Hide resolved
relay-general/src/store/normalize.rs Show resolved Hide resolved
@@ -221,12 +236,42 @@ fn normalize_breakdowns(event: &mut Event, breakdowns_config: Option<&Breakdowns
}
}

/// Ensure measurements interface is only present for transaction events
fn normalize_measurements(event: &mut Event) {
/// Enforce the limit on custom (user defined) measurements.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit / personal opinion: the function accepts a Measurements object that may contain built-in measurements (in fact, the function does get built-in measurements, and that's the first thing checked to decide if we should keep a measurement or not), so it's not really enforcing the limit on custom measurements but on all measurements. Personally, I'd generalize this line, and mention in another paragraph that all built-in measurements are accepted and we limit the custom ones by the project config.

Suggested change
/// Enforce the limit on custom (user defined) measurements.
/// Enforce the limit on measurements.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I renamed the function and updated the doc comment to clarify what it does.

@jjbayer
Copy link
Member Author

jjbayer commented Sep 20, 2022

We still don't have any support for type/unit values, so users may be smart and try to add their custom measurements with other types and/or values, generating a name conflict.

@iker-barriocanal My initial draft contained an allowlist of (name, unit) pairs. I agree that would be more solid, so I'll see if I can quickly implement that. Note that the type of a measurement (in terms of metrics types) is always distribution.

In such cases of name conflict, relay will extract the built-in measurement with matching name, and completely ignore the measurement they want to extract.

As discussed offline, I think this is already a problem in SDKs. Because the measurements entry of the event payload is a key/value mapping, only one measurement will survive for each name.

Comment on lines +271 to +272
// If the name matches but the unit is wrong, we do not even accept it as a custom measurement,
// and just drop it instead.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me.

Comment on lines 292 to 294
} else if let Some(measurements) = event.measurements.value_mut() {
if let Some(measurements_config) = measurements_config {
remove_invalid_measurements(measurements, measurements_config);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The measurements config is an Option and may be None, and no measurements would be removed in that case. Considering that we're removing measurements to protect the infrastructure, is it acceptable to not remove measurements if relay doesn't get that part of the config?

The same problem of the scope comes again here. No strong opinions, but I was wondering if we should drop all measurements if we don't get that part of the config. What do you think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The config is None when the upstream sentry is not up-to-date. In that case, IMO Relay should behave as it used to, that is, accept all measurements.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we have an incident and upstream sentry is unavailable for some time, relays will accept all measurements. I'm not concerned about the speed to ingest these once we're back or the consistency of some transactions having more measurements than others, but about the risk of blowing up storage if the incident takes a long time and other unexpected consequences we're currently not aware of. I'd rather to have a limit always in place and improve from that point.

That said, the PR is approved and this doesn't block landing it from my point of view, but I believe protecting the infrastructure in production is more important than everything else.

jjbayer added a commit to getsentry/sentry that referenced this pull request Sep 21, 2022
In tandem with getsentry/relay#1483, provide a
new project config key that defines how many custom measurements an
event may contain.

To distinguish between custom and built-in measurements, the latter are
explicitly enumerated as part of project config. This means considerable
overhead for every project config, but I believe we need the flexibility
here (see linked Relay PR).
jjbayer added a commit that referenced this pull request Sep 22, 2022
The allowlist introduced in #1483 strictly checks the measurement unit.
For that to work, we need to normalize the event unit first. Until now,
unit normalization happened in metrics extraction. This PR moves it into
(light) normalization.
@jjbayer jjbayer merged commit 9ccd339 into master Sep 22, 2022
@jjbayer jjbayer deleted the feat/limit-custom-measurements branch September 22, 2022 11:41
jjbayer added a commit that referenced this pull request Sep 27, 2022
With custom measurements limited by the `measurements` config in event
normalization (see #1483), we do not need the special allowlist for
transaction metrics anymore.

config format to support old Relay instances. We will need a bump of
`transactionMetrics.VERSION` to finally clean up that part.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants