-
Notifications
You must be signed in to change notification settings - Fork 530
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make intrinsic dimensions configurable and disable status_message by default #1960
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. If you can re-generate docs/tempo/website/configuration/manifest.md
, that'd be great. It's due an update.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some thoughts.
@@ -1152,6 +1168,7 @@ overrides: | |||
[metrics_generator_processor_service_graphs_histogram_buckets: <list of float>] | |||
[metrics_generator_processor_service_graphs_dimensions: <list of string>] | |||
[metrics_generator_processor_span_metrics_histogram_buckets: <list of float>] | |||
[metrics_generator_processor_span_metrics_intrinsic_dimensions: <map string to bool>] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
may want to add a quick note about what strings are legal as keys
@@ -43,3 +44,7 @@ func (m *mockOverrides) MetricsGeneratorProcessorSpanMetricsHistogramBuckets(use | |||
func (m *mockOverrides) MetricsGeneratorProcessorSpanMetricsDimensions(userID string) []string { | |||
return m.spanMetricsDimensions | |||
} | |||
|
|||
func (m *mockOverrides) MetricsGeneratorProcessorSpanMetricsIntrinsicDimensions(userID string) map[string]bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extend TestProcessorConfig_copyWithOverrides
to include the intrinsic dimensions?
Also might be worth adding a case to Test_instance_updateProcessors
to make sure the diff works with the intrinsics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sound good. I added both tests.
if cfg.IntrinsicDimensions.StatusMessage { | ||
labels = append(labels, dimStatusMessage) | ||
} | ||
intrinsicDimensions := labels[:] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why copy the slice here? it's pointing to the same underlying array.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was needed in order to detect collisions with configured intrinsic dimensions. This is no longer needed as dimensions are now also treated as collisions when the colliding intrinsic dimension is absent.
@@ -108,7 +128,7 @@ func (p *Processor) aggregateMetricsForSpan(svcName string, rs *v1.Resource, spa | |||
p.spanMetricsDurationSeconds.ObserveWithExemplar(registryLabelValues, latencySeconds, tempo_util.TraceIDToHexString(span.TraceId)) | |||
} | |||
|
|||
func sanitizeLabelNameWithCollisions(name string) string { | |||
func sanitizeLabelNameWithCollisions(name string, intrinsicDimensions []string) string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sanitizing attribute names only if they conflict with configured intrinsics has some weird properties. if someone configures status_message
as a normal attribute and then later adds the matching intrinsic then the name of the normal attribute label changes. this may break dashboards or queries b/c it changes the meaning of their metric labels. similar situations can occur when disabling intrinsic dimensions.
my instinct is to sanitize against all intrinsics even if they are not configured.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -97,6 +97,7 @@ Internal types are updated to use `scope` instead of `instrumentation_library`. | |||
* [BUGFIX] New wal file separator '+' for the NTFS filesystem and backward compatibility with the old separator ':' [#1700](https://github.com/grafana/tempo/pull/1700) (@kilian-kier) | |||
* [ENHANCEMENT] metrics-generator: extract `status_message` field from spans [#1786](https://github.com/grafana/tempo/pull/1786), [#1794](https://github.com/grafana/tempo/pull/1794) (@stoewer) | |||
* [ENHANCEMENT] metrics-generator: handle collisions between user defined and default dimensions [#1794](https://github.com/grafana/tempo/pull/1794) (@stoewer) | |||
* [ENHANCEMENT] metrics-generator: make intrinsic dimensions configurable and disable `status_message` by default [#1960](https://github.com/grafana/tempo/pull/1960) (@stoewer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i believe there's a breaking change here (or perhaps one in #1786) that should be mentioned. If the user has a configured normal attribute dimension called status_message
then it's name will change to __status_message
.
See below for comments on the renaming behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. I think the actual breaking change was in #1794. I marked the change as breaking in the changelog.
MetricsGeneratorProcessorServiceGraphsDimensions []string `yaml:"metrics_generator_processor_service_graphs_dimensions" json:"metrics_generator_processor_service_graphs_dimensions"` | ||
MetricsGeneratorProcessorSpanMetricsHistogramBuckets []float64 `yaml:"metrics_generator_processor_span_metrics_histogram_buckets" json:"metrics_generator_processor_span_metrics_histogram_buckets"` | ||
MetricsGeneratorProcessorSpanMetricsDimensions []string `yaml:"metrics_generator_processor_span_metrics_dimensions" json:"metrics_generator_processor_span_metrics_dimensions"` | ||
MetricsGeneratorProcessorSpanMetricsIntrinsicDimensions map[string]bool `yaml:"metrics_generator_processor_span_metrics_intrinsic_dimensions" json:"metrics_generator_processor_span_metrics_intrinsic_dimensions"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we fail to load overrides if a value is specified here that is not an intrinsic dimension?
metrics_generator_processor_span_metrics_intrinsic_dimensions:
status_message: true
foo: false
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could not think of a good way to validate they intrinsic dimensions while loading the overrides without running into import cycles or repeating dimension definitions in the overrides package.
I solved it now by checking for valid keys when the overrides are applied.
6db8e29
to
eec42ea
Compare
Good idea. I regenerated it. |
The linters deadcode, structcheck, and varcheck are deprecated since golangci-lint v1.49.0. They are replace by unused which is already enabled in our config
Custom dimensions from span attributs are also treated as conflicts when the conflicting intrinsic dimension is not actually present. This prevents those dimensions from being renamed when the intrinsic dimension configuration changes
ba5aedd
to
6ac650a
Compare
47e95d3
to
9088882
Compare
416ffd8
to
027a59a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, nice work
What this PR does:
The metrics generator always adds the intrinsic dimensions
service
,span_name
,span_kind
,status_code
, andstatus_message
. Thestatus_message
may contain arbitrary strings and can have a very high cardinality (see #1932). Therefore it is safer not to add this dimension by default.With this PR:
status_message
is not added by default.Which issue(s) this PR fixes:
Fixes #1932
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]