-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
util: Send only last metric value to Graphite #31829
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 3 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/util/metric/graphite_exporter.go, line 73 at r1 (raw file):
} err = b.Push() ge.pm.clearMetrics()
What is the desired behavior here in the presence of errors? That we clear the metrics regardless of any error? That we clear the metrics regardless of error as long as that error came from b.Push
? If the intention is the latter, which is what the code currently implements, then I'd suggest defer
ing the clear call and still returning b.Push()
. Regardless, let's be deliberate about this and comment our intention.
pkg/util/metric/prometheus_exporter.go, line 92 at r1 (raw file):
return err } // Clear metrics for reuse.
We can replace this with a call to clearMetrics
after the loop, right?
pkg/util/metric/prometheus_exporter.go, line 115 at r1 (raw file):
func (pm *PrometheusExporter) clearMetrics() { for _, family := range pm.families { family.Metric = []*prometheusgo.Metric{}
We can just set it to nil
to avoid the allocation if the family never gets any metrics.
Cockroach was emitting a ridiculously large number of metrics per minute because an instance of PrometheusExporter was being reused without clearing it after every iteration. Release note: Fix bug in graphite metrics sender where cockroach was collecting and sending all data points since startup instead of only the latest data point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for reviewing @nvanbenschoten. I've updated.
RFAL
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/util/metric/graphite_exporter.go, line 73 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
What is the desired behavior here in the presence of errors? That we clear the metrics regardless of any error? That we clear the metrics regardless of error as long as that error came from
b.Push
? If the intention is the latter, which is what the code currently implements, then I'd suggestdefer
ing the clear call and still returningb.Push()
. Regardless, let's be deliberate about this and comment our intention.
Yes, we clear regardless of errors. I have made it defer
and added a lengthy comment. Please ping if comment is not clear.
pkg/util/metric/prometheus_exporter.go, line 92 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
We can replace this with a call to
clearMetrics
after the loop, right?
Done.
pkg/util/metric/prometheus_exporter.go, line 115 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
We can just set it to
nil
to avoid the allocation if the family never gets any metrics.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks Neeral! We'll backport this to 2.1.1.
bors r+
Reviewed 2 of 2 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale)
31829: util: Send only last metric value to Graphite r=nvanbenschoten a=neeral Cockroach was emitting a ridiculously large number of metrics per minute because an instance of PrometheusExporter was being reused without clearing it after every iteration. Release note: Fix bug in graphite metrics sender where cockroach was collecting and sending all data points since startup instead of only the latest data point. 31837: closedts: bump kv.closed_timestamp.target_duration default r=nvanbenschoten a=nvanbenschoten This change bumps the closed timestamp target duration to 30s. We're not using closed timestamps, except in special cases in 2.1, so it doesn't make sense for the setting to be so low if it risks pushing live transactions. Release note: None Co-authored-by: neeral <neeral@users.noreply.github.com> Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
Build succeeded |
Cockroach was emitting a ridiculously large number of metrics per minute
because an instance of PrometheusExporter was being reused without
clearing it after every iteration.
Release note:
Fix bug in graphite metrics sender where cockroach was collecting and
sending all data points since startup instead of only the latest data
point.