Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

util: Send only last metric value to Graphite #31829

Merged
merged 1 commit into from
Oct 24, 2018

Conversation

neeral
Copy link
Contributor

@neeral neeral commented Oct 24, 2018

Cockroach was emitting a ridiculously large number of metrics per minute
because an instance of PrometheusExporter was being reused without
clearing it after every iteration.

Release note:
Fix bug in graphite metrics sender where cockroach was collecting and
sending all data points since startup instead of only the latest data
point.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 3 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/util/metric/graphite_exporter.go, line 73 at r1 (raw file):

	}
	err = b.Push()
	ge.pm.clearMetrics()

What is the desired behavior here in the presence of errors? That we clear the metrics regardless of any error? That we clear the metrics regardless of error as long as that error came from b.Push? If the intention is the latter, which is what the code currently implements, then I'd suggest defering the clear call and still returning b.Push(). Regardless, let's be deliberate about this and comment our intention.


pkg/util/metric/prometheus_exporter.go, line 92 at r1 (raw file):

			return err
		}
		// Clear metrics for reuse.

We can replace this with a call to clearMetrics after the loop, right?


pkg/util/metric/prometheus_exporter.go, line 115 at r1 (raw file):

func (pm *PrometheusExporter) clearMetrics() {
	for _, family := range pm.families {
		family.Metric = []*prometheusgo.Metric{}

We can just set it to nil to avoid the allocation if the family never gets any metrics.

Cockroach was emitting a ridiculously large number of metrics per minute
because an instance of PrometheusExporter was being reused without
clearing it after every iteration.

Release note:
Fix bug in graphite metrics sender where cockroach was collecting and
sending all data points since startup instead of only the latest data
point.
Copy link
Contributor Author

@neeral neeral left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reviewing @nvanbenschoten. I've updated.
RFAL

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/util/metric/graphite_exporter.go, line 73 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

What is the desired behavior here in the presence of errors? That we clear the metrics regardless of any error? That we clear the metrics regardless of error as long as that error came from b.Push? If the intention is the latter, which is what the code currently implements, then I'd suggest defering the clear call and still returning b.Push(). Regardless, let's be deliberate about this and comment our intention.

Yes, we clear regardless of errors. I have made it defer and added a lengthy comment. Please ping if comment is not clear.


pkg/util/metric/prometheus_exporter.go, line 92 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

We can replace this with a call to clearMetrics after the loop, right?

Done.


pkg/util/metric/prometheus_exporter.go, line 115 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

We can just set it to nil to avoid the allocation if the family never gets any metrics.

Done.

Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: thanks Neeral! We'll backport this to 2.1.1.

bors r+

Reviewed 2 of 2 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)

craig bot pushed a commit that referenced this pull request Oct 24, 2018
31829: util: Send only last metric value to Graphite r=nvanbenschoten a=neeral

Cockroach was emitting a ridiculously large number of metrics per minute
because an instance of PrometheusExporter was being reused without
clearing it after every iteration.

Release note:
Fix bug in graphite metrics sender where cockroach was collecting and
sending all data points since startup instead of only the latest data
point.

31837: closedts: bump kv.closed_timestamp.target_duration default r=nvanbenschoten a=nvanbenschoten

This change bumps the closed timestamp target duration to 30s. We're not
using closed timestamps, except in special cases in 2.1, so it doesn't make
sense for the setting to be so low if it risks pushing live transactions.

Release note: None

Co-authored-by: neeral <neeral@users.noreply.github.com>
Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
@craig
Copy link
Contributor

craig bot commented Oct 24, 2018

Build succeeded

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants