Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

metrics: copy tags list before serializing #143

Merged
merged 5 commits into from
Jan 25, 2019
Merged

metrics: copy tags list before serializing #143

merged 5 commits into from
Jan 25, 2019

Conversation

bmoylan
Copy link
Contributor

@bmoylan bmoylan commented Jan 25, 2019

Fixes a bug introduced in #130 where the tags slice can be mutated by sort.Sort, affecting caller code unexpectedly.

This change is Reviewable

Copy link
Contributor

@ashrayjain ashrayjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Maybe add the benchmark output to the PR just for posterity?

@bmoylan
Copy link
Contributor Author

bmoylan commented Jan 25, 2019

goos: darwin
goarch: amd64
pkg: github.com/palantir/pkg/metrics
BenchmarkRegisterMetric/1_tag-8         5000000      374 ns/op    256 B/op   5 allocs/op
BenchmarkRegisterMetric/10_tag-8        2000000      832 ns/op    848 B/op   5 allocs/op
BenchmarkRegisterMetric/100_tag-8        100000    13932 ns/op   7568 B/op   5 allocs/op
PASS
ok      github.com/palantir/pkg/metrics 6.466s

The new allocation shows up as expected

Copy link
Contributor

@nmiyake nmiyake left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feedback on how to add the functionality

Reviewed 2 of 2 files at r1.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @bmoylan)


metrics/registry.go, line 286 at r1 (raw file):

			tags = make(Tags, len(metricWithTags.tags))
			copy(tags, metricWithTags.tags)
			sort.Sort(tags)

Not directly related to this change, but it looks like this library has been updated to guarantee that tags are sorted, in which case the Sort call here seems unnecessary.


metrics/registry.go, line 344 at r1 (raw file):

func (r *rootRegistry) registerMetric(name string, tags Tags) string {
	metricID := toMetricTagsID(name, tags)

From what I can tell in inspecting the code, all of the call sites filter here. In that case, seems like it would make more sense to do the copy here rather than defining a new function and calling on all of the inputs:

// copy tag slice so that sort does not mutate it
tags = append(tags[:0:0], tags...)

(taken from 3rd example of https://github.com/golang/go/wiki/SliceTricks#copy -- see https://play.golang.org/p/wfnCRHRcgfp for verification that this construction works)

@bmoylan
Copy link
Contributor Author

bmoylan commented Jan 25, 2019


metrics/registry.go, line 344 at r1 (raw file):

Previously, nmiyake (Nick Miyake) wrote…

From what I can tell in inspecting the code, all of the call sites filter here. In that case, seems like it would make more sense to do the copy here rather than defining a new function and calling on all of the inputs:

// copy tag slice so that sort does not mutate it
tags = append(tags[:0:0], tags...)

(taken from 3rd example of https://github.com/golang/go/wiki/SliceTricks#copy -- see https://play.golang.org/p/wfnCRHRcgfp for verification that this construction works)

Haha I had it down here originally (97c14c0) and @ashrayjain asked me to move it up so it would be more obvious to readers of the exported functions that the copy is happening for them.

Moved to the append style below but we should resolve this discussion of placement somehow (I'm relatively indifferent)

@bmoylan
Copy link
Contributor Author

bmoylan commented Jan 25, 2019


metrics/registry.go, line 286 at r1 (raw file):

Previously, nmiyake (Nick Miyake) wrote…

Not directly related to this change, but it looks like this library has been updated to guarantee that tags are sorted, in which case the Sort call here seems unnecessary.

Removed

@ashrayjain
Copy link
Contributor

ashrayjain commented Jan 25, 2019

@nmiyake just pointing out from https://github.com/go101/go101/wiki/How-to-efficiently-clone-a-slice

If the slice element type is string or non-basic types, then the efficiency differences between the two ways are small. For some cases, the append way might be even slower.)

@nmiyake
Copy link
Contributor

nmiyake commented Jan 25, 2019


metrics/registry.go, line 344 at r1 (raw file):

Previously, bmoylan (Brad Moylan) wrote…

Haha I had it down here originally (97c14c0) and @ashrayjain asked me to move it up so it would be more obvious to readers of the exported functions that the copy is happening for them.

Moved to the append style below but we should resolve this discussion of placement somehow (I'm relatively indifferent)

I disagree -- feel pretty strongly that it should be here in registerMetric. That's where the common logic flows, so makes much more sense to centralize here.

For example, right now Histogram will copy the tags twice (because it delegates to HistogramWithSample, which also copies).

If we want clients to know that the slice will not be mutated, we can do that via function documentation. I think that code reading at that level is the wrong thing to optimize on (it's a bit more work, but someone reading the code can see that the only mutation happens after a copy).

@ashrayjain
Copy link
Contributor

@bmoylan @nmiyake i'm fine with keeping the copy in registerMetric.

@nmiyake
Copy link
Contributor

nmiyake commented Jan 25, 2019

I was recommending the slice copy notation mainly to allow it to be a one-liner (wasn't really concerned on the efficiency front), but ultimately I'm fine with anything. Main point of feedback is that I think it should be inline in the one location it's used rather than creating a new function for it.

Copy link
Contributor Author

@bmoylan bmoylan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 of 2 files reviewed, 2 unresolved discussions (waiting on @nmiyake)


metrics/registry.go, line 344 at r1 (raw file):

Previously, nmiyake (Nick Miyake) wrote…

I disagree -- feel pretty strongly that it should be here in registerMetric. That's where the common logic flows, so makes much more sense to centralize here.

For example, right now Histogram will copy the tags twice (because it delegates to HistogramWithSample, which also copies).

If we want clients to know that the slice will not be mutated, we can do that via function documentation. I think that code reading at that level is the wrong thing to optimize on (it's a bit more work, but someone reading the code can see that the only mutation happens after a copy).

Updated

Copy link
Contributor

@nmiyake nmiyake left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Provided feedback, but I think the mutability/API thing could go either way, and it is somewhat nit-picky/bike-shedding, so I'm going to go ahead and approve the PR as-is -- you can either merge with the approval or do another round of PR.

May open up a follow-up later to float the suggested refactor, but since this is for unexported functions and won't be meaningfully different in either approach it's not a blocker.

Reviewable status: 1 of 2 files reviewed, 3 unresolved discussions (waiting on @bmoylan and @nmiyake)


metrics/registry.go, line 303 at r3 (raw file):

func (r *rootRegistry) Unregister(name string, tags ...Tag) {
	// We copy the tag slice so that subsequent in-place mutations (sorting) do not affect the input slice.
	tagsCopy := append(tags[:0:0], tags...) // https://github.com/go101/go101/wiki/How-to-efficiently-clone-a-slice%3F

Ah OK, I see that we're doing this in two places since we do both Register and Unregister. Don't like how the copy logic ends up getting duplicated with comment.


metrics/registry.go, line 346 at r3 (raw file):

func (r *rootRegistry) registerMetric(name string, tags Tags) string {
	// We copy the tag slice so that subsequent in-place mutations (sorting) do not affect the input slice.
	tagsCopy := append(tags[:0:0], tags...) // https://github.com/go101/go101/wiki/How-to-efficiently-clone-a-slice%3F

It looks like the reason it's hard to push down the copy + sort into toMetricTagsID is because we want the sorted copy in the metricWithTags.

In this function, tagsCopy seems like a more ambiguous name since, after toMetricTagsID, tagsCopy is actually a sorted version of tags.

I think this all gets cleaner over-all if we change toMetricTagsID to not mutate the input (see comment below) and then add the following function:

func newSortedTags(tags Tags) Tags {
	tagsCopy := append(tags[:0:0], tags...)
	sort.Sort(tagsCopy)
	return tagsCopy
}

We can use this function in registerMetric and unregisterMetric.

I think this is semantically cleaner -- in a function like registerMetric we'll store the result back to sortedTags, so it's clear what's happening (and that a copy is occurring).


metrics/registry.go, line 372 at r3 (raw file):

// generate a unique output. This implementation tries to minimize memory allocation and runtime.
func toMetricTagsID(name string, tags Tags) metricTagsID {
	// TODO(maybe): Ensure tags is already sorted when it comes in so we can remove this.

Let's tackle this TODO now and remove the sort.Sort call.

This stops the function from mutating the Tags (which I think is correct -- it's not obvious that toMetricTagsID would mutate the input argument).

We can augment the function documentation with something like:

// The ID is created by adding the tags in the order they are provided (this means that, if the caller wants a specific set of tags to always
// result in the same ID, they must sort the Tags before providing them to this function).

to make this clear.

If this function was exported/widely used, then maybe ensuring that the function does the sort would be more important for a consistent API, but given that this is a private function used in 2 places I think this is fine.

Copy link
Contributor Author

@bmoylan bmoylan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 thanks for the feedback - added newSortedTags. Let me know what you think.

Reviewable status: 1 of 2 files reviewed, 3 unresolved discussions (waiting on @nmiyake)


metrics/registry.go, line 303 at r3 (raw file):

Previously, nmiyake (Nick Miyake) wrote…

Ah OK, I see that we're doing this in two places since we do both Register and Unregister. Don't like how the copy logic ends up getting duplicated with comment.

Done.


metrics/registry.go, line 346 at r3 (raw file):

Previously, nmiyake (Nick Miyake) wrote…

It looks like the reason it's hard to push down the copy + sort into toMetricTagsID is because we want the sorted copy in the metricWithTags.

In this function, tagsCopy seems like a more ambiguous name since, after toMetricTagsID, tagsCopy is actually a sorted version of tags.

I think this all gets cleaner over-all if we change toMetricTagsID to not mutate the input (see comment below) and then add the following function:

func newSortedTags(tags Tags) Tags {
	tagsCopy := append(tags[:0:0], tags...)
	sort.Sort(tagsCopy)
	return tagsCopy
}

We can use this function in registerMetric and unregisterMetric.

I think this is semantically cleaner -- in a function like registerMetric we'll store the result back to sortedTags, so it's clear what's happening (and that a copy is occurring).

Done.


metrics/registry.go, line 372 at r3 (raw file):

Previously, nmiyake (Nick Miyake) wrote…

Let's tackle this TODO now and remove the sort.Sort call.

This stops the function from mutating the Tags (which I think is correct -- it's not obvious that toMetricTagsID would mutate the input argument).

We can augment the function documentation with something like:

// The ID is created by adding the tags in the order they are provided (this means that, if the caller wants a specific set of tags to always
// result in the same ID, they must sort the Tags before providing them to this function).

to make this clear.

If this function was exported/widely used, then maybe ensuring that the function does the sort would be more important for a consistent API, but given that this is a private function used in 2 places I think this is fine.

Done.

Copy link
Contributor

@nmiyake nmiyake left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sweet, this looks much better to me -- thanks!

Reviewed 1 of 1 files at r4.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved

@bmoylan bmoylan merged commit b5019bc into master Jan 25, 2019
@bmoylan bmoylan deleted the bm/copy-tags branch January 25, 2019 23:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants