proposal: runtime/pprof: add new WithLabels* function that requires fewer allocations #33701
Comments
Change https://golang.org/cl/188499 mentions this issue: |
Just to summarize here to avoid needing to load external links, right now the API is:
This requires making a slice to pass to Labels, and that slice escapes (and is variable sized) so it must be heap allocated. The proposed interface in the CL is to add WithLabelsFromMapper(context.Context, LabelMapper), where LabelMapper is:
So you have to make a LabelMapper and then the context package calls its Len and Map methods to retrieve the labels. But if you already have the key-value pairs in your own data structure, you avoid allocating the converted slice. They still get copied into the context in some form, though, so you've cut the allocations by at most 50%, not 100%. Is there a simpler or cleaner API? Is Len really necessary? |
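To make the shape of that idea concrete, here is a rough sketch. Only the method names Len and Map come from the discussion; the signatures, the tagList type and the mergeLabels function are assumptions made for illustration and are not the code from the CL.

```go
package main

import "fmt"

// LabelMapper is a hypothetical reconstruction: the method names come from
// the thread, the exact signatures are assumed.
type LabelMapper interface {
	Len() int
	Map(f func(key, value string))
}

// tag and tagList stand in for a census framework's own tag representation.
type tag struct{ key, value string }
type tagList []tag

func (t tagList) Len() int { return len(t) }

func (t tagList) Map(f func(key, value string)) {
	for _, kv := range t {
		f(kv.key, kv.value)
	}
}

// mergeLabels imitates what a WithLabelsFromMapper-style function might do
// internally: copy parent and new labels into one map, pre-sized using Len.
func mergeLabels(parent map[string]string, m LabelMapper) map[string]string {
	merged := make(map[string]string, len(parent)+m.Len())
	for k, v := range parent {
		merged[k] = v
	}
	// New labels overwrite parent labels with the same key.
	m.Map(func(k, v string) { merged[k] = v })
	return merged
}

func main() {
	parent := map[string]string{"service": "api"}
	got := mergeLabels(parent, tagList{{"method", "GET"}, {"service", "frontend"}})
	fmt.Println(len(got), got["service"], got["method"])
}
```

The framework never builds a []string here, but the labels are still copied into the merged map, which is the remaining allocation mentioned above.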
I do not think the proposal claims anywhere to reduce the usage to zero allocations. I understand that saying there is an opportunity to save two allocations, when this saves only one of them in the worst case, can come across as misleading. That was not my intention. Note that there are usually at least 3 allocations involved:
The usual case I have seen should therefore save at least 2 of 3 allocations, since most uses do not use []string as their native representation of key-value pairs that can be passed as is to pprof.Labels.
Len allows pre-sizing the internally created map so that the initial allocation usually holds all items from the parent and child context, as a further optimisation. Profiling shows some map growth in this function, which seems to account for ~30% of the time in WithLabels.
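One way to observe the cost of the conversion path is to count allocations directly. The sketch below is an assumption-laden illustration: the tag type and toPairs stand in for a framework's representation, and the reported number depends on the Go version, so no particular value is claimed.

```go
package main

import (
	"context"
	"fmt"
	"runtime/pprof"
	"testing"
)

// tag stands in for a census framework's own tag type (hypothetical).
type tag struct{ key, value string }

// toPairs converts framework tags into the alternating key/value slice
// that pprof.Labels expects. This is the conversion step discussed above.
func toPairs(tags []tag) []string {
	pairs := make([]string, 0, 2*len(tags))
	for _, t := range tags {
		pairs = append(pairs, t.key, t.value)
	}
	return pairs
}

// sink keeps the resulting context alive so it cannot be optimized away.
var sink context.Context

// allocsPerContext reports the average number of heap allocations needed to
// create one labelled context from the framework's tags.
func allocsPerContext(tags []tag) float64 {
	ctx := context.Background()
	return testing.AllocsPerRun(100, func() {
		sink = pprof.WithLabels(ctx, pprof.Labels(toPairs(tags)...))
	})
}

func main() {
	tags := []tag{{"method", "GET"}, {"route", "/users"}}
	fmt.Println("allocations per labelled context:", allocsPerContext(tags))
}
```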
|
ping to experts. Would be nice if this could be resolved before go1.14 enters the freeze period. |
I think @matloob knows the history of this API design and the plan to optimize LabelSet and WithLabels better. My impression was that it was originally designed to support census and that it was uncommon to see a large number of new tags (labels) added at once (usually one or two additional tags added per Do call, except when a server is creating the new context based on the tags from the wire). Maybe the trend has changed now? The |
Ping @pjweinbgo @matloob for any thoughts. Needing to make a Do does make this more complex. |
I don't think that we need to have a variant of Do. WithLabels is meant to be a lower level interface than Do, and WithLabelsFromMapper looks like just a more efficient replacement that an alternative to Do should use. I think it would be valid for OpenCensus to have its own variant of Do that calls WithLabelsFromMapper instead. As long as the OpenCensus variant of Do sets and unsets labels the same way Do does, code that uses pprof's Do and code that uses OpenCensus's Do should be compatible. The way I thought about Do originally was that we'd make Do available for users who weren't using census or something similar, and other libraries would provide their own Dos that set the labels on the context and called WithLabels at the same time. It looks like this change provides another, more efficient alternative to WithLabels, so it seems like it fits in fine. |
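As an illustration of what a library's own Do could look like, here is a minimal sketch built only from existing runtime/pprof functions (WithLabels, SetGoroutineLabels, Label). The names doWithLabels and seenInside are hypothetical, and a real variant would call whatever more efficient constructor ends up being added instead of WithLabels.

```go
package main

import (
	"context"
	"fmt"
	"runtime/pprof"
)

// doWithLabels is a hypothetical library-side variant of pprof.Do. It sets
// the labels for the duration of f and restores the caller's labels after.
func doWithLabels(ctx context.Context, labels pprof.LabelSet, f func(context.Context)) {
	// Restore the goroutine labels of the original context when f returns.
	defer pprof.SetGoroutineLabels(ctx)
	ctx = pprof.WithLabels(ctx, labels)
	pprof.SetGoroutineLabels(ctx)
	f(ctx)
}

// seenInside returns the value of key as observed inside the callback.
func seenInside(key string, kv ...string) string {
	var got string
	doWithLabels(context.Background(), pprof.Labels(kv...), func(ctx context.Context) {
		got, _ = pprof.Label(ctx, key)
	})
	return got
}

func main() {
	fmt.Println(seenInside("tenant", "tenant", "acme"))
}
```

Because labels are set and unset in the same order as in pprof.Do, code using either function should observe the same labels.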
This sense of "map" - meaning apply a function to a data structure - is not one we've used much in Go to date (strings.Map is the exception). I'm also bothered by needing both Len and Map. Would it really be so bad if there was only Map? Then the argument could be a plain function instead of an interface. Right now the CL uses it as a hint to allocate a map, in which case it really doesn't matter, but if a different implementation used it to allocate a slice and then blindly filled in increasing indexes, that would be a problem. I'm also a little confused about the avoidance of LabelSet. Should we instead be looking at an alternate LabelSet constructor, like LabelSetFromFunc (or a better name)? Then the result could be passed to both Do and WithLabels. |
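A function-only constructor along those lines might look like the sketch below. Everything here is hypothetical: labelSet is a local stand-in because the fields of pprof.LabelSet are unexported, and labelSetFromFunc is just one possible spelling of the name floated above.

```go
package main

import "fmt"

type label struct{ key, value string }

// labelSet is a local stand-in for pprof.LabelSet (assumption).
type labelSet struct{ list []label }

// labelSetFromFunc is a hypothetical constructor. The caller supplies each,
// which invokes add once per key-value pair. There is no Len, so the slice
// grows via append rather than being pre-sized.
func labelSetFromFunc(each func(add func(key, value string))) labelSet {
	var s labelSet
	each(func(k, v string) {
		s.list = append(s.list, label{k, v})
	})
	return s
}

func main() {
	// A framework that stores its tags in a map can iterate it directly.
	tags := map[string]string{"method": "GET", "route": "/users"}
	s := labelSetFromFunc(func(add func(key, value string)) {
		for k, v := range tags {
			add(k, v)
		}
	})
	fmt.Println(len(s.list))
}
```

Since the result is an ordinary label set value, it could be handed to both Do and WithLabels, which is the appeal of this variant over a new WithLabels* function.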
Any comments about trying to use an alternate LabelSet constructor instead of all new API? |
I have not gotten around to profiling or testing that yet. Putting better naming aside a possible approach is:
to keep |
ping @martisch - any profiling or testing of this alternate approach? |
The approach of adding a new helper In addition I removed the mapper closure in the new approach as that caused an allocation which would regress the List case in the number of allocations but has the downside of exposing the internal map[string]string structure. As that is not ideal I would need to first investigate if we can stack allocate and keep that implementation detail hidden with a closure again if that is important. code:
Updated cl/188499 with new code. |
@martisch, thanks for confirming that we can use an alternate LabelSet constructor instead of having to change other parts of the API. It would still be good to find a way to avoid exposing the map[string]string. In the long run I expect that internal detail might change too. Unless it is OK for the LabelSet constructor to be passed a dummy map and copy those values out. |
Since we are in the freeze for Go 1.14, I have 6 months to spend some time figuring out an interface (and a potential generic compiler optimization if needed) that does not cause an additional allocation and assumes only a string type for key and value, with no other implementation details. |
The OpenTelemetry Go SDK has taken a position on this topic: The Metrics API needs a LabelSet type that is both inexpensive, as stated in this issue, but also supports an It would be great if the |
In the above library, @martisch, we take the position that a stable sort, followed by de-duplication, is cheaper than using a map in this situation. |
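The sort-then-deduplicate approach can be sketched as follows. This is an illustration and not the OpenTelemetry code; in particular, letting the last value for a duplicated key win is an assumption made here.

```go
package main

import (
	"fmt"
	"sort"
)

type label struct{ key, value string }

// sortAndDedup sorts labels by key with a stable sort, then removes
// duplicate keys in place, keeping the last value given for each key
// (last-wins is an assumption of this sketch). It uses no map.
func sortAndDedup(labels []label) []label {
	sort.SliceStable(labels, func(i, j int) bool {
		return labels[i].key < labels[j].key
	})
	out := labels[:0]
	for i, l := range labels {
		// If the next entry has the same key, it was added later
		// (the sort is stable), so skip the current one.
		if i+1 < len(labels) && labels[i+1].key == l.key {
			continue
		}
		out = append(out, l)
	}
	return out
}

func main() {
	in := []label{{"b", "1"}, {"a", "1"}, {"b", "2"}, {"a", "3"}}
	fmt.Println(sortAndDedup(in))
}
```

The stable sort is what makes the later entry for a key identifiable after sorting; with an unstable sort the surviving value would be arbitrary.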
Referring to @rsc's API proposal above (#33701 (comment)), I believe that it's slightly better to support an interface like in the OTel
This is my preference because often to pass a |
Thanks for the suggestions. I initially chose the Map approach since some census frameworks use maps for census labels. Since maps have no native operation to get the i-th element and no stable iteration order, the Get approach requires maps to extract all keys and values and buffer them. Maybe that is easy enough to do with a wrapper (but it will cause allocation(s)). If that is as good as the map approach, this might be workable and as good in performance if we teach the compiler to potentially optimize extracting all keys and values in a loop. Ideally (unless there are very good reasons) we need to keep LabelSet a struct to not break backwards compatibility. We can add a new field that is an interface. P.S. I don't see a proposal of Len and Get in the comments (but tried it in an initial prototype). |
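The buffering wrapper described here could look roughly like this. The interface name indexedLabels and the signatures of Len and Get are assumptions; only the method names appear in the thread.

```go
package main

import (
	"fmt"
	"sort"
)

// indexedLabels is a hypothetical index-based interface (signatures assumed).
type indexedLabels interface {
	Len() int
	Get(i int) (key, value string)
}

// mapLabels adapts a map to indexedLabels by buffering its keys.
type mapLabels struct {
	keys []string
	m    map[string]string
}

// Compile-time check that the wrapper satisfies the interface.
var _ indexedLabels = mapLabels{}

// newMapLabels extracts and sorts the keys once. The keys slice is the
// extra allocation that the map-based approach would not need.
func newMapLabels(m map[string]string) mapLabels {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys) // gives Get a stable order
	return mapLabels{keys: keys, m: m}
}

func (l mapLabels) Len() int { return len(l.keys) }

func (l mapLabels) Get(i int) (string, string) {
	k := l.keys[i]
	return k, l.m[k]
}

func main() {
	ls := newMapLabels(map[string]string{"route": "/users", "method": "GET"})
	for i := 0; i < ls.Len(); i++ {
		k, v := ls.Get(i)
		fmt.Println(k, v)
	}
}
```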
I was referring to If the type has to be a struct, I'd make it a |
runtime/pprof.Labels is used in conjunction with runtime/pprof.WithLabels to set pprof labels in a context for performance profiling.
go/src/runtime/pprof/label.go
Line 59 in c485506
Adding information for fine-grained, on-demand profiling of already running binaries should ideally be very efficient, so that it can always stay enabled with minimal overhead. The current API could be made more efficient by requiring fewer heap allocations. Pprof information sourced from contexts added by census frameworks is used in large Go deployments on every RPC request, so small performance gains add up to larger resource savings across many servers.
The current runtime/pprof API requires census frameworks such as OpenCensus to first convert their internal representation of key and value tag pairs (in a slice or map) to a slice of strings for input to runtime/pprof.Labels.
https://github.com/census-instrumentation/opencensus-go/blob/df6e2001952312404b06f5f6f03fcb4aec1648e5/tag/profile_19.go#L24
This requires at least one heap allocation for a variable number of labels. Then internally the Labels function constructs a LabelSet data structure, which requires another allocation (the case where this uses more than one allocation will be improved with cl/181517). All in all this makes two heap allocations per context creation with pprof labels, which can potentially be avoided.
I propose to extend runtime/pprof to have an API that takes e.g. a mapping/iteration interface, such that census frameworks can implement that interface on their internal tag representations (e.g. maps and slices with custom types) and runtime/pprof can then source the labels to be set in a new runtime/pprof.WithLabels* function without first requiring conversion between multiple internal and external data structures.
cl/188499 is a quick prototype as an example of how this could look. Other ways of designing an interface that reduces allocations are possible. Note that the LabelSet struct can't be changed to an interface itself (which seems the cleaner approach) because that would not be backwards compatible with the existing API.
/cc @aclements @randall77 @matloob