-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Top k processor #3691
Top k processor #3691
Conversation
@danielnelson Here is the new PR. Much better XD |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall it looks good, here are a few comments and ideas I had:
plugins/processors/topk/topk.go
Outdated
return topk | ||
} | ||
|
||
func NewTopKProcessor() telegraf.Processor { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't need this if you return a *TopK
from NewTopK
, since it is a telegraf.Processor by virtue of the functions begin defined. Since you just have a single new function, you should probably call it New
, to match Go convention.
plugins/processors/topk/topk.go
Outdated
} | ||
|
||
var sampleConfig = ` | ||
[[processors.topk]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Switch this to match commenting style of other plugins with the comments above the options. Options with a default value that do not need to be set should be commented out, but make sure it works if they are unset which can be hard to detect due to "zero values", use ##
for the comment lines and wrap them at 78 chars.
example:
[[processors.topk]]
## How many seconds between aggregations. Default: 10
# period = 10
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@danielnelson I've corrected this and several other issues. I've not pushed because in fixing this issue I changed lines tied to several open discussions, and I don't want them to show as updated when they are not yet resolved. When we come to a conclusion regarding the outstanding comments I will push my latest changes.
plugins/processors/topk/topk.go
Outdated
field string | ||
} | ||
|
||
func (ags Aggregations) Len() int { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use a pointer receiver for this type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't. The sort interface asks for a direct value:
plugins/processors/topk/topk.go:165: cannot use aggs (type Aggregations) as type sort.Interface in argument to sort.Sort:
Aggregations does not implement sort.Interface (Len method has pointer receiver)
plugins/processors/topk/topk.go:167: cannot use aggs (type Aggregations) as type sort.Interface in argument to sort.Reverse:
Aggregations does not implement sort.Interface (Len method has pointer receiver)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, BTW I have been heard from my colleagues that it is now preferred to use sort.SliceStable
with a less func instead of defining ordering on the type.
plugins/processors/topk/topk.go
Outdated
aggregation = "avg" # What aggregation to use. Default: "avg". Options: sum, avg, min, max | ||
|
||
bottomk = false # Instead of the top k largest metrics, return the bottom k lowest metrics. Default: false | ||
simple_topk = false # If true, this will override any GroupBy options and assign each metric its own individual group. Default: false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like we need either group_by
to be non-empty or simple_topk
to be true, else we have nothing to group by. What if we treated an empty group_by
as enable simple_topk
and remove it as an option?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should treat an empty group_by
to aggregate over all the tags of the metric. Speaking of which, we could allow regexes in the group_by
list, so users with dynamically generated tags can make better use of the processor.
plugins/processors/topk/topk.go
Outdated
# this setting) to each metric with the value of the calculated GroupBy tag. | ||
# Useful for debugging. Default: "" | ||
|
||
position_field = "" # This settings provides a way to know the position of each metric in the top k. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be called rank
and the generated field would be something like: memory_rss_rank
.
plugins/processors/topk/topk.go
Outdated
} | ||
|
||
var sampleConfig = ` | ||
[[processors.topk]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the sampleConfig only, remove the [[processors.topk]]
line, this gets added by Telegraf when generating the default config, test with telegraf config | less
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Solved
plugins/processors/topk/topk.go
Outdated
|
||
# Metrics are grouped based on their tags and name. The plugin aggregates the selected fields of | ||
# these groups of metrics and sorts the groups based these aggregations | ||
group_by = ["process_name"] # Over which tags should the aggregation be done. Default: [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When would someone want to specify less than the full set of tags?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you have metrics with different tag sets that you want to aggregate together. For example, I want the top 10 processes by RSS size across all the instances in the cluster0.
Are you thinking that there should be a way to say "I want to aggregate across all tags" without having to specify tag by tag? That would certainly help in environments with dynamic tags
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes and aggregate across all tags should probably be the default, I would do what you mentioned with empty list matching all. Use the filter.Filter
type which support globs (not regex), which will make it behave like the measurement filtering options.
return []telegraf.Metric{} | ||
} | ||
|
||
func min(a, b int) int { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see where this function is called, remove it if you can.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's called in line 221
// Create a one dimentional list with the top K metrics of each key
for i, ag := range aggregations[0:min(t.K, len(aggregations))] {
plugins/processors/topk/topk.go
Outdated
# The plugin returns a metric if it is in a group in the top k groups ordered by any of the aggregations of the selected fields | ||
# This effectively means that more than K metrics may be returned. If you need to return only the top k metrics regardless of grouping, use the simple_topk setting | ||
fields = ["memory_rss"] # Over which fields are the top k are calculated. Default: ["value"] | ||
aggregation = "avg" # What aggregation to use. Default: "avg". Options: sum, avg, min, max |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's call avg
-> mean
to be consistent with basicstats aggregator
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Solved
plugins/processors/topk/topk.go
Outdated
# The plugin returns a metric if it is in a group in the top k groups ordered by any of the aggregations of the selected fields | ||
# This effectively means that more than K metrics may be returned. If you need to return only the top k metrics regardless of grouping, use the simple_topk setting | ||
fields = ["memory_rss"] # Over which fields are the top k are calculated. Default: ["value"] | ||
aggregation = "avg" # What aggregation to use. Default: "avg". Options: sum, avg, min, max |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we need to have support for different aggregations per field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a very good question. I thought about it and decided that someone down the line could contribute such a feature. Do you think is a valuable enough capability to delay the release and work on it before merging?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can add this in later.
@danielnelson I finished addressing the issues you brought up. I also made Let me know if you have any more comments, or if we are ready to merge (and if the plugin would be available for Telegraf 1.6) |
I'll try to take a look as soon as I can, but this will have to be a 1.7 addition, sorry. |
No worries, thx |
This keeps the format of this file the same between systemd and sysvinit.
To match existing metric style.
…ists in the test routines matters
…nHash field of the metricChange struct
…its parent, instead of logging it
…regation function
…esary computations
So I messed up the rebase again X.X I'm closing this PR and opening a new one. |
This processor objective is to provide the possibility to pick the top k metrics from a set of metrics. It aims to solve #3192. It also solves #3562, as that issue was preventing the test to run succesfully
The pluging uses:
PR #3400 closed in favor of this one, as I mangled the branch through several irresponsible rebases
Required for all PRs: