
Top k processor plugin #3400

Closed
wants to merge 440 commits into from

Conversation

@mirath (Contributor) commented Oct 27, 2017

The objective of this processor is to make it possible to pick the top k metrics from a set of metrics. It aims to solve #3192.

The plugin uses:

  • Selectors to pick the metrics that will be filtered
  • A group-by clause to group the metrics based on metric names and tags
  • A list of fields
  • An aggregator function to aggregate each field of the grouped metrics

Once the aggregations for each group are calculated, the plugin orders the groups by each aggregation and returns all the metrics of the groups in the top k spots (see the example configuration sketched below).
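
For reference, here is a minimal configuration sketch that puts those pieces together. The option names and defaults (period, k, fields, aggregation, group_by, group_by_metric_name) are the ones proposed in this PR's README snippets quoted further down in the review; they may still change before merge.

```toml
# Sketch of a topk processor configuration, using the option names proposed in this PR.
[[processors.topk]]
  period = 10                   # Seconds between aggregations
  k = 10                        # How many top groups to return
  fields = ["memory_rss"]       # Fields over which the top k are calculated
  aggregation = "avg"           # Aggregation function: sum, avg, min or max
  group_by = ["process_name"]   # Tags used to group metrics before aggregating
  group_by_metric_name = false  # Whether to also group by metric name
```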

Required for all PRs:

  • Signed CLA.
  • Associated README.md updated.
  • Has appropriate unit tests.

@mirath (Contributor Author) commented Oct 27, 2017

This PR is not ready yet. I'm opening it now to allow feedback from the maintainers, since the plugin is more or less stable in design and structure. Some things that are still missing: unit tests, more aggregation functions, and more extensive testing.

@mirath changed the title from "Top k" to "Top k processor plugin" on Oct 27, 2017
@danielnelson (Contributor) left a comment

I've mostly just looked at the README; don't forget we are using Go naming conventions: uglyCamelCase vs the_one_true_way.

[[processors.topk]]
period = 10 # How many seconds between aggregations. Default: 10
k = 10 # How many top metrics to return. Default: 10
metric = "mymetric" # Which metrics to consume. Supports regular expressions. No default. Mandatory
Contributor

I needed to check the code on this, but it looks like the normal measurement filtering options work for processors, although perhaps in a somewhat non-intuitive way.

If a metric is filtered out, the processor is bypassed but the metric continues on to the outputs [1]. This means that you could use these options for selecting the metrics here instead of custom params. I'm not sure if this is something that should be changed or if the documentation just needs to be improved.

[1] https://github.com/influxdata/telegraf/blob/master/internal/models/running_processor.go#L39
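
To make that concrete, here is a hedged sketch of how the standard filtering options could select the metrics for this processor instead of a custom metric parameter, assuming namepass/tagpass behave for processors as described above (filtered-out metrics bypass the processor but still reach the outputs). The plugin name and tag values are placeholders.

```toml
# Sketch: select metrics for the topk processor with Telegraf's standard filtering
# options rather than a plugin-specific selector.
[[processors.topk]]
  namepass = ["procstat"]       # Only metrics named "procstat" pass through this processor
  [processors.topk.tagpass]
    process_name = ["nginx*"]   # Further restrict to metrics whose process_name matches
```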

Contributor Author

I think the documentation could be improved. Looking around, the only documentation regarding processors that I could find was this. I think I saw the measurement filtering docs before, but I didn't connect them to the processors for some reason; that was my bad.

However, I think that the main docs at docs.influxdata.com should be updated. I just looked around and there isn't a section dedicated to processor configuration, at least not one that is easy to spot. At the very least, a link to the measurement filtering should be added IMO.

I'll simplify the code so it doesn't use these custom options, since now I know that they are just clutter.

Contributor

Here is the change I made to the docs: 777b84d. I hope this will connect the measurement filtering to the plugins more directly.

Contributor Author

I like the changes to the docs

Contributor Author

One thing that would be nice is a set of examples of the metric filtering options, grouped under the metric filtering section. There are enough examples of their use throughout CONFIGURATION.md, but since they are spread out, consider centralizing them if you are interested in making the docs extra friendly.
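
As an illustration of the kind of centralized example that section could hold (the input plugin and values here are just placeholders):

```toml
# Hypothetical consolidated example for a "metric filtering" section of CONFIGURATION.md.
[[inputs.cpu]]
  percpu = true
  fieldpass = ["usage_idle", "usage_user"]  # Keep only these fields
  [inputs.cpu.tagdrop]
    cpu = ["cpu6", "cpu7"]                  # Drop metrics coming from these CPUs
```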

fields = ["memory_rss"] # Over which fields are the top k are calculated. Default: "value"
aggregation = "avg" # What aggregation to use. Default: "avg". Options: sum, avg, min, max
group_by = ["process_name"] # Over which tags should the aggregation be done. Default: []
group_by_metric_name = false # Wheter or not to also group by metric name
Contributor

Will it work if we always group by series key? The series key is measurement,tagkey=tagvalue,tagkey2=tagvalue2 and you can use metric.HashID() to get an opaque series identifier.

Contributor Author

I don't quite follow you here. It will work if you always group by metric name and all the tags. Is that what you want to know?

From what I understand metric.HashID() returns some hash over the series name and the tags. That is useful, but for this use case, it would prevent me from grouping over only tags, or only over a subset of tags.

Contributor

Can you give an example of when grouping over a subset of tags is useful?

Contributor Author

You could have CPU metrics from several datacenters, but want to group only by process name, so as to get a picture of the most intensive processes across all datacenters.

A more concrete example: the procstat module puts the process_name in a tag, but you can optionally put the PID in a tag as well. If you put the PID in a tag, grouping across all tags will aggregate each invocation of a process separately, which might not be what you need, as it fails to aggregate across process restarts (a sketch of this setup follows below).
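
A hedged sketch of that scenario, using the option names proposed in this PR; the procstat pid_tag option is assumed here purely for illustration:

```toml
# Sketch: procstat tags metrics with process_name (and, here, the PID); topk then
# groups only by process_name, so restarts of the same process are aggregated together.
[[inputs.procstat]]
  pattern = "nginx"
  pid_tag = true                # assumed option name: put the PID in a tag

[[processors.topk]]
  k = 5
  fields = ["memory_rss"]
  aggregation = "avg"
  group_by = ["process_name"]   # group by a subset of tags, ignoring the pid tag
```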

@mirath (Contributor Author) commented Oct 30, 2017

On a more general note, I think grouping across a subset of tags gives the plugin much more expressive power, with excellent potential for addressing a lot of needs from Telegraf users.

Contributor

👍

@mirath (Contributor Author) commented Oct 30, 2017

Regarding UglyCamelCase vs the_one_true_way (though I'm really neutral on that subject), I based my style on the procstat module. It uses the_one_true_way for variables and some function names, but UglyCamelCase for interface names and interface members.

So I'm not sure if you mentioned that just as a general tip, or if you spotted some things that I should change to the_one_true_way?

@danielnelson (Contributor)

I'll work on improving the processor documentation today. On style, the code needs to be camelCase and the created metrics should be snake_case.

danielnelson and others added 24 commits on December 13, 2017 17:51. One of those commits notes: "This method is reported to not work with IAM Instance Profiles, and we do not want to make any calls that would require additional permissions."

Germán Jaber added 25 commits on January 17, 2018 18:51.
@mirath (Contributor Author) commented Jan 18, 2018

I've made a mess out of this branch; after many rebases I don't know what is going on, so I'm just going to create a new branch and open a new PR from it.
