
Top k processor plugin #3400

Closed
wants to merge 440 commits into from

Conversation

@mirath (Contributor) commented Oct 27, 2017

The objective of this processor is to make it possible to pick the top k metrics from a set of metrics. It aims to solve #3192.

The plugin uses:

  • Selectors to pick the metrics that will be filtered
  • A group-by clause to group the metrics based on metric names and tags
  • A list of fields
  • An aggregator function to aggregate each field of the grouped metrics

Once the aggregations for each group are calculated, the plugin orders the groups by each aggregation and returns all the metrics of the groups in the top k spots (see the example configuration sketched below).
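
For reference, here is a minimal configuration sketch that puts those pieces together. The option names and defaults (period, k, fields, aggregation, group_by, group_by_metric_name) are the ones proposed in this PR's README snippets quoted further down in the review; they may still change before merge.

```toml
# Sketch of a topk processor configuration, using the option names proposed in this PR.
[[processors.topk]]
  period = 10                   # Seconds between aggregations
  k = 10                        # How many top groups to return
  fields = ["memory_rss"]       # Fields over which the top k are calculated
  aggregation = "avg"           # Aggregation function: sum, avg, min or max
  group_by = ["process_name"]   # Tags used to group metrics before aggregating
  group_by_metric_name = false  # Whether to also group by metric name
```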

Required for all PRs:

  • Signed CLA.
  • Associated README.md updated.
  • Has appropriate unit tests.

@mirath (Contributor Author) commented Oct 27, 2017

This PR is not ready yet. I'm opening it now to allow feedback from the maintainers, since the plugin is more or less stable in design and structure. Some things that are still missing: unit tests, more aggregation functions, and more extensive testing.

@mirath changed the title from "Top k" to "Top k processor plugin" on Oct 27, 2017
@danielnelson (Contributor) left a comment

I've mostly just looked at the README; don't forget we are using Go naming conventions: uglyCamelCase vs the_one_true_way.

[[processors.topk]]
period = 10 # How many seconds between aggregations. Default: 10
k = 10 # How many top metrics to return. Default: 10
metric = "mymetric" # Which metrics to consume. Supports regular expressions. No default. Mandatory
Contributor

I needed to check the code on this, but it looks like the normal measurement filtering options work for processors, although perhaps in a somewhat non-intuitive way.

If a metric is filtered out, the processor is bypassed but the metric continues on to the outputs [1]. This means that you could use these options for selecting the metrics here instead of custom params. I'm not sure if this is something that should be changed or if the documentation just needs to be improved.

[1] https://github.com/influxdata/telegraf/blob/master/internal/models/running_processor.go#L39
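
To make that concrete, here is a hedged sketch of how the standard filtering options could select the metrics for this processor instead of a custom metric parameter, assuming namepass/tagpass behave for processors as described above (filtered-out metrics bypass the processor but still reach the outputs). The plugin name and tag values are placeholders.

```toml
# Sketch: select metrics for the topk processor with Telegraf's standard filtering
# options rather than a plugin-specific selector.
[[processors.topk]]
  namepass = ["procstat"]       # Only metrics named "procstat" pass through this processor
  [processors.topk.tagpass]
    process_name = ["nginx*"]   # Further restrict to metrics whose process_name matches
```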

Contributor Author

I think the documentation could be improved. Looking around, the only documentation regarding processors that I could find was this. I think I saw the measurement filtering docs before, but I didn't connect them to the processors for some reason; that was my bad.

However, I think that the main docs at docs.influxdata.com should be updated. I just looked around and there isn't a section dedicated to processor configuration, at least not one that is easy to spot. At the very least, a link to the measurement filtering should be added IMO.

I'll simplify the code so it doesn't use these custom options, since now I know that they are just clutter.

Contributor

Here is the change I made to the docs: 777b84d. I hope this will connect the measurement filtering to the plugins more directly.

Contributor Author

I like the changes to the docs

Contributor Author

One thing that would be nice is a set of examples of the metric filtering options, grouped under the metric filtering section. There are enough examples of their use throughout CONFIGURATION.md, but since they are spread out, consider centralizing them if you are interested in making the docs extra friendly.
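
As an illustration of the kind of centralized example that section could hold (the input plugin and values here are just placeholders):

```toml
# Hypothetical consolidated example for a "metric filtering" section of CONFIGURATION.md.
[[inputs.cpu]]
  percpu = true
  fieldpass = ["usage_idle", "usage_user"]  # Keep only these fields
  [inputs.cpu.tagdrop]
    cpu = ["cpu6", "cpu7"]                  # Drop metrics coming from these CPUs
```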

fields = ["memory_rss"] # Over which fields are the top k are calculated. Default: "value"
aggregation = "avg" # What aggregation to use. Default: "avg". Options: sum, avg, min, max
group_by = ["process_name"] # Over which tags should the aggregation be done. Default: []
group_by_metric_name = false # Wheter or not to also group by metric name
Contributor

Will it work if we always group by series key? The series key is measurement,tagkey=tagvalue,tagkey2=tagvalue2 and you can use metric.HashID() to get an opaque series identifier.

Contributor Author

I don't quite follow you here. It will work if you always group by metric name and all the tags. Is that what you want to know?

From what I understand metric.HashID() returns some hash over the series name and the tags. That is useful, but for this use case, it would prevent me from grouping over only tags, or only over a subset of tags.

Contributor

Can you give an example of when grouping over a subset of tags is useful?

Contributor Author

You could have CPU metrics from several datacenters, but want to group only by process name, so as to get a picture of the most intensive processes across all datacenters.

A more concrete example: the procstat module puts the process_name in a tag, but you can optionally put the PID in a tag as well. If you put the PID in a tag, grouping across all tags will aggregate each invocation of a process separately, which might not be what you need, as it fails to aggregate across process restarts (a sketch of this setup follows below).
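
A hedged sketch of that scenario, using the option names proposed in this PR; the procstat pid_tag option is assumed here purely for illustration:

```toml
# Sketch: procstat tags metrics with process_name (and, here, the PID); topk then
# groups only by process_name, so restarts of the same process are aggregated together.
[[inputs.procstat]]
  pattern = "nginx"
  pid_tag = true                # assumed option name: put the PID in a tag

[[processors.topk]]
  k = 5
  fields = ["memory_rss"]
  aggregation = "avg"
  group_by = ["process_name"]   # group by a subset of tags, ignoring the pid tag
```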

@mirath (Contributor Author) commented Oct 30, 2017

On a more general note, I think grouping across a subset of tags gives the plugin much more expressive power, with excellent potential for addressing a lot of needs from Telegraf users.

Contributor

👍

@mirath (Contributor Author) commented Oct 30, 2017

Regarding UglyCamelCase vs the_one_true_way (though I'm really neutral on that subject), I based my style on the procstat module. It uses the_one_true_way for variables and some function names, but UglyCamelCase for interface names and interface members.

So I'm not sure if you mentioned that just as a general tip, or if you spotted some things that I should change to the_one_true_way?

@danielnelson (Contributor)

I'll work on improving the processor documentation today. On style, the code needs to be camelCase and the created metrics should be snake_case.

danielnelson and others added 24 commits on December 13, 2017 17:51. One of those commits notes: "This method is reported to not work with IAM Instance Profiles, and we do not want to make any calls that would require additional permissions."

Germán Jaber added 25 commits on January 17, 2018 18:51.
@mirath (Contributor Author) commented Jan 18, 2018

I've made a mess out of this branch; after many rebases I don't know what is going on, so I'm just going to create a new branch and open a new PR from it.
