Telegraf should do some simple metric aggregation/rollup #380
Comments
This is an interesting idea, do you have any ideas for how these aggregation functions could be configured? It would probably need to be a separate
This would then need to be processed after collection. It'll be a little tricky I think because these measurements will need to be gathered, but then dropped before they get flushed (but flushed as part of the aggregate). Another option could be putting the aggregate config as part of each plugin config, maybe something like this:
|
BTW, @ekini which plugin is generating that many metrics? |
The one that parses logs :) I've been thinking about it a bit, and I'm still not sure how to configure aggregation. But there should be some grouping, by time and tags. |
Seems like aggregations could be their own special type of plugin. They could live in their own directory and have an interface to make it easy for contributors. Mechanically, I'm thinking they would need to be run by the Doing it this way would support the former of the two config options I listed above. |
Actually, we can aggregate stats as they arrive here: https://github.com/influxdb/telegraf/blob/master/agent.go#L397-L399 That way we don't need to deal with dropping metrics that shouldn't be flushed on their own; we can just add the aggregated stats directly to the slice of points. |
I need to sum bytes + duration to aggregate netflow stats. Looking at your statsd plugin it doesn't appear to perform a sum. Can this be added similar to etsy/statsd? |
@erowan please open a separate feature request for the statsd input if you have one. Although I'm not 100% sure I understand what you mean. The statsd protocol sums only if you are sending counters, doesn't it? Or are you talking about performing a sum on histogram/timer metrics? Can you link to some documentation on that if it exists in the etsy implementation? |
Hello @sparrc, it's documented here: https://github.com/etsy/statsd/blob/master/docs/metric_types.md But I think I am going to write (bytes*8)/duration = bps directly as a timing metric to the Telegraf statsd input for now. |
@erowan do you mean timing sums? https://github.com/etsy/statsd/blob/master/docs/metric_types.md#timing can you open a separate feature request for that? |
@sparrc yes, that was what I was referring to. I am still pondering it. I'll gladly open one later if required. |
@sparrc Can I work on aggregation? It's just a matter of moving code around, since I have a working version, but it currently lives inside one of the input plugins. |
You can open a PR but I can't guarantee I'll accept it. This is a difficult problem and many of the stats require storing large amounts of data to be completely accurate. If you can please try to use the statsd running_stats code for these as well: https://github.com/influxdata/telegraf/blob/master/plugins/inputs/statsd/running_stats.go I'd prefer that over using an outside library. Currently running_stats doesn't have a median or sum function, but that should be simple to add. |
Here is a PR which addresses the issue: #1364 |
We're also looking for a way to do sum aggregations within Telegraf before the data is sent over to Influx, as our volume can be hundreds of thousands of updates per second. |
An ideal solution for me is if the logparser plugin (#1320) supported aggregates in the way statsD works. |
@jadbox If by aggregation you mean a sum of each field, this can easily be added to the histogram aggregation filter. I don't think it is right to put aggregation inside an input, because that makes it really hard to apply to other input plugins. |
@alimousazy
and I want to aggregate in Telegraf before sending to Influx:
These are the aggregate 1s slices I need to send directly to Influx. I'm not seeing how histogram solves this. Can you explain it more? Note that I do not know the userID field values ahead of time; they are arbitrary data points. |
@jadbox Could you please tell me whether Joe and Terry are tag names or metric names? If they are tags, aggregation is done per tag, so you will get two metrics with the same metric name but different tags, each aggregated per tag. This is already supported by the current implementation and is the result you want. I will also add "_ALL" as a reserved metric name that allows aggregating all metrics regardless of name, but that doesn't matter in your case. By the way, LogParser will emit all the metrics under one metric name but, I think, with different tags, so you will get the expected result. |
@sparrc fyi, in my case I need aggregations before I send data to a DB. (400k/s writes) @alimousazy Joe/Terry are tag names. The metric name would be a single static name as the data falls into a single category. Okay, you're saying that this is supported with LogParser, but how do I tell LogParser to increment certain fields together by tag name, by 1 minute sliced batches? I don't see anything related to aggregations (either by tag or by time slice) in the docs: https://github.com/influxdata/telegraf/tree/master/plugins/inputs/logparser |
It is not supported by logparser; there is currently no support for this except using the statsd input. The solution for this will need to be generic and usable across all plugins, as well as supporting filtering of tag key/values, field names, and measurement names. |
@jadbox You don't have to add anything to the logparser config; you just need to enable the histogram filter by adding this configuration (you can enable the filter for any kind of plugin)
*Note: you can tune the aggregation interval by modifying flush_interval (I may rename flush interval to aggregation interval). If you don't need percentiles, just leave the array empty. Note that this code is not merged yet, so you have to merge it yourself and build from source. Expect changes after code review. Once you feel that the code solves your case, I will add sum. |
@alimousazy Okay, I think adding sum to histogram may work for me. I don't need the percentile so my config would look like this I assume.
Might be useful to optionally specify to just export sum (when it has been added) instead of always including variance, mean, and count along with it. This may save a good chunk of performance when dealing with high volume of data. Of course, this breaks the notion of the filter plugin being a histogram versus just an aggregator. |
@jadbox, I just added support for sum to the pull request. Don't worry about performance: I'm using a special histogram implementation designed for streaming and a low memory footprint. Please let me know if you have any feedback. I will spend tonight testing and solidifying the solution. |
I recently added an issue for InfluxDB to be able to do aggregations across many measurements. It would be good if the Telegraf method for doing this used a similar sort of structure and syntax. See influxdata/influxdb#6910 |
@pauldix I can map the syntax to something like this
However, I feel that adding the fields condition would have a big cost, since we are dealing with streaming data. Are there any other ideas for filters that sit between input and output plugins? I have the following filter that I might implement in the future if the infrastructure gets merged: 1- Rename filter for renaming tags or metrics (metric shaping). Any ideas on the syntax for these filters? (I might come up with other ideas in the future.) |
I just added support for : Example :
For more information check the pull request #1364 |
closing this in favor of #1662 |
Let's say I have 1k metrics per second, generated by one host, with the same tags, but different values.
I want to sum all values, aggregated by 1 minute.
I can send all of them to InfluxDB and do aggregation there. It works for a few hosts, but what if I have thousands of them? InfluxDB will just die.
I'm not speaking about complex functions, but some simple ones like sum(), count() and mean() would be nice to have.
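A minimal Go sketch of what such a rollup could look like, summing and counting values per series in fixed time windows (the mean is derived from the two). All names here (`windowAggregator`, `Add`, `Flush`) are hypothetical and not Telegraf's API, and the series key is assumed to be a precomputed string such as a measurement name plus its tags.

```go
package main

import "time"

// windowKey identifies one series within one time window.
// Hypothetical type for illustration, not part of Telegraf.
type windowKey struct {
	series string
	start  int64 // window start as Unix seconds
}

// bucket holds the running sum and count for one window.
type bucket struct {
	Sum   float64
	Count int64
}

// Mean returns Sum/Count, or 0 for an empty bucket.
func (b bucket) Mean() float64 {
	if b.Count == 0 {
		return 0
	}
	return b.Sum / float64(b.Count)
}

// windowAggregator rolls values up into fixed-size windows per series.
type windowAggregator struct {
	window  time.Duration
	buckets map[windowKey]bucket
}

func newWindowAggregator(window time.Duration) *windowAggregator {
	return &windowAggregator{window: window, buckets: make(map[windowKey]bucket)}
}

// Add folds value v, observed at time t, into the bucket for its window.
func (a *windowAggregator) Add(series string, t time.Time, v float64) {
	k := windowKey{series: series, start: t.Truncate(a.window).Unix()}
	b := a.buckets[k]
	b.Sum += v
	b.Count++
	a.buckets[k] = b
}

// Flush returns and removes every bucket whose window ended at or before now.
func (a *windowAggregator) Flush(now time.Time) map[windowKey]bucket {
	out := make(map[windowKey]bucket)
	for k, b := range a.buckets {
		end := time.Unix(k.start, 0).Add(a.window)
		if !end.After(now) {
			out[k] = b
			delete(a.buckets, k)
		}
	}
	return out
}
```

With a one-minute window, 1k points per second from one host would collapse into a single bucket per minute, so only one aggregated point per series and window would need to reach InfluxDB.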