Output buffer persistence #802

kostasb · 2016-03-07T10:38:36Z

In order to avoid dropping data from the output buffer because of telegraf service restart / extended connectivity loss with consumers / any other unexpected incident, there should be an option to enable persistence of the output buffer on disk.

Enabling such a feature will introduce I/O dependencies for Telegraf, so it should be optional and most probably disabled by default. Persistence should be enabled on a per output plugin basis, depending on whether dropping data is critical or not.

Proposed config file sample:

[agent]
max_buffer_limit = 1000

[[outputs.influxdb]]
...
persist_buffer = true

[[outputs.graphite]]
...
persist_buffer = false

@sparrc thoughts?

joezhoujinjing · 2016-11-22T18:31:30Z

Is there a plan for this feature? @sparrc

sparrc · 2016-11-22T22:16:39Z

nope, sorry, it will be assigned a milestone when there is

biker73 · 2017-01-09T10:25:55Z

I use Kafka for this and then use telegraf to read / write to it. Kafka is great as a store to persist data to and making that available for others and also to set custom retention policies on topics. As kafka is 'free' under the Apache License why re-write another excellent solution already exists. Telegraf supports both input and output to Kafka and Kafka is a very versatile / scaleable product for this kind of purpose .

bfgoodrich · 2017-05-24T12:48:26Z

Kafka still doesn't solve the problem when Kafka itself becomes an issue. There should be some way to enable on-disk persistence (with some limit) so that data isn't lost in the event that an output becomes temporarily unavailable.

kt97679 · 2018-04-06T19:23:00Z

+1 for persistent buffers, this is a very useful feature.

voiprodrigo · 2018-04-13T13:36:15Z

Elastic just added this capability to Beats. elastic/beats#6581

Just noting here as maybe parts of their implementation could be useful.

voiprodrigo · 2018-09-28T02:14:36Z

Anything planned for this?

Jaeyo · 2018-09-28T02:26:48Z

+1

voiprodrigo · 2019-02-06T23:19:41Z

Maybe for 2.0? :)

danielnelson · 2019-02-06T23:27:23Z

Maybe, this is not high priority right now and requires a substantial amount of design work. One aspect that has changed is that Telegraf will now only acknowledge messages from a queue after it they have been processed (sent from all outputs or filtered), it should be possible to use a queue to transfer messages durably with Telegraf.

PWSys · 2020-01-21T18:16:27Z

Any suggestions on picking a simple single instance message queue?

markusr · 2020-01-23T17:15:16Z

@PWSys: I shortly did some tests with the following setup:
Data --> telegraf --> RabbitMQ --> telegraf --> influxdb
using the AMQP input and output plugin.

It worked but I decided not use it because it adds too much complexity.
Since all your data is stored in RabbitMQ you need to configure and operate it properly. This was quite a challenge for me since I never used RabbitMQ before. Maybe you have more experience with it.

See RabbitMQ config and persistence.

PWSys · 2020-01-27T05:32:13Z

@markusr Thanks for the info!

I was also looking at this, but instead with a single instance of Kafka. It can be deployed fairly simply as a container, but like you, I question the complexity. Ultimately, whether or not it will decrease the system resiliency.

darinfisher · 2020-07-28T17:16:49Z

Taken care of by #4938

ssoroka · 2020-07-28T19:17:50Z

Hey, I'm going to reopen this because I don't think #4938 addresses this issue. Slow outputs cause metrics to be dropped without blocking inputs. This ticket is asking for metric durability for outputs.

This request isn't unreasonable, it just hasn't been a high priority. It might be helpful to take a minute to summarize my thoughts on this, some of the concerns around how to address it, and what should be kept in mind when addressing it. I guess you all could use an update after 4 years.

Telegraf currently tries its best to balance keeping up with the metric flow and storing metrics that haven't been written to an output yet, up to the metric_buffer_limit; past that limit, we intentionally drop metrics to avoid OOM scenarios. At some point, it's reasonable to think Telegraf will never catch up and it should just do its best from that point on.

A review of the concerns at play:

backpressure: slow outputs should slow down inputs so we don't unnecessarily ingest too much and then drop it
slow consumers: slow outputs should not be allowed to block faster outputs. If they fall too far behind they should start discarding old (less relevant) metrics.
out of order: metrics should generally be delivered in order, but this becomes either impossible or undesirable once you fall too far behind (the most relevant data is the latest).
user preference: some Telegraf users have preferences for whether they want to prioritize durability or performance, and whether they're willing to sacrifice metric order to do so.
message-tracking: Telegraf currently uses a feature called "metric tracking" internally for scenarios where certain inputs (like message queues) want some sort of durability; We don't ACK the message until it's written to the output. This also doubles as a back-pressure feature, as you can limit the number of un-ACKed metrics. Not all inputs have this turned on.

It's not entirely easy to weave durability into that. There's a few potential options for what to implement:

best-scenario "durability": on shutdown Telegraf saves the output buffers to disk before quitting. This isn't real durability, but it might be what some users want.
real output durability: Telegraf writes all output-buffered messages to disk (but no durability for non-output messages). One could imagine non-trivial cost and non-trivial implementation here.
real full-telegraf-stack durability: Telegraf writes all incoming messages to disk, all transformations to disk, and only removes them after it's sure they're accepted downstream, forcing backpressure everywhere to ensure it doesn't over-consume metrics in flight.

This issue describes # 2. I don't think # 3 is generally all that useful for metric data, and I can't help thinking that # 1 will cause more problems than it solves.

mrdavidaylward · 2020-08-12T05:15:52Z

A buffer to disk much like persistent Que's on logstash would be great. I run an ISP and when a tower goes down I heavily rely on backfill on my Backbone dishes to save the day the issue is when the downtime is too long I run out of memory or lose metrics.

I think When the metric Buffer in memory is full there should be a disk metric buffer option and only after the in-memory buffer is full then it starts writing to disk overflow to disk, I think having this intelligence on writing to memory till that buffer is hit can help avoid disk-related slowdowns or thrashing of emmc in the case of my setup.

Looking back in the thread this does look like a feature people are looking for.

ssoroka · 2020-08-13T15:36:41Z

I think there's a balance that could be struck here: best-effort storing of metrics that don't fit in the buffer, maybe with some kind of modified tail to read in the records from disk. inputs.tail has backpressure built into it, so it will naturally not get ahead of itself (it will avoid consuming too much and avoid dropping metrics).

based on that, a potential solution could be:

add agent.metric_buffer_overflow_strategy and default it to drop (current behavior)
support agent.metric_buffer_overflow_strategy = "disk"
- (Future support for "backpressure" could be interesting)
- "disk" would write metrics that don't fit in the buffer to disk
- support agent.metric_buffer_overflow_path = "/some/filesystem/path"
add support to inputs.tail for renaming/deleting/moving files after Telegraf has processed them
- the only snag here is how to figure out that the file is finished being written to? Last write timestamp is older than x, where x is configurable?
- require the use of inputs.tail to process the backlog. Could maybe do this transparently based on agent config?

Will think this over and run it past the team.

russorat · 2020-08-13T17:33:16Z

Connecting this issue: #2679
When something changes for an output that requires a config reload, maybe on SIGHUP, buffers are written to disk, then immediately processed with the new config. Maybe there is a new config option for this?

in addition to path, you might need some of the other options from the file output for limiting size and/or rotation. https://github.com/influxdata/telegraf/tree/master/plugins/outputs/file#configuration

Is the behavior to store metrics in a memory queue and "flush" those metrics to disk once the limit has been hit? Then continue filling the in memory queue again? When the connection is restored, the process is reversed until all files are processed? File(s) would be processed and removed once it is confirmed a successful response from the output.

I assume there would be one file per output plugin, similar to one buffer per output we have today? and some naming convention for duplicate output configs (two influxdb outputs for example).

rightkick · 2020-10-01T08:43:52Z

I think this is a very useful feature to avoid data loss for critical data and still keep a simple and robust data pipeline. Is there any plan to include this?

I have the following scenario: a low spec hardware appliance at the edge collecting metrics in influxdb that needs to push the data to a central server. The network connection is intermittent and the hardware appliance may be restarted. It will be good to have an option at telegraf to retain not sent data after loss of communication or restart of device. The data are critical and must be retained. Also, due to low device specs (60GB disk, 4GB RAM, 2 cpu cores) is cannot run apache kafka. One will need to investigate other options (rabbitmq or other) so as to have this option and it will be nice to avoid adding more components into the mix.

rightkick · 2020-11-16T07:40:22Z

Ended up adding rabbitMQ into the mix to ensure data persistence at Telegraf buffer.

naorw · 2020-11-24T08:24:27Z

I agree it's important specially if using external store DB such as influxcloud, local DC can have connectivity issues, why loose metrics or manage external components.

v-j-f · 2021-04-14T12:07:04Z

Hi all,

Is there a maximum limit for the metric_buffer_limit value?
In what proportion the increase of this value affects the memory consumption?

We frequently see in the telegraf log the following messages:

[outputs.influxdb] Metric buffer overflow; 2045 metrics have been dropped.

Currently we have configured:

[agent]
  metric_buffer_limit = 10000

[[outputs.influxdb]]
  metric_buffer_limit = 750000

but the metrics are still being deleted.

Environment:

Debian 10 with Telegraf 1.18.0-1 and Influxdb 1.8.4-1.
We collect VMware metrics with [[inputs.vsphere]]

Thanks.

Sorry if this is not the right thread for this.

ssoroka · 2021-04-19T16:45:25Z

Is there a maximum limit for the metric_buffer_limit value?
In what proportion the increase of this value affects the memory consumption?

There is no specific maximum, but you will eventually run out of memory before the max can be hit. for smaller metrics I assume about 1k of memory multiplied by the max number of metrics (the metric_buffer_limit) for memory use. For larger metrics you may need to use a value larger than 1k (figuring out this number exactly isn't trivial). Leave room for error so you don't see "out of memory" crashes.

Note that If you always see metric drops no matter the metric_buffer_limit this might be because you have more data throughput than the output can keep up with, outputs.influxdb can try enabling gzip compression, but in general if this is true, the resolution of this issue is not going to solve your problem; you're just going to fill your disk as well and see "disk out of space" error messages.

v-j-f · 2021-04-20T06:34:01Z

Thank you @ssoroka for the reply.

leventov · 2021-04-28T18:42:26Z

For our project, we would like to have metrics persisted in an SQLite buffer which is also available for local querying as a "micro" timeseries database. Since it's probably not in sync with the vision of project maintainers, we are going to fork Telegraf to add support for this. But I would be happy to learn otherwise.

ssoroka · 2021-05-14T16:11:50Z

@leventov rather than a fork, take a look at processors.execd, outputs.execd, and execd shim for custom plugins.

WireFr33 · 2021-11-24T15:23:34Z

With reference to some of the concerns above:
backpressure - Does TCP not address this already? Do we need to duplicate TCP flow control?
slow consumers - Same as above. TCP flow control.
out of order - Can this be addressed with timestamps in the data?

Can a persistent output buffer with a user configurable fixed size be implemented together with the following?

As data is sent and acknowledged by the remote end, it is removed from the output buffer.
If the buffer fills up to the maximum size then the oldest data is replaced with the newest data (like a ring buffer).

jhychan · 2022-04-23T19:18:42Z

Chiming in here as this is a feature that would be great to have. However, to me it sounds a lot like a problem that could be solved by write-ahead logging, which is what a few comments here have already described. It's also what prometheus/loki agents (grafana-agent/promtail) have implemented.

Would it be feasible to add a WAL implementation to Telegraf as part of the outputs? It would be an optional common feature for all output plugins - but users would only enable it for output plugins where they need the benefits of a WAL. It can serve to complement the memory buffer, or perhaps entirely replace the in-memory metric buffer. Could we not treat the WAL simply as an alternative to the memory buffer, with some performance trade-offs? Memory and disk effectively serve the same function here - storing metrics in case the a given output plugin fails to process them. The only difference is that the disk buffer is slower but can be persisted and re-processed when Telegraf starts back up.

Concerns about disk utilisation can be managed much like the in-memory buffer - limit by disk usage, or by number of metrics it can hold. And like the current metric buffer, if the output plugin fails to process the metrics it simply remains in the WAL until dropped by overflow (or manually cleared if implementation allows).

andoks · 2022-08-05T12:45:39Z

One idea might be to create a simple Write-ahead-log CLI tool (e.g by using https://pkg.go.dev/github.com/tidwall/wal or a similar library), and integrate that tool using a combination of outputs.exec(d) and inputs.exec(d) plugins to persist the values, making the CLI tool handle the pruning and truncating of the WAL.

This would however require that the inputs.exec(d) plugin and subsequent plugins apply backpressure and/or acknowledge data written.

n3k232 · 2022-11-24T12:51:11Z

Totally in need of a solution for this.

TsubasaBE · 2023-01-29T13:57:17Z

+1 for this feature.
This is still the only missing feature keeping is from moving from Elastic Beats to Telegraf entirely.

yash1234singh · 2023-02-09T20:39:48Z

+1 for this feature. Lack of this is forcing me to fluent-bit

pratikdas44 · 2023-02-21T09:00:01Z

+1 having a similar request. Due to context Deadline issue, sometime metrics don't go until we restart the container. And since Kafka Plugin supports reading only from oldest or newest offset, recovery is taking time. I had asked the same question to chatgpt -

Wanted to know, if those feature were there earlier and removed now, because I tested using telegraf as expected it was showing an error. Any does anyone has come up with the same problem, and how they have solved that?

WireFr33 · 2023-02-21T13:12:55Z

Looks like ChatGPT is creating features that do not exist.

metric_buffer_full_trigger, buffer_path and "dead letter queue" not seem to exist in telegraf.
An AI bot that writes this code would be great.

vlcinsky · 2023-06-01T16:14:17Z

I have the same need and our current solution is using local instance of InfluxDB OSS with bucket replication configured to forward the data to the final InfluxDB instance.

Our chain looks like:

On localhost:

process reading out the data and sending records (in line protocol format) to mosquitto (MQTT) on localhost
mosquitto has configured a bridge to forward records to central system (to support "just now" picture)
telegraf agent using mqtt_client input plugin writing data to localhost influxdb
InfluxDB:
- storing data in given bucket, possibly having retention set to short period such as 2 days
- having configured replication on given bucket, forwarding all incoming data to the central instance

On central instance:

mosquitto (MQTT): receiving line protocol messages sent from localhosts
InfluxDB: receiving replicated metrics

I came to this issue to find out, if current telegraf would be an option to replace localhost InfluxDB instance and make the chain a bit simpler.

pratikdas44 · 2023-06-02T03:27:16Z

@vlcinsky so Telegraf is also sending metrics from localhost to mqtt installed in the central instance?

vlcinsky · 2023-06-02T19:45:19Z

@pratikdas44
No, there is no need to use telegraf for replicating mqtt messages.

We are using directly mosquitto bridge, which allows for replication of messages between two mosquitto instances.

relevant part of mosquitto.conf could look like:

connection bridge
address mqttcentralserver.acme.com:1883
remote_username john
remote_password secretpassword
topic probe/+ out 0

It belongs to the localhost/edge instance, it listens on local topic probe/+ (e.g. probe/sn1234 or probe/sn5678) and tries to send the messages to mqttcentralserver.acme.com:1883. In this case it does not change topic on the mqttcentralserver.

I allows for persistent queue in case connectivity is lost but as we use mqtt just to deliver current messages to the central server, we do not use this queuing. For safe delivery of all messages we use the replication of InfluxDB bucket from locat/edge instance to the central InfluxDB one.

For the InfluxDB replication we use InfluxDB OSS v2.7 and configure it as described here: https://docs.influxdata.com/influxdb/v2.7/write-data/replication/replicate-data/

willhope · 2023-09-27T10:01:47Z

+1 for this feature！Vmagent has this capability （Works smoothly in environments with unstable connections to remote storage. If the remote storage is unavailable, the collected metrics are buffered at -remoteWrite.tmpDataPath. The buffered metrics are sent to remote storage as soon as the connection to the remote storage is repaired. The maximum disk usage for the buffer can be limited with -remoteWrite.maxDiskUsagePerURL.）

Hipska · 2024-03-06T08:25:58Z

Hey all, please have a look at the spec doc for implementing this feature and give your feedback if needed: #14928

DStrand1 · 2024-07-31T19:33:43Z

We've implemented this feature in #15564 and landed it in the latest nightly builds, we would appreciate any feedback on it!

kostasb added the feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin label Oct 14, 2016

sparrc mentioned this issue Nov 22, 2016

Data point cache to HDD in case of Internet outage for long time #2064

Closed

joezhoujinjing mentioned this issue Nov 22, 2016

Fully transferring all measurements/datapoints from one influxdb to the remote one. #2067

Closed

sparrc mentioned this issue Jan 9, 2017

Feature to "pause" input message queue consumers while output(s) are down #2240

Closed

dirkdevriendt mentioned this issue Jun 10, 2017

[Feature Discussion] Dealing with congestion by adding internal buffer handling options #2905

Closed

danielnelson mentioned this issue Sep 12, 2017

Feature request: flush to file if write on shutdown fails #3221

Closed

danielnelson added feature request Requests for new plugin and for new features to existing plugins and removed feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin labels Apr 6, 2018

danielnelson changed the title ~~[feature request] Output buffer persistence~~ Output buffer persistence Nov 12, 2018

danielnelson mentioned this issue Jan 21, 2020

Persist Telegraf Metric Buffer to filesystem #6925

Closed

sjwang90 closed this as completed Jul 28, 2020

ssoroka reopened this Jul 28, 2020

sjwang90 added the pm/core capability Feature requests that impact and improve core Telegraf label Nov 24, 2020

jrc mentioned this issue May 24, 2021

Ensure buffer is written to Influx even when there's no network connection #4963

Closed

DStrand1 mentioned this issue Mar 4, 2024

docs(specs): Add specification for output buffer persistence strategy #14928

Merged

1 task

DStrand1 mentioned this issue Apr 24, 2024

feat(agent): output buffer persistence #15221

Closed

1 task

This was referenced May 13, 2024

influxdb_v2_listener rate limit #15340

Closed

feat(processors): Traffic shaper processor plugin to shape uneven distribution of incoming metrics #15353

Closed

This was referenced Jun 20, 2024

chore(agent): Extract buffer into interface #15545

Merged

feat(agent): Add metric disk buffer #15564

Merged

Output buffer persistence #802

Output buffer persistence #802

Comments

kostasb commented Mar 7, 2016 • edited Loading

joezhoujinjing commented Nov 22, 2016

sparrc commented Nov 22, 2016

biker73 commented Jan 9, 2017

bfgoodrich commented May 24, 2017

kt97679 commented Apr 6, 2018

voiprodrigo commented Apr 13, 2018

voiprodrigo commented Sep 28, 2018

Jaeyo commented Sep 28, 2018

voiprodrigo commented Feb 6, 2019

danielnelson commented Feb 6, 2019

PWSys commented Jan 21, 2020

markusr commented Jan 23, 2020

PWSys commented Jan 27, 2020

darinfisher commented Jul 28, 2020

ssoroka commented Jul 28, 2020

mrdavidaylward commented Aug 12, 2020 • edited Loading

ssoroka commented Aug 13, 2020

russorat commented Aug 13, 2020

rightkick commented Oct 1, 2020

rightkick commented Nov 16, 2020

naorw commented Nov 24, 2020

v-j-f commented Apr 14, 2021

ssoroka commented Apr 19, 2021 • edited Loading

v-j-f commented Apr 20, 2021

leventov commented Apr 28, 2021 • edited Loading

ssoroka commented May 14, 2021

WireFr33 commented Nov 24, 2021

jhychan commented Apr 23, 2022

andoks commented Aug 5, 2022 • edited Loading

n3k232 commented Nov 24, 2022

TsubasaBE commented Jan 29, 2023

yash1234singh commented Feb 9, 2023

pratikdas44 commented Feb 21, 2023 • edited Loading

WireFr33 commented Feb 21, 2023

vlcinsky commented Jun 1, 2023

pratikdas44 commented Jun 2, 2023

vlcinsky commented Jun 2, 2023

willhope commented Sep 27, 2023 • edited Loading

Hipska commented Mar 6, 2024

DStrand1 commented Jul 31, 2024

kostasb commented Mar 7, 2016 •

edited

Loading

mrdavidaylward commented Aug 12, 2020 •

edited

Loading

ssoroka commented Apr 19, 2021 •

edited

Loading

leventov commented Apr 28, 2021 •

edited

Loading

andoks commented Aug 5, 2022 •

edited

Loading

pratikdas44 commented Feb 21, 2023 •

edited

Loading

willhope commented Sep 27, 2023 •

edited

Loading