[0.12.0] Too much write OPS #6058
Are you batching your points? When a write comes in to Influx, the points in the write are stored in both the in-memory TSM cache (for query performance) and the on-disk WAL (for permanent storage, to be eventually snapshotted and compacted into TSM files). The WAL is grouped by retention policy and database, both of which are fixed per batch; therefore one POST to /write or one OpenTSDB write action should result in approximately one write to the filesystem, regardless of whether your batch was 1 point or 1000 points. Whether the filesystem is flushed to disk immediately or batched up later is dependent on your filesystem, operating system, and disk controllers – all things outside the control of InfluxDB.
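For illustration, here is a minimal sketch of a batched write in Go, assuming a local InfluxDB on port 8086; the database name "mydb" and measurement "cpu" are invented. One POST to /write carries the whole batch in line protocol, which is the pattern described above.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

func main() {
	// Build one line-protocol payload containing many points.
	// "mydb" and "cpu" are placeholder names for this sketch.
	var lines []string
	for i := 0; i < 1000; i++ {
		lines = append(lines, fmt.Sprintf("cpu,host=server%02d value=%d", i%10, i))
	}
	body := strings.Join(lines, "\n")

	// One POST to /write => one batch => roughly one WAL write,
	// regardless of whether the batch holds 1 or 1000 points.
	resp, err := http.Post("http://localhost:8086/write?db=mydb", "text/plain", strings.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```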
@mark-rushakoff one question about the subscriber.
@simnv something seems weird about your disk activity. Is that a network drive? HDD or SSD? I'm not aware of other reports of that kind of heavy disk activity so I'm inclined to believe something is unusual about your setup. @earthnut it doesn't appear that there's any guarantee about the order of points sent to subscribers vs. flushed to shards: https://github.com/influxdata/influxdb/blob/d024ca2/cluster/points_writer.go#L207-L230
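As a rough sketch of why ordering isn't guaranteed (illustrative Go only, not the code at the link above): the batch is handed to a subscriber channel and to shard writers concurrently, so either side may observe the points first.

```go
package main

import (
	"fmt"
	"time"
)

// Illustrative sketch: a batch is handed to a subscriber channel and to a
// shard writer concurrently, so there is no ordering guarantee between the
// two paths.
func writeBatch(points []string, subCh chan<- []string) {
	// Non-blocking hand-off to the subscriber path.
	select {
	case subCh <- points:
	default: // subscriber busy; in this sketch the points are simply skipped
	}

	// The shard write runs in its own goroutine.
	done := make(chan struct{})
	go func() {
		fmt.Println("shard stored", len(points), "points")
		close(done)
	}()
	<-done
}

func main() {
	subCh := make(chan []string, 1)
	go func() {
		for batch := range subCh {
			fmt.Println("subscriber got", len(batch), "points")
		}
	}()

	writeBatch([]string{"cpu value=1", "cpu value=2"}, subCh)

	// Give the subscriber goroutine a moment; its print may come before or
	// after the shard print above — that is the point of the sketch.
	time.Sleep(100 * time.Millisecond)
}
```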
Now, after upgrading to 0.12, I see not only high WAL write ops, but also high write ops on the filesystem that holds the data directory.
Attached strace to the influxdb process. Most IOPS are for WAL files (3k+ write IOPS, understandable) and for the meta/meta.dbtmp file (2k+ write IOPS, 4754 bytes each time). The last part is strange to me.
Dug a little deeper. Every operation leads to a flush; I made several diffs of this and moved the WAL around to check. So, in the end, data is flushed on every batch of points received. And it is flushed not only to the WAL; it also fires a metadata update and flush. Can someone explain to me what the point of that is? Why make constant writes to disk, damaging it in the process, instead of syncing at sane intervals, like once every second? Why should I have to resort to ugly workarounds like tmpfs and rsync to make those updates less frequent? I hope I just didn't configure InfluxDB right, but judging from the sources, it works like that.
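What this question is asking for, roughly, is interval-based syncing rather than a sync per batch. A toy sketch of that idea in Go (not InfluxDB code; the file name, the fake batches, and the one-second interval are arbitrary):

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// Toy illustration of syncing once per interval instead of once per batch.
// Incoming batches are written to the OS buffer immediately, but only made
// durable by a periodic fsync.
func main() {
	f, err := os.OpenFile("wal.tmp", os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0644)
	if err != nil {
		panic(err)
	}
	defer f.Close()

	batches := make(chan []byte)
	go func() {
		for i := 0; i < 5; i++ {
			batches <- []byte(fmt.Sprintf("batch %d\n", i))
			time.Sleep(200 * time.Millisecond)
		}
		close(batches)
	}()

	ticker := time.NewTicker(1 * time.Second) // sync at most once per second
	defer ticker.Stop()

	for {
		select {
		case b, ok := <-batches:
			if !ok {
				f.Sync() // final sync before exit
				return
			}
			f.Write(b) // buffered by the OS; not yet durable
		case <-ticker.C:
			f.Sync() // one fsync per interval instead of per batch
		}
	}
}
```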
Updated to 0.12.1, and the problem is gone. Storing 9k+ metrics per second, I see only 25 write operations per second to the WAL partition and occasional writes to the data partition. Thanks!
@simnv awesome, thanks for the update! |
I have a test setup using the opentsdb input to collect data from several Bosun scollectors feeding a little above 8k metrics per second. I have two continuous queries, one running every ten minutes and the other every hour, both just storing mean values into other retention policies.
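The two continuous queries might look roughly like the sketch below; the database name "telemetry", the retention policies "rp_10m" and "rp_1h", and the measurement "cpu" are invented for illustration, and the queries are created here through the /query endpoint.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

func main() {
	// Hypothetical names: database "telemetry", target retention policies
	// "rp_10m" and "rp_1h", measurement "cpu". Only the shape of the CQs
	// matters here ("store mean values into other retentions").
	cqs := []string{
		`CREATE CONTINUOUS QUERY cq_10m ON telemetry BEGIN
		   SELECT mean(value) INTO telemetry."rp_10m"."cpu" FROM "cpu" GROUP BY time(10m)
		 END`,
		`CREATE CONTINUOUS QUERY cq_1h ON telemetry BEGIN
		   SELECT mean(value) INTO telemetry."rp_1h"."cpu" FROM "cpu" GROUP BY time(1h)
		 END`,
	}

	for _, q := range cqs {
		// Submit each CREATE CONTINUOUS QUERY statement to the /query endpoint.
		resp, err := http.PostForm("http://localhost:8086/query", url.Values{"q": {q}})
		if err != nil {
			panic(err)
		}
		resp.Body.Close()
		fmt.Println("created CQ, status:", resp.Status)
	}
}
```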
My problem is that influx constantly makes write operations on disk, about 4k ops. It doesn't consume much memory, but those writes are annoying: why is an application that I use to monitor one virtual platform the most resource-consuming application on that platform?
Tried playing with the WAL limits, setting them two and then ten times higher, and tried other parameters, with no effect. The writes keep happening.
For now I just placed the WAL on tmpfs, and it works quite normally: almost no visible disk activity, and data is stored as usual. I rsync it to disk every minute so it persists across reboots; I don't know if that's good practice though.
In the attached graph, read ops are positive values and write ops are negative.
Is this normal behavior for Influx? How can I tune Influx to make those write operations occur less often?