Version: influxdb-0.13.0-1 and influxdb-0.11.0-1
Machine Details (Single Node System)
root@influxdb-hybrid-860058:~# cat $INFLUXDB_HOME
cat: /root/healthmonitor/influxdb-0.13.0-1: Is a directory
root@influxdb-hybrid-860058:~# free
             total       used       free     shared    buffers     cached
Mem:      12305660    1735684   10569976        388       8456     157632
-/+ buffers/cache:    1569596   10736064
Swap:            0          0          0
root@influxdb-hybrid-860058:~# uname -a
Linux influxdb-hybrid-860058 3.13.0-24-generic #47-Ubuntu SMP Fri May 2 23:30:00 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
root@influxdb-hybrid-860058:~#
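For reference, the `-/+ buffers/cache` row in the `free` output can be recomputed from the `Mem:` row. The sketch below does that arithmetic with the values copied from the output above (all in kB); the variable names are only illustrative.

```shell
#!/bin/sh
# Values (in kB) copied from the "Mem:" row of the free output above.
used=1735684
free_kb=10569976
buffers=8456
cached=157632

# Memory held by processes, excluding buffers and page cache.
app_used=$((used - buffers - cached))
# Memory that can be handed to processes: free plus reclaimable buffers and cache.
app_free=$((free_kb + buffers + cached))

echo "used by applications: ${app_used} kB"
echo "available to applications: ${app_free} kB"
```

The two results, 1569596 kB and 10736064 kB, match the `-/+ buffers/cache` row, so roughly 10.7 million kB of the 12 GB were still available before the queries were run.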
Query Set.
$INFLUXDB_HOME/usr/bin/influx -host localhost -port 8086 -precision rfc3339 -database ep -execute 'SELECT count(distinct(eId)) as units INTO totalAlerts FROM epdetail WHERE time > now() - 220d GROUP BY time(1d) fill(none)'
$INFLUXDB_HOME/usr/bin/influx -host localhost -port 8086 -precision rfc3339 -database ep -execute 'SELECT count(distinct(eId)) as units INTO bualerts FROM epdetail WHERE time > now() - 220d GROUP BY time(1d), businessUnit fill(none)'
$INFLUXDB_HOME/usr/bin/influx -host localhost -port 8086 -precision rfc3339 -database ep -execute 'SELECT count(distinct(eId)) as units INTO siteIdAlerts FROM epdetail WHERE time > now() - 220d GROUP BY time(1d), siteId fill(none)'
$INFLUXDB_HOME/usr/bin/influx -host localhost -port 8086 -precision rfc3339 -database ep -execute 'SELECT count(distinct(eId)) as units INTO channelIdAlerts FROM epdetail WHERE time > now() - 220d GROUP BY time(1d), channelId fill(none)'
After I ingest the raw data and run the above set of commands to create a high-level view of the system broken down at various levels, the system crashes with OOM on version 0.13 and barely runs on version 0.11.
Our entire monitoring system is built on Grafana, which uses InfluxDB as its backend.
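The four commands in the query set differ only in the target measurement and the extra `GROUP BY` tag, so they can be generated from a small list. The following is a minimal sketch of that, not a fix for the OOM: the function names `build_query` and `run_rollups` and the `DRY_RUN` variable are made up for this example, and by default the script only prints the queries instead of executing them.

```shell
#!/bin/sh
# Sketch: build the four roll-up queries from "target:tag" pairs.
# build_query, run_rollups and DRY_RUN are hypothetical names for this example.

build_query() {
    target=$1   # measurement written by INTO
    tag=$2      # optional tag to group by; empty for the overall total
    group="time(1d)"
    if [ -n "$tag" ]; then
        group="$group, $tag"
    fi
    printf 'SELECT count(distinct(eId)) as units INTO %s FROM epdetail WHERE time > now() - 220d GROUP BY %s fill(none)' "$target" "$group"
}

run_rollups() {
    for pair in totalAlerts: bualerts:businessUnit siteIdAlerts:siteId channelIdAlerts:channelId; do
        target=${pair%%:*}   # text before the colon
        tag=${pair#*:}       # text after the colon (may be empty)
        query=$(build_query "$target" "$tag")
        if [ "${DRY_RUN:-1}" = "1" ]; then
            # Dry run (the default): only print the query.
            printf '%s\n' "$query"
        else
            "$INFLUXDB_HOME/usr/bin/influx" -host localhost -port 8086 \
                -precision rfc3339 -database ep -execute "$query"
        fi
    done
}

run_rollups
```

Running it with `DRY_RUN=0` set in the environment would execute the same four `influx` commands listed in the query set, one after another.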
Questions
0. OOM occurs immediately with version 0.13.0-1 but not with 0.11.0-1. With 0.11.0-1, however, used memory went up to 11GB (out of 12GB), which means it can crash at any time.
1. How do I fix this OOM issue?
2. What changed in version 0.13.0-1 that is causing the OOM?
3. Is all the data stored in memory? If so, I am limited by the size of memory. Currently I have only 12GB of RAM (cloud, where it is difficult to get big boxes), and it is only a single-node system.
Please advise.
[httpd] 2016/07/09 11:58:54 10.103.178.199 - root [09/Jul/2016:11:58:54 -0700] GET /query?db=ep&epoch=ms&q=select++%28failureCount%2FtotalCount%29%2A100+from+epsummary+where+alertId%3D%273%27+AND+time+%3E+now%28%29+-+100d HTTP/1.1 200 782 http://healthmonitor-860059.lvs01.eaz.ebayc3.com:8080/dashboard/db/experimentation-anomalies Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36 2ee3c79f-4607-11e6-81dc-000000000000 69.318753ms
[httpd] 2016/07/09 11:58:54 10.103.178.199 - root [09/Jul/2016:11:58:54 -0700] GET /query?db=ep&epoch=ms&q=select++%28failureCount%2FtotalCount%29%2A100+from+epsummary+where+alertId%3D%274%27+AND+time+%3E+now%28%29+-+100d HTTP/1.1 200 659 http://healthmonitor-860059.lvs01.eaz.ebayc3.com:8080/dashboard/db/experimentation-anomalies Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36 2ee82d39-4607-11e6-81de-000000000000 62.329095ms
[httpd] 2016/07/09 12:03:01 127.0.0.1 - - [09/Jul/2016:12:03:01 -0700] GET /ping HTTP/1.1 204 0 - InfluxDBShell/0.13.0 c23f0b0b-4607-11e6-81e2-000000000000 106.072µs
[query] 2016/07/09 12:03:01 SELECT count(distinct(eId)) AS units INTO ep."default".totalAlerts FROM ep."default".epdetail WHERE time > now() - 220d GROUP BY time(1d) fill(none)
fatal error: runtime: out of memory
runtime stack:
runtime.throw(0xce4740, 0x16)
/usr/local/go/src/runtime/panic.go:547 +0x90
runtime.sysMap(0xcad9c70000, 0x100000, 0x7fcfd1fb8c00, 0x10efab8)
/usr/local/go/src/runtime/mem_linux.go:206 +0x9b
runtime.(*mheap).sysAlloc(0x10d4de0, 0x100000, 0x0)
/usr/local/go/src/runtime/malloc.go:429 +0x191
runtime.(*mheap).grow(0x10d4de0, 0x8, 0x0)
/usr/local/go/src/runtime/mheap.go:651 +0x63
runtime.(*mheap).allocSpanLocked(0x10d4de0, 0x3, 0x7fcfaa8d14a0)
/usr/local/go/src/runtime/mheap.go:553 +0x4f6
runtime.(*mheap).alloc_m(0x10d4de0, 0x3, 0x2000000002c, 0x7fcfaa8d14a0)
/usr/local/go/src/runtime/mheap.go:437 +0x119
runtime.(*mheap).alloc.func1()
/usr/local/go/src/runtime/mheap.go:502 +0x41
runtime.systemstack(0x7fcfd1fb8d60)
/usr/local/go/src/runtime/asm_amd64.s:307 +0xab
runtime.(*mheap).alloc(0x10d4de0, 0x3, 0x1000000002c, 0x412674)
/usr/local/go/src/runtime/mheap.go:503 +0x63
runtime.(*mcentral).grow(0x10d73f0, 0x0)
/usr/local/go/src/runtime/mcentral.go:209 +0x93
runtime.(*mcentral).cacheSpan(0x10d73f0, 0xc85dbae1f0)
/usr/local/go/src/runtime/mcentral.go:89 +0x47d
runtime.(*mcache).refill(0x7fcfd50d9000, 0x2c, 0xc83664d6c0)
/usr/local/go/src/runtime/mcache.go:119 +0xcc
runtime.mallocgc.func2()
/usr/local/go/src/runtime/malloc.go:642 +0x2b
runtime.systemstack(0xc82001e000)
/usr/local/go/src/runtime/asm_amd64.s:291 +0x79
runtime.mstart()
/usr/local/go/src/runtime/proc.go:1051
There is a very long stack trace after this. If required, I can share it.