[0.9.5] OOM and timeout issue #4951

Closed
seyoonhan opened this issue Dec 2, 2015 · 9 comments

@seyoonhan

I deployed InfluxDB (0.9.5) to my production environment, but it dies frequently with out-of-memory errors.
(There are also a bunch of timeout logs, with both tsm1 and bz1.)

Input: 30,000 points/sec across 20 kinds of table (series?).
Is there any solution or tip for performance?
If I'm using InfluxDB the wrong way, please let me know.

Hardware specs and error logs are below.

hardware

  • cpu /
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 24
On-line CPU(s) list: 0-23
Thread(s) per core: 2
Core(s) per socket: 6
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 45
Stepping: 7
CPU MHz: 1999.995
BogoMIPS: 3999.44
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 15360K
NUMA node0 CPU(s): 0,2,4,6,8,10,12,14,16,18,20,22
NUMA node1 CPU(s): 1,3,5,7,9,11,13,15,17,19,21,23
  • storage/
    SSD, enough free space
  • ram/
    64GB

The error log looks like this (OOM case):

Nov 26 18:30:11 LGRNGDM7513 kernel: influxd invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
Nov 26 18:30:11 LGRNGDM7513 kernel: influxd cpuset=/ mems_allowed=0-1
Nov 26 18:30:11 LGRNGDM7513 kernel: Pid: 33863, comm: influxd Not tainted 2.6.32-504.16.2.el6.x86_64 #1
Nov 26 18:30:11 LGRNGDM7513 kernel: Call Trace:
Nov 26 18:30:11 LGRNGDM7513 kernel: [<ffffffff810d41b1>] ? cpuset_print_task_mems_allowed+0x91/0xb0
Nov 26 18:30:11 LGRNGDM7513 kernel: [<ffffffff81127410>] ? dump_header+0x90/0x1b0
Nov 26 18:30:11 LGRNGDM7513 kernel: [<ffffffff8122ee4c>] ? security_real_capable_noaudit+0x3c/0x70
Nov 26 18:30:11 LGRNGDM7513 kernel: [<ffffffff81127892>] ? oom_kill_process+0x82/0x2a0
Nov 26 18:30:11 LGRNGDM7513 kernel: [<ffffffff811277d1>] ? select_bad_process+0xe1/0x120
Nov 26 18:30:11 LGRNGDM7513 kernel: [<ffffffff81127cd0>] ? out_of_memory+0x220/0x3c0
Nov 26 18:30:11 LGRNGDM7513 kernel: [<ffffffff8113460f>] ? __alloc_pages_nodemask+0x89f/0x8d0
Nov 26 18:30:11 LGRNGDM7513 kernel: [<ffffffff8116c8ba>] ? alloc_pages_current+0xaa/0x110
Nov 26 18:30:11 LGRNGDM7513 kernel: [<ffffffff81124807>] ? __page_cache_alloc+0x87/0x90
Nov 26 18:30:11 LGRNGDM7513 kernel: [<ffffffff811241ee>] ? find_get_page+0x1e/0xa0
Nov 26 18:30:11 LGRNGDM7513 kernel: [<ffffffff811257a7>] ? filemap_fault+0x1a7/0x500
Nov 26 18:30:11 LGRNGDM7513 kernel: [<ffffffff8114ec14>] ? __do_fault+0x54/0x530
Nov 26 18:30:11 LGRNGDM7513 kernel: [<ffffffff8106d534>] ? enqueue_task_fair+0x64/0x100
Nov 26 18:30:11 LGRNGDM7513 kernel: [<ffffffff8114f1e7>] ? handle_pte_fault+0xf7/0xb00
Nov 26 18:30:11 LGRNGDM7513 kernel: [<ffffffff81064bf0>] ? wake_up_state+0x10/0x20
Nov 26 18:30:11 LGRNGDM7513 kernel: [<ffffffff810b240c>] ? wake_futex+0x3c/0x60
Nov 26 18:30:11 LGRNGDM7513 kernel: [<ffffffff8114fe89>] ? handle_mm_fault+0x299/0x3d0
Nov 26 18:30:11 LGRNGDM7513 kernel: [<ffffffff8104d096>] ? __do_page_fault+0x146/0x500
Nov 26 18:30:11 LGRNGDM7513 kernel: [<ffffffff811d7bb4>] ? ep_poll+0x314/0x350
Nov 26 18:30:11 LGRNGDM7513 kernel: [<ffffffff81529ba6>] ? schedule+0x176/0x3a0
Nov 26 18:30:11 LGRNGDM7513 kernel: [<ffffffff8100bc6e>] ? invalidate_interrupt3+0xe/0x20
Nov 26 18:30:11 LGRNGDM7513 kernel: [<ffffffff8153041e>] ? do_page_fault+0x3e/0xa0
Nov 26 18:30:11 LGRNGDM7513 kernel: [<ffffffff8152d7d5>] ? page_fault+0x25/0x30
Nov 26 18:30:11 LGRNGDM7513 kernel: Mem-Info:
Nov 26 18:30:11 LGRNGDM7513 kernel: Node 0 DMA per-cpu:
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 0: hi: 0, btch: 1 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 1: hi: 0, btch: 1 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 2: hi: 0, btch: 1 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 3: hi: 0, btch: 1 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 4: hi: 0, btch: 1 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 5: hi: 0, btch: 1 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 6: hi: 0, btch: 1 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 7: hi: 0, btch: 1 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 8: hi: 0, btch: 1 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 9: hi: 0, btch: 1 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 10: hi: 0, btch: 1 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 11: hi: 0, btch: 1 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 12: hi: 0, btch: 1 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 13: hi: 0, btch: 1 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 14: hi: 0, btch: 1 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 15: hi: 0, btch: 1 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 16: hi: 0, btch: 1 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 17: hi: 0, btch: 1 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 18: hi: 0, btch: 1 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 19: hi: 0, btch: 1 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 20: hi: 0, btch: 1 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 21: hi: 0, btch: 1 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 22: hi: 0, btch: 1 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 23: hi: 0, btch: 1 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: Node 0 DMA32 per-cpu:
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 0: hi: 186, btch: 31 usd: 30
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 1: hi: 186, btch: 31 usd: 1
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 2: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 3: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 4: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 5: hi: 186, btch: 31 usd: 1
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 6: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 7: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 8: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 9: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 10: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 11: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 12: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 13: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 14: hi: 186, btch: 31 usd: 30
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 15: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 16: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 17: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 18: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 19: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 20: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 21: hi: 186, btch: 31 usd: 1
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 22: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 23: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: Node 0 Normal per-cpu:
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 0: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 1: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 2: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 3: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 4: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 5: hi: 186, btch: 31 usd: 2
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 6: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 7: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 8: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 9: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 10: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 11: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 12: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 13: hi: 186, btch: 31 usd: 1
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 14: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 15: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 16: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 17: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 18: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 19: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 20: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 21: hi: 186, btch: 31 usd: 8
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 22: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 23: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: Node 1 Normal per-cpu:
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 0: hi: 186, btch: 31 usd: 3
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 1: hi: 186, btch: 31 usd: 13
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 2: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 3: hi: 186, btch: 31 usd: 31
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 4: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 5: hi: 186, btch: 31 usd: 31
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 6: hi: 186, btch: 31 usd: 7
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 7: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 8: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 9: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 10: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 11: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 12: hi: 186, btch: 31 usd: 2
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 13: hi: 186, btch: 31 usd: 20
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 14: hi: 186, btch: 31 usd: 30
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 15: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 16: hi: 186, btch: 31 usd: 32
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 17: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 18: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 19: hi: 186, btch: 31 usd: 18
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 20: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 21: hi: 186, btch: 31 usd: 31
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 22: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: CPU 23: hi: 186, btch: 31 usd: 0
Nov 26 18:30:11 LGRNGDM7513 kernel: active_anon:15257901 inactive_anon:1020508 isolated_anon:64
Nov 26 18:30:11 LGRNGDM7513 kernel: active_file:77 inactive_file:0 isolated_file:0
Nov 26 18:30:11 LGRNGDM7513 kernel: unevictable:0 dirty:0 writeback:0 unstable:0
Nov 26 18:30:11 LGRNGDM7513 kernel: free:41026 slab_reclaimable:3970 slab_unreclaimable:12377
Nov 26 18:30:11 LGRNGDM7513 kernel: mapped:229 shmem:0 pagetables:43864 bounce:0
Nov 26 18:30:11 LGRNGDM7513 kernel: Node 0 DMA free:15732kB min:4kB low:4kB high:4kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15336
Nov 26 18:30:11 LGRNGDM7513 kernel: lowmem_reserve[]: 0 3211 32248 32248
Nov 26 18:30:11 LGRNGDM7513 kernel: Node 0 DMA32 free:117564kB min:1616kB low:2020kB high:2424kB active_anon:2033420kB inactive_anon:508452kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(
Nov 26 18:30:11 LGRNGDM7513 kernel: lowmem_reserve[]: 0 0 29037 29037
Nov 26 18:30:11 LGRNGDM7513 kernel: Node 0 Normal free:14600kB min:14624kB low:18280kB high:21936kB active_anon:28037220kB inactive_anon:1752276kB active_file:72kB inactive_file:0kB unevictable:0kB isolated(anon):256kB i
Nov 26 18:30:11 LGRNGDM7513 kernel: lowmem_reserve[]: 0 0 0 0
Nov 26 18:30:11 LGRNGDM7513 kernel: Node 1 Normal free:16208kB min:16276kB low:20344kB high:24412kB active_anon:30960964kB inactive_anon:1821304kB active_file:236kB inactive_file:0kB unevictable:0kB isolated(anon):0kB is
Nov 26 18:30:11 LGRNGDM7513 kernel: lowmem_reserve[]: 0 0 0 0
Nov 26 18:30:11 LGRNGDM7513 kernel: Node 0 DMA: 1*4kB 0*8kB 1*16kB 1*32kB 1*64kB 0*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15732kB
Nov 26 18:30:11 LGRNGDM7513 kernel: Node 0 DMA32: 613*4kB 417*8kB 344*16kB 203*32kB 89*64kB 19*128kB 6*256kB 6*512kB 3*1024kB 7*2048kB 17*4096kB = 117564kB
Nov 26 18:30:11 LGRNGDM7513 kernel: Node 0 Normal: 2666*4kB 26*8kB 4*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB = 15032kB
Nov 26 18:30:11 LGRNGDM7513 kernel: Node 1 Normal: 3005*4kB 36*8kB 1*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB = 16420kB
Nov 26 18:30:11 LGRNGDM7513 kernel: 100188 total pagecache pages
Nov 26 18:30:11 LGRNGDM7513 kernel: 99585 pages in swap cache
Nov 26 18:30:11 LGRNGDM7513 kernel: Swap cache stats: add 631336, delete 531751, find 16326/27899
Nov 26 18:30:11 LGRNGDM7513 kernel: Free swap = 0kB
Nov 26 18:30:11 LGRNGDM7513 kernel: Total swap = 2097148kB
Nov 26 18:30:11 LGRNGDM7513 kernel: 16777215 pages RAM
Nov 26 18:30:11 LGRNGDM7513 kernel: 296970 pages reserved
Nov 26 18:30:11 LGRNGDM7513 kernel: 689 pages shared
Nov 26 18:30:11 LGRNGDM7513 kernel: 16432193 pages non-shared
Nov 26 18:30:11 LGRNGDM7513 kernel: [ pid ] uid tgid total_vm rss cpu oom_adj oom_score_adj name
Nov 26 18:30:11 LGRNGDM7513 kernel: [ 3081] 0 3081 4589 41 0 0 0 irqbalance
Nov 26 18:30:11 LGRNGDM7513 kernel: [ 3126] 0 3126 1092 1 14 0 0 mcelog
Nov 26 18:30:11 LGRNGDM7513 kernel: [ 3165] 0 3165 5600 2 10 0 0 xinetd
Nov 26 18:30:11 LGRNGDM7513 kernel: [ 3393] 0 3393 129460 125 0 0 0 dsm_sa_datamgrd
Nov 26 18:30:11 LGRNGDM7513 kernel: [ 3571] 0 3571 69809 25 0 0 0 dsm_sa_eventmgr
Nov 26 18:30:11 LGRNGDM7513 kernel: [ 3608] 0 3608 159947 52 17 0 0 dsm_om_shrsvcd
Nov 26 18:30:11 LGRNGDM7513 kernel: [ 3721] 0 3721 5473 11 2 0 0 ipmievd
Nov 26 18:30:11 LGRNGDM7513 kernel: [ 3903] 0 3903 1016 2 0 0 0 mingetty
Nov 26 18:30:11 LGRNGDM7513 kernel: [ 3905] 0 3905 1016 2 18 0 0 mingetty
Nov 26 18:30:11 LGRNGDM7513 kernel: [139409] 0 139409 2376 1 14 -17 -1000 udevd
Nov 26 18:30:11 LGRNGDM7513 kernel: [140363] 0 140363 14429 15 0 -17 -1000 sshd
Nov 26 18:30:11 LGRNGDM7513 kernel: [140391] 0 140391 49616 148 1 0 0 snmpd
Nov 26 18:30:11 LGRNGDM7513 kernel: [140761] 0 140761 45957 51 1 0 0 rsyslogd
Nov 26 18:30:11 LGRNGDM7513 kernel: [140820] 28 140820 190937 71 1 0 0 nscd
Nov 26 18:30:11 LGRNGDM7513 kernel: [140855] 0 140855 4856 1 2 0 0 atd
Nov 26 18:30:11 LGRNGDM7513 kernel: [119639] 10000 119639 3297 9 15 0 0 noms_nsight
Nov 26 18:30:11 LGRNGDM7513 kernel: [119640] 10000 119640 158020 4329 10 0 0 noms_nsight
Nov 26 18:30:11 LGRNGDM7513 kernel: [105449] 10000 105449 3297 9 15 0 0 noms_nsight
Nov 26 18:30:11 LGRNGDM7513 kernel: [105450] 10000 105450 158002 4040 12 0 0 noms_nsight
Nov 26 18:30:11 LGRNGDM7513 kernel: [77302] 0 77302 5572 30 0 0 0 box
Nov 26 18:30:11 LGRNGDM7513 kernel: [77304] 0 77304 98797 174 1 0 0 box
Nov 26 18:30:11 LGRNGDM7513 kernel: [77343] 0 77343 2204 6 0 0 0 box
Nov 26 18:30:11 LGRNGDM7513 kernel: [77344] 0 77344 26058 32 15 0 0 box
Nov 26 18:30:11 LGRNGDM7513 kernel: [78490] 0 78490 28742 19 0 0 0 crond
Nov 26 18:30:11 LGRNGDM7513 kernel: [78535] 38 78535 6787 33 12 0 0 ntpd
Nov 26 18:30:11 LGRNGDM7513 kernel: [66967] 0 66967 5572 30 0 0 0 box
Nov 26 18:30:11 LGRNGDM7513 kernel: [66969] 0 66969 98796 199 1 0 0 box
Nov 26 18:30:11 LGRNGDM7513 kernel: [33699] 0 33699 23403698 16169787 13 0 0 influxd
@seyoonhan
Author

Each series has 2 or 3 tags.

@beckettsean
Contributor

How many possible values do the tags have? RAM usage is highly dependent on the series cardinality. Is your cardinality approaching 10 million series?
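
For reference, one way to estimate cardinality in 0.9.x is to list the series and tag values per measurement and count the rows returned; a minimal sketch (using a measurement and tag key that appear in the CQs shared later in this thread):

SHOW SERIES FROM release_PlayerCoinHistory
SHOW TAG VALUES FROM release_PlayerCoinHistory WITH KEY = "_flowType"

The number of rows SHOW SERIES returns for a measurement is that measurement's series cardinality.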

I deployed InfluxDB (0.9.5) to my production environment, but it dies frequently with out-of-memory errors.

Is this a fresh install or an upgrade?

(There are also a bunch of timeout logs, with both tsm1 and bz1.)

Are you intermixing the two storage engines, or have you conducted separate tests with each?

@beckettsean
Contributor

Input: 30,000 points/sec across 20 kinds of table (series?).

You probably mean 20 measurements. 30k points/sec is not a high load at all for this machine.
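
For context, each point in line protocol targets one measurement, and every distinct tag set under that measurement becomes a separate series; a sketch with hypothetical tag and field values:

release_PlayerCoinHistory,_flowType=purchase freeBalanceVariation=10,paidBalanceVariation=0 1448524211000000000

Total series cardinality is roughly bounded by the sum, over all measurements, of the product of the number of distinct values of each tag key, and that is what drives RAM usage.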

What do the InfluxDB logs show?

@seyoonhan
Author

It's a fresh install, with about 20 measurements, and I tested tsm1 and bz1 separately.
If performance is highly dependent on series cardinality, maybe that's what I should look into, because I'm running several continuous queries against measurements that have a cardinality of 2k to 6k.

@beckettsean
Contributor

If performance is highly dependent on series cardinality

Write and query performance are not very dependent on series cardinality; only RAM usage correlates strongly with it.

I'm running several continuous queries against measurements that have a cardinality of 2k to 6k

A CQ is just as expensive to run as a standard query, and running queries does take RAM. Can you share the output of SHOW CONTINUOUS QUERIES?

@seyoonhan
Author

Thank you for your kind answer.

SHOW CONTINUOUS QUERIES doesn't return any results, but there are obviously CQs running, something like this:

CREATE CONTINUOUS QUERY "coinhist1m_4_cq" ON rangers_ts
BEGIN
  SELECT sum(freeBalanceVariation) AS free_sum, sum(paidBalanceVariation) AS paid_sum, count(flowType) AS cnt
  INTO "coinhist1m_4"
  FROM release_PlayerCoinHistory
  GROUP BY _flowType, time(1m)
END

Every CQ is much like the one above, grouping by a tag and time and counting the number of rows.
Is there any possible performance or memory issue with such a query? Please let me know.

Another question. :)
The CQ above doesn't generate the sum(freeBalanceVariation) as free_sum and sum(paidBalanceVariation) as paid_sum columns, only the count(flowType) as cnt column.
Can you advise me on what I'm missing, please?

@beckettsean
Contributor

SHOW CONTINUOUS QUERIES doesn't return any results

That is difficult to understand. If that returns nothing, there are no CQs installed on the system. Are you executing the query via the Admin UI? If so, please be aware that it is not intended to be the primary way to query the database. There are some bugs with SHOW queries in the Admin UI where the results are not displayed. Use the CLI or direct curl commands, or examine the actual HTTP response in your browser's developer tools to find the output.

Can you run curl -G 'http://localhost:8086/query' --data-urlencode 'q=show continuous queries' or run the same command from the CLI, which is the influx binary?

Every CQ is much like the one above, grouping by a tag and time and counting the number of rows. Is there any possible performance or memory issue with such a query? Please let me know.

I don't see anything in that CQ that suggests a problem, although if there are enough CQs running it might cause contention.
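
Roughly, each time such a CQ fires it runs the equivalent of a standard aggregation query over the most recent interval, something like this sketch (not the exact internal query):

SELECT sum(freeBalanceVariation) AS free_sum, sum(paidBalanceVariation) AS paid_sum, count(flowType) AS cnt
INTO "coinhist1m_4"
FROM release_PlayerCoinHistory
WHERE time > now() - 1m
GROUP BY _flowType, time(1m)

Several such CQs firing every minute is usually modest, but they do add up.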

The CQ above doesn't generate the sum(freeBalanceVariation) as free_sum and sum(paidBalanceVariation) as paid_sum columns, only the count(flowType) as cnt column. Can you advise me on what I'm missing, please?

Please open a new issue for this, and paste in the results of querying the coinhist1m_4 measurement.
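
For example, something like this, with a LIMIT so the output stays small:

SELECT * FROM coinhist1m_4 LIMIT 10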

@seyoonhan
Author

CQ list

> show continuous queries
name: rangers_ts
----------------
name            query
cq_ua1m_0       CREATE CONTINUOUS QUERY cq_ua1m_0 ON rangers_ts BEGIN SELECT count(userActivityType) INTO "rangers_ts"."ts_7d".ua1m_0 FROM "rangers_ts"."ts_7d".release_UA GROUP BY _userActivityType, time(1m) END
cq_coinhist1m       CREATE CONTINUOUS QUERY cq_coinhist1m ON rangers_ts BEGIN SELECT sum(freeBalanceVariation) AS "free_sum", sum(paidBalanceVariation) AS "paid_sum", count(flowType) AS "cnt" INTO "rangers_ts"."ts_7d".coinhist1m FROM "rangers_ts"."ts_7d".release_PlayerCoinHistory GROUP BY _flowType, time(1m) END
cq_equiphist1m      CREATE CONTINUOUS QUERY cq_equiphist1m ON rangers_ts BEGIN SELECT count(flowType) AS "cnt" INTO "rangers_ts"."ts_7d".equiphist1m FROM "rangers_ts"."ts_7d".release_PlayerUnitEquipItemHistory GROUP BY _flowType, time(1m) END
cq_hearthist1m      CREATE CONTINUOUS QUERY cq_hearthist1m ON rangers_ts BEGIN SELECT count(flowType) AS "cnt" INTO "rangers_ts"."ts_7d".hearthist1m FROM "rangers_ts"."ts_7d".release_PlayerHeartHistory GROUP BY _flowType, time(1m) END
cq_sphearthist1m    CREATE CONTINUOUS QUERY cq_sphearthist1m ON rangers_ts BEGIN SELECT count(flowType) AS "cnt" INTO "rangers_ts"."ts_7d".sphearthist1m FROM "rangers_ts"."ts_7d".release_PlayerSecondaryHeartHistory GROUP BY _flowType, time(1m) END
cq_unithist1m       CREATE CONTINUOUS QUERY cq_unithist1m ON rangers_ts BEGIN SELECT count(flowType) AS "cnt" INTO "rangers_ts"."ts_7d".unithist1m FROM "rangers_ts"."ts_7d".release_PlayerUnitHistory GROUP BY _flowType, time(1m) END
cq_gachahist1m      CREATE CONTINUOUS QUERY cq_gachahist1m ON rangers_ts BEGIN SELECT count(gachaId) AS "cnt" INTO "rangers_ts"."ts_7d".gachahist1m FROM "rangers_ts"."ts_7d".release_PlayerGachaHistory GROUP BY _gachaId, time(1m) END
cq_webshop1m        CREATE CONTINUOUS QUERY cq_webshop1m ON rangers_ts BEGIN SELECT count(code) AS "cnt" INTO "rangers_ts"."ts_7d".webshop1m FROM "rangers_ts"."ts_7d".release_WebshopRubyResult GROUP BY _code, time(1m) END
cq_iap1m        CREATE CONTINUOUS QUERY cq_iap1m ON rangers_ts BEGIN SELECT count(code) AS "cnt" INTO "rangers_ts"."ts_7d".iap1m FROM "rangers_ts"."ts_7d".release_IapRubyResult GROUP BY _code, time(1m) END

@seyoonhan
Author

I've figured out the cause of the OOM issue on my system.
It wasn't caused by the CQs, but by a careless user of the admin tool.
That user ran queries without a 'WHERE time' clause.
There seems to be no problem now. Thank you for your help.
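
For anyone who hits the same symptom: a SELECT with no time bound can pull an entire measurement into memory, while a time predicate keeps the scan small. A sketch using one of the measurements above:

Unbounded (can scan the whole measurement):
SELECT * FROM release_PlayerCoinHistory

Time-bounded:
SELECT * FROM release_PlayerCoinHistory WHERE time > now() - 1h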
