This repository has been archived by the owner on Aug 23, 2023. It is now read-only.

metric-tank crash in production #88

Closed
woodsaj opened this issue Dec 23, 2015 · 5 comments

Comments

@woodsaj
Member

woodsaj commented Dec 23, 2015

2015/12/23 06:00:22 [W] failed to save chunk to cassandra after 1 attempts. <chunk T0=1450849800, LastTs=1450850351, NumPoints=10, Saved=false>, gocql: no response received from cassandra within timeout period
2015/12/23 06:00:22 [W] failed to save chunk to cassandra after 1 attempts. <chunk T0=1450849800, LastTs=1450850351, NumPoints=10, Saved=false>, gocql: no response received from cassandra within timeout period
2015/12/23 06:00:22 [W] failed to save chunk to cassandra after 1 attempts. <chunk T0=1450849800, LastTs=1450850357, NumPoints=10, Saved=false>, gocql: no response received from cassandra within timeout period
2015/12/23 06:00:22 [W] failed to save chunk to cassandra after 1 attempts. <chunk T0=1450849800, LastTs=1450850351, NumPoints=10, Saved=false>, gocql: no response received from cassandra within timeout period
2015/12/23 06:00:22 [W] failed to save chunk to cassandra after 1 attempts. <chunk T0=1450849800, LastTs=1450850357, NumPoints=10, Saved=false>, gocql: no response received from cassandra within timeout period
fatal error: runtime: out of memory
runtime stack:
runtime.throw(0xa19490, 0x16)
        /usr/local/go/src/runtime/panic.go:527 +0x90
runtime.sysMap(0xc8144bc000, 0x8000, 0xc996770000, 0xc7c438)
        /usr/local/go/src/runtime/mem_linux.go:143 +0x9b
runtime.mHeap_MapBits(0xc5cf60, 0xc996870000)
        /usr/local/go/src/runtime/mbitmap.go:144 +0xcc
runtime.mHeap_SysAlloc(0xc5cf60, 0x100000, 0x21bd)
        /usr/local/go/src/runtime/malloc.go:424 +0x186
runtime.mHeap_Grow(0xc5cf60, 0x8, 0x0)
@woodsaj
Member Author

woodsaj commented Dec 23, 2015

full log has been saved to /root/issue88/metric_tank.log.1.gz on metric-tank-2-prod

@woodsaj
Member Author

woodsaj commented Dec 23, 2015

We might need to increase the Cassandra timeout value; it is currently using the default of 600ms.

@woodsaj
Member Author

woodsaj commented Dec 23, 2015

The server still had ~20 GB of memory free, so the out-of-memory error looks like it could be due to this Go bug:

golang/go#12233

@Dieterbe
Contributor

Well, luckily for us the fix was included in the recent 1.5.2 release (https://github.com/golang/go/issues?q=milestone%3AGo1.5.2), so can we recompile with that?

@woodsaj
Member Author

woodsaj commented Jan 8, 2016

The changes introduced in PR #95 should have resolved this, as we are no longer creating hundreds of thousands of goroutines.

If this recurs, we will need to modify our circleci scripts to use Go 1.5.2 instead of the 1.5.1 that is provided in the default image.
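
For readers unfamiliar with the pattern, here is a minimal sketch of one common way to avoid spawning a goroutine per save: a fixed pool of workers draining a queue. This is an illustration only and not the actual change from PR #95; the `chunk` type, the `saveAll` function, and the `save` callback are hypothetical names made up for this example.

```go
package main

import (
	"fmt"
	"sync"
)

// chunk is a hypothetical stand-in for the data being persisted.
type chunk struct {
	T0 uint32
}

// saveAll persists chunks using a fixed number of worker goroutines
// instead of one goroutine per chunk, and returns how many saves succeeded.
// The save callback is an assumption; in practice it would wrap the
// database write.
func saveAll(chunks []chunk, workers int, save func(chunk) error) int {
	queue := make(chan chunk, workers)
	var wg sync.WaitGroup
	var mu sync.Mutex
	saved := 0

	// Start a bounded number of workers; goroutine count stays at `workers`
	// no matter how many chunks are queued.
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for c := range queue {
				if err := save(c); err == nil {
					mu.Lock()
					saved++
					mu.Unlock()
				}
			}
		}()
	}

	// The producer blocks when the queue is full, which applies
	// backpressure rather than piling up goroutines.
	for _, c := range chunks {
		queue <- c
	}
	close(queue)
	wg.Wait()
	return saved
}

func main() {
	chunks := make([]chunk, 1000)
	n := saveAll(chunks, 8, func(chunk) error { return nil })
	fmt.Println("saved", n)
}
```

With a slow or timing-out backend, the producer in this sketch simply blocks on the full queue, so memory use is bounded by the queue size rather than by the number of pending saves.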

@woodsaj woodsaj closed this as completed Jan 8, 2016