[0.9.2] failed to write point batch to database #3710

Closed
deric opened this issue Aug 18, 2015 · 5 comments

Comments

@deric

deric commented Aug 18, 2015

InfluxDB crashes after failing to write a batch.

panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xb code=0x1 addr=0x28 pc=0x5e1305]

After several restart attempts I managed to start the process.

The issue might be related to #3699, #3603.

I have 2 databases: the data source for the first one is collectd, and for the other one it's graphite. Both databases have a retention policy, though the second one's duration is unlimited:

> show retention policies on collectd
name    duration        replicaN        default
default 336h0m0s        1               true

> show retention policies on graphite
name    duration        replicaN        default
default 0               1               true
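
For reference, the `336h0m0s` duration on the collectd policy works out to 14 days. A minimal sketch of the conversion using Go's standard duration parser; the helper name `retentionDays` is made up for illustration and is not part of InfluxDB:

```go
package main

import (
	"fmt"
	"time"
)

// retentionDays is a hypothetical helper: it converts a duration string as
// printed by "show retention policies" (e.g. "336h0m0s") into days.
func retentionDays(s string) (float64, error) {
	d, err := time.ParseDuration(s)
	if err != nil {
		return 0, err
	}
	return d.Hours() / 24, nil
}

func main() {
	days, err := retentionDays("336h0m0s")
	if err != nil {
		panic(err)
	}
	fmt.Println(days) // prints 14
}
```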

Part of the log:

[collectd] 2015/08/18 06:45:57 failed to write batch: timeout
[collectd] 2015/08/18 06:46:02 failed to write batch: timeout
[collectd] 2015/08/18 06:46:07 failed to write batch: timeout
[run] 2015/08/18 06:46:10 signal received, initializing clean shutdown...
[run] 2015/08/18 06:46:10 waiting for clean shutdown...
[snapshot] 2015/08/18 06:46:10 snapshot listener closed
[tcp] 2015/08/18 06:46:10 cluster service accept error: network connection closed
[collectd] 2015/08/18 06:46:12 failed to write batch: timeout
[metastore] 2015/08/18 06:46:12 [INFO] raft: Node at 127.0.0.1:8088 [Follower] entering Follower state
[metastore] 2015/08/18 06:46:12 read local node id: 1
[metastore] 2015/08/18 06:46:14 [WARN] raft: Heartbeat timeout reached, starting election
[metastore] 2015/08/18 06:46:14 [INFO] raft: Node at 127.0.0.1:8088 [Candidate] entering Candidate state
[metastore] 2015/08/18 06:46:14 [DEBUG] raft: Votes needed: 1
[metastore] 2015/08/18 06:46:14 [DEBUG] raft: Vote granted. Tally: 1
[metastore] 2015/08/18 06:46:14 [INFO] raft: Election won. Tally: 1
[metastore] 2015/08/18 06:46:14 [INFO] raft: Node at 127.0.0.1:8088 [Leader] entering Leader state
[metastore] 2015/08/18 06:46:14 [INFO] raft: Disabling EnableSingleNode (bootstrap)
[metastore] 2015/08/18 06:46:14 [DEBUG] raft: Node 127.0.0.1:8088 updated peer set (2): [127.0.0.1:8088]
[metastore] 2015/08/18 06:46:14 [DEBUG] raft: Node 127.0.0.1:8088 updated peer set (2): [127.0.0.1:8088]
[metastore] 2015/08/18 06:46:14 [DEBUG] raft: Node 127.0.0.1:8088 updated peer set (2): [127.0.0.1:8088]
[metastore] 2015/08/18 06:46:14 [DEBUG] raft: Node 127.0.0.1:8088 updated peer set (2): [127.0.0.1:8088]
[metastore] 2015/08/18 06:46:14 [DEBUG] raft: Node 127.0.0.1:8088 updated peer set (2): [127.0.0.1:8088]
[metastore] 2015/08/18 06:46:14 [DEBUG] raft: Node 127.0.0.1:8088 updated peer set (2): [127.0.0.1:8088]
[metastore] 2015/08/18 06:46:14 [DEBUG] raft: Node 127.0.0.1:8088 updated peer set (2): [127.0.0.1:8088]
[metastore] 2015/08/18 06:46:14 [DEBUG] raft: Node 127.0.0.1:8088 updated peer set (2): [127.0.0.1:8088]
[metastore] 2015/08/18 06:46:14 [DEBUG] raft: Node 127.0.0.1:8088 updated peer set (2): [127.0.0.1:8088]
[metastore] 2015/08/18 06:46:14 [DEBUG] raft: Node 127.0.0.1:8088 updated peer set (2): [127.0.0.1:8088]
[metastore] 2015/08/18 06:46:14 [DEBUG] raft: Node 127.0.0.1:8088 updated peer set (2): [127.0.0.1:8088]
[metastore] 2015/08/18 06:46:14 [DEBUG] raft: Node 127.0.0.1:8088 updated peer set (2): [127.0.0.1:8088]
[metastore] 2015/08/18 06:46:14 [DEBUG] raft: Node 127.0.0.1:8088 updated peer set (2): [127.0.0.1:8088]
[metastore] 2015/08/18 06:46:14 [DEBUG] raft: Node 127.0.0.1:8088 updated peer set (2): [127.0.0.1:8088]
[graphite] 2015/08/18 06:46:15 failed to write point batch to database "graphite": timeout
[collectd] 2015/08/18 06:46:17 failed to write batch: timeout
[graphite] 2015/08/18 06:46:20 failed to write point batch to database "graphite": timeout
[collectd] 2015/08/18 06:46:22 failed to write batch: timeout
[snapshot] 2015/08/18 06:46:23 snapshot listener closed
[shard-precreation] 2015/08/18 06:46:23 precreation service terminating
[collectd] 2015/08/18 06:46:23 collectd UDP closed
[tcp] 2015/08/18 06:46:23 cluster service accept error: network connection closed
panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xb code=0x1 addr=0x28 pc=0x5e1305]

goroutine 1 [running]:
github.com/influxdb/influxdb/services/graphite.(*Service).Close(0xc20805a0c0, 0x0, 0x0)
        /root/go/src/github.com/influxdb/influxdb/services/graphite/service.go:124 +0x45
github.com/influxdb/influxdb/cmd/influxd/run.(*Server).Close(0xc2080aa0e0, 0x0, 0x0)
        /root/go/src/github.com/influxdb/influxdb/cmd/influxd/run/server.go:357 +0x147
github.com/influxdb/influxdb/cmd/influxd/run.(*Server).Open(0xc2080aa0e0, 0x0, 0x0)
        /root/go/src/github.com/influxdb/influxdb/cmd/influxd/run/server.go:333 +0x85
github.com/influxdb/influxdb/cmd/influxd/run.(*Command).Run(0xc208011b90, 0xc20800a010, 0x4, 0x4, 0x0, 0x0)
        /root/go/src/github.com/influxdb/influxdb/cmd/influxd/run/command.go:97 +0x7ab
main.(*Main).Run(0xc20806fa80, 0xc20800a010, 0x4, 0x4, 0x0, 0x0)
        /root/go/src/github.com/influxdb/influxdb/cmd/influxd/main.go:76 +0x3bc
main.main()
        /root/go/src/github.com/influxdb/influxdb/cmd/influxd/main.go:38 +0xdc

goroutine 6 [syscall]:
os/signal.loop()
        /root/.gvm/gos/go1.4.2/src/os/signal/signal_unix.go:21 +0x1f
created by os/signal.init·1
        /root/.gvm/gos/go1.4.2/src/os/signal/signal_unix.go:27 +0x35

goroutine 17 [syscall, locked to thread]:
runtime.goexit()
        /root/.gvm/gos/go1.4.2/src/runtime/asm_amd64.s:2232 +0x1

goroutine 14 [select]:
github.com/influxdb/influxdb/cmd/influxd/run.(*Server).monitorErrorChan(0xc2080aa0e0, 0xc20802c240)
        /root/go/src/github.com/influxdb/influxdb/cmd/influxd/run/server.go:421 +0x14a
created by github.com/influxdb/influxdb/cmd/influxd/run.func·003
        /root/go/src/github.com/influxdb/influxdb/cmd/influxd/run/server.go:304 +0x919
[graphite] 2015/08/18 06:46:25 failed to write point batch to database "graphite": timeout
[collectd] 2015/08/18 06:46:27 failed to write batch: timeout
[graphite] 2015/08/18 06:46:30 failed to write point batch to database "graphite": timeout
[collectd] 2015/08/18 06:46:32 failed to write batch: timeout
[graphite] 2015/08/18 06:46:35 failed to write point batch to database "graphite": timeout
[collectd] 2015/08/18 06:46:37 failed to write batch: timeout
[run] 2015/08/18 06:46:40 time limit reached, initializing hard shutdown
[metastore] 2015/08/18 06:56:14 [INFO] raft: Node at 127.0.0.1:8088 [Follower] entering Follower state
[metastore] 2015/08/18 06:56:14 read local node id: 1
[metastore] 2015/08/18 06:56:16 [WARN] raft: Heartbeat timeout reached, starting election
[metastore] 2015/08/18 06:56:16 [INFO] raft: Node at 127.0.0.1:8088 [Candidate] entering Candidate state
[metastore] 2015/08/18 06:56:16 [DEBUG] raft: Votes needed: 1
[metastore] 2015/08/18 06:56:16 [DEBUG] raft: Vote granted. Tally: 1
[metastore] 2015/08/18 06:56:16 [INFO] raft: Election won. Tally: 1
[metastore] 2015/08/18 06:56:16 [INFO] raft: Node at 127.0.0.1:8088 [Leader] entering Leader state
[metastore] 2015/08/18 06:56:16 [INFO] raft: Disabling EnableSingleNode (bootstrap)
@jwilder
Contributor

jwilder commented Aug 18, 2015

Can you try a nightly build? This might have been fixed by 089d94.

@deric
Author

deric commented Aug 18, 2015

@jwilder I'll try the nightly build. It seems to be a somewhat random issue, so I'm not sure I'll be able to reproduce a similar situation.

@beckettsean
Contributor

cannot reproduce

@aheusingfeld

@deric did you solve your issue? I seem to have exactly the same problem with the 0.9.3 deb package running in the https://registry.hub.docker.com/u/tutum/influxdb/ container, with the /data/meta folder mounted onto my case-insensitive OS X filesystem.

@beckettsean I will try to build an image in which it is reproducible.

@deric
Author

deric commented Sep 2, 2015

I upgraded to a nightly release and lost most of my data (that release was behaving really strangely; parts of the time series were missing). Anyway, I'm now running 0.9.3-rc3 and the same error hasn't occurred yet.
