close of closed channel #5894

Closed
26618929 opened this issue Mar 3, 2016 · 4 comments

Comments

@26618929

26618929 commented Mar 3, 2016

OS: linux 2.6.32-573.18.1.el6.x86_64
influxdb: v0.10.1
This happens about once every hour.

[retention] 2016/03/03 10:08:26 retention policy shard deletion check commencing
panic: close of closed channel

goroutine 304 [running]:
github.com/influxdb/influxdb/tsdb/engine/tsm1.(*Engine).Close(0xc2088b9040, 0x0, 0x0)
/tmp/tmp.n7AHL1nb9U/src/github.com/influxdb/influxdb/tsdb/engine/tsm1/engine.go:147 +0x3e
github.com/influxdb/influxdb/tsdb.(*Shard).close(0xc209e9ef20, 0x0, 0x0)
/tmp/tmp.n7AHL1nb9U/src/github.com/influxdb/influxdb/tsdb/shard.go:150 +0x54
github.com/influxdb/influxdb/tsdb.(*Shard).Close(0xc209e9ef20, 0x0, 0x0)
/tmp/tmp.n7AHL1nb9U/src/github.com/influxdb/influxdb/tsdb/shard.go:145 +0x89
github.com/influxdb/influxdb/tsdb.(*Store).DeleteShard(0xc20808c2c0, 0xbd, 0x0, 0x0)
/tmp/tmp.n7AHL1nb9U/src/github.com/influxdb/influxdb/tsdb/store.go:134 +0xe4
github.com/influxdb/influxdb/services/retention.(*Service).deleteShards(0xc20812f5c0)
/tmp/tmp.n7AHL1nb9U/src/github.com/influxdb/influxdb/services/retention/service.go:130 +0x8ab
created by github.com/influxdb/influxdb/services/retention.(*Service).Open
/tmp/tmp.n7AHL1nb9U/src/github.com/influxdb/influxdb/services/retention/service.go:45 +0x1b8

goroutine 1 [chan receive, 59 minutes]:
main.(*Main).Run(0xc20803b700, 0xc20800a010, 0x4, 0x4, 0x0, 0x0)
/tmp/tmp.n7AHL1nb9U/src/github.com/influxdb/influxdb/cmd/influxd/main.go:96 +0x7a1
main.main()
/tmp/tmp.n7AHL1nb9U/src/github.com/influxdb/influxdb/cmd/influxd/main.go:46 +0xdc

goroutine 6 [syscall, 60 minutes]:
os/signal.loop()
/root/.gvm/gos/go1.4.3/src/os/signal/signal_unix.go:21 +0x1f
created by os/signal.init·1
/root/.gvm/gos/go1.4.3/src/os/signal/signal_unix.go:27 +0x35

goroutine 8 [IO wait, 29 minutes]:
net.(*pollDesc).Wait(0xc208010760, 0x72, 0x0, 0x0)
/root/.gvm/gos/go1.4.3/src/net/fd_poll_runtime.go:84 +0x47
net.(*pollDesc).WaitRead(0xc208010760, 0x0, 0x0)
/root/.gvm/gos/go1.4.3/src/net/fd_poll_runtime.go:89 +0x43
net.(*netFD).accept(0xc208010700, 0x0, 0x7fd40a589ee8, 0xc20ac3df50)
/root/.gvm/gos/go1.4.3/src/net/fd_unix.go:419 +0x40b
net.(*TCPListener).AcceptTCP(0xc20802c3a8, 0xc2080186d8, 0x0, 0x0)
/root/.gvm/gos/go1.4.3/src/net/tcpsock_posix.go:234 +0x4e
net.(*TCPListener).Accept(0xc20802c3a8, 0x0, 0x0, 0x0, 0x0)
/root/.gvm/gos/go1.4.3/src/net/tcpsock_posix.go:244 +0x4c
github.com/influxdb/influxdb/tcp.(*Mux).Serve(0xc20805c840, 0x7fd40a58c918, 0xc20802c3a8, 0x0, 0x0)
/tmp/tmp.n7AHL1nb9U/src/github.com/influxdb/influxdb/tcp/mux.go:52 +0xc7
created by github.com/influxdb/influxdb/cmd/influxd/run.(*Server).Open
/tmp/tmp.n7AHL1nb9U/src/github.com/influxdb/influxdb/cmd/influxd/run/server.go:396 +0x25e

@jonseymour
Contributor

Could you post a gist of your influx log from the time the server starts until this error occurs?

To reduce size you can filter out the http and query entries with:

 egrep -v "\[(http|query)" < influx.log > influx.log.filtered

One reason this could happen in the version of the code you are running is that a previous attempt to delete the shard failed, perhaps because of a permissions issue on the filesystem containing the WAL and TSM files for the shard. The second attempt then fails in this way because the original attempt to delete and close the shard left the shard in a half-closed state.

See: https://github.com/influxdata/influxdb/blob/v0.10.0/tsdb/store.go#L124-L149
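To illustrate the failure mode: in Go, closing a channel that has already been closed panics with exactly the message in the trace. The sketch below is not InfluxDB's code. The names `engine`, `newEngine`, and `doubleClose` are hypothetical, and the guard shown (set the channel to nil after closing it, under a mutex) is just one common way to make a `Close` method safe to call twice.

```go
package main

import "sync"

// doubleClose closes the same channel twice and returns the message of the
// resulting runtime panic, recovered so the caller can inspect it.
func doubleClose() (msg string) {
	defer func() {
		if r := recover(); r != nil {
			if err, ok := r.(error); ok {
				msg = err.Error()
			}
		}
	}()
	ch := make(chan struct{})
	close(ch)
	close(ch) // second close of the same channel panics
	return ""
}

// engine is a hypothetical stand-in for a component that signals shutdown
// to its background goroutines by closing a channel.
type engine struct {
	mu   sync.Mutex
	done chan struct{} // nil once the engine has been closed
}

func newEngine() *engine {
	return &engine{done: make(chan struct{})}
}

// Close is idempotent: a second call sees done == nil and returns without
// closing the channel again, so a retried shutdown cannot panic.
func (e *engine) Close() error {
	e.mu.Lock()
	defer e.mu.Unlock()
	if e.done == nil {
		return nil // already closed
	}
	close(e.done)
	e.done = nil
	return nil
}
```

With a guard like this, a retried delete that reaches `Close` a second time returns normally instead of crashing the process.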

So, one thing you might double check is that all files and directories in the influx data and wal directories are fully writeable by the user running the influxd process. If this isn't true for some reason (e.g. influx had previously been started as the root user and is now running as a less privileged user), then this might be the reason for shard deletion issues which are eventually manifesting themselves with this panic.
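One way to run that check is to walk the directories and list every path owned by a different user. The sketch below is a rough, Unix-only approximation: it compares owner UIDs as a proxy for writeability and does not inspect mode bits or ACLs. The helper names `notOwnedBy` and `currentUID` are hypothetical.

```go
package main

import (
	"io/fs"
	"os"
	"path/filepath"
	"syscall"
)

// currentUID returns the UID of the user running this process.
func currentUID() uint32 {
	return uint32(os.Getuid())
}

// notOwnedBy walks root (including root itself) and returns every path whose
// owning UID differs from uid. It relies on syscall.Stat_t, so it only
// reports ownership on Unix-like systems.
func notOwnedBy(root string, uid uint32) ([]string, error) {
	var mismatched []string
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err // e.g. a directory that cannot be read
		}
		info, err := d.Info()
		if err != nil {
			return err
		}
		st, ok := info.Sys().(*syscall.Stat_t)
		if !ok {
			return nil // ownership information not available
		}
		if st.Uid != uid {
			mismatched = append(mismatched, path)
		}
		return nil
	})
	return mismatched, err
}
```

Run as the user that owns the influxd process, a call such as `notOwnedBy(dataDir, currentUID())` (where `dataDir` is whatever data or WAL directory your configuration uses) should return an empty list; any path it does return is a candidate for a `chown`.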

@jwilder - I think this issue may not be entirely unrelated to an issue I highlighted in a trailing comment to #5784 regarding the need to nil out s.engine during Shard.Close().

@jonseymour
Contributor

See also #5244 and #5866.

@26618929
Author

26618929 commented Mar 3, 2016

Thank you so much for your prompt reply.
I found a lot of "permission denied" errors in my influxdb.log using the filter you provided.
I have changed the owner of the InfluxDB data directory to influxdb, and I will observe the service for a few hours.

@26618929
Author

26618929 commented Mar 4, 2016

Thanks for your help.
It works now.

@26618929 26618929 closed this as completed Mar 4, 2016