[v0.10.0]: unexpected "panic: runtime error: index out of range" during restart on v0.10.0 #5772

Closed
jonseymour opened this issue Feb 21, 2016 · 12 comments

@jonseymour
Contributor

The panic below happens, but only when I revert the local server to one running v0.10.0 after previously running one from master and causing a crash (see #5773).

[update: the fact that master crashed is actually irrelevant - merely starting a "master" version of influx and then reverting to v0.10.0 is sufficient to cause the issue]

I have seen this condition occur 3 times, and I have 3 small databases which fail when running on v0.10.0 after a crash that occurred while running master.

greywedge2:influxd jonseymour (2)$ ./influxd -config=config.toml

 8888888           .d888 888                   8888888b.  888888b.
   888            d88P"  888                   888  "Y88b 888  "88b
   888            888    888                   888    888 888  .88P
   888   88888b.  888888 888 888  888 888  888 888    888 8888888K.
   888   888 "88b 888    888 888  888  Y8bd8P' 888    888 888  "Y88b
   888   888  888 888    888 888  888   X88K   888    888 888    888
   888   888  888 888    888 Y88b 888 .d8""8b. 888  .d88P 888   d88P
 8888888 888  888 888    888  "Y88888 888  888 8888888P"  8888888P"

2016/02/21 21:09:37 InfluxDB starting, version 0.9, branch unknown, commit unknown, built unknown
2016/02/21 21:09:37 Go version go1.5.3, GOMAXPROCS set to 4
2016/02/21 21:09:37 Using configuration at: config.toml
[meta] 2016/02/21 21:09:37 Starting meta service
[meta] 2016/02/21 21:09:37 Listening on HTTP: 127.0.0.1:8091
[metastore] 2016/02/21 21:09:37 Using data dir: /Users/jonseymour/.influxdb/meta
[metastore] 2016/02/21 21:09:37 Node at localhost:8088 [Follower]
[metastore] 2016/02/21 21:09:38 Node at localhost:8088 [Leader]. peers=[localhost:8088]
panic: runtime error: index out of range

goroutine 1 [running]:
github.com/influxdb/influxdb/services/meta.(*Client).retryUntilSnapshot(0xc82021a080, 0x0, 0xc82018f500)
    /Users/jonseymour/ninja/go/src/github.com/influxdb/influxdb/services/meta/client.go:1127 +0x3cb
github.com/influxdb/influxdb/services/meta.(*Client).Open(0xc82021a080, 0x0, 0x0)
    /Users/jonseymour/ninja/go/src/github.com/influxdb/influxdb/services/meta/client.go:92 +0xb6
github.com/influxdb/influxdb/cmd/influxd/run.(*Server).initializeMetaClient(0xc820001c80, 0x0, 0x0)
    /Users/jonseymour/ninja/go/src/github.com/influxdb/influxdb/cmd/influxd/run/server.go:626 +0xda
github.com/influxdb/influxdb/cmd/influxd/run.(*Server).Open(0xc820001c80, 0x0, 0x0)
    /Users/jonseymour/ninja/go/src/github.com/influxdb/influxdb/cmd/influxd/run/server.go:408 +0x606
github.com/influxdb/influxdb/cmd/influxd/run.(*Command).Run(0xc8200803f0, 0xc82000a2b0, 0x1, 0x1, 0x0, 0x0)
    /Users/jonseymour/ninja/go/src/github.com/influxdb/influxdb/cmd/influxd/run/command.go:125 +0xe34
main.(*Main).Run(0xc8200fff00, 0xc82000a2b0, 0x1, 0x1, 0x0, 0x0)
    /Users/jonseymour/ninja/go/src/github.com/influxdb/influxdb/cmd/influxd/main.go:86 +0x6b9
main.main()
    /Users/jonseymour/ninja/go/src/github.com/influxdb/influxdb/cmd/influxd/main.go:46 +0x29b


@zstyblik

@jonseymour

2016/02/21 21:09:37 InfluxDB starting, version 0.9

vs.

if I start up influx on v0.10.0

Which version is it then?

@jonseymour
Contributor Author

v0.10.0 - I know because I built it myself.

@zstyblik

@jonseymour OK, thank you for the clarification.

@jonseymour
Contributor Author

An example of a panic in master which causes this problem when reverting to v0.10.0 is #5773

@jonseymour
Contributor Author

@zstyblik admittedly, accepting #5763 would help remove such confusion.

@jonseymour jonseymour changed the title [master]: unexpected panic during restart on master [master]: unexpected "panic: runtime error: index out of range" during restart on master Feb 21, 2016
@jonseymour jonseymour changed the title [master]: unexpected "panic: runtime error: index out of range" during restart on master [v0.10.0]: unexpected "panic: runtime error: index out of range" during restart on v0.10.0 Feb 21, 2016
@jonseymour
Contributor Author

Ok, looks like this was fixed with 6fb00c1 in 0.11.0. Closing for now.

@jonseymour
Contributor Author

The apparent reason for the failure in this scenario is that upon migration to 0.11.0, the MetaServers property is removed from node.json. After reverting to 0.10.0, this property is no longer present, which exercises a code path in v0.10.0 that does not expect the property to be missing or empty.

greywedge2:~ jonseymour (2)$ diff .influxdb-after-v0.10.1/meta/node.json .influxdb/meta/node.json 
1c1
< {"ID":1,"MetaServers":["localhost:8091"]}
---
> {"ID":1}
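
For illustration, here is a minimal Go sketch of how a missing MetaServers property could lead to this kind of panic. It is not the actual code from services/meta/client.go: the NodeInfo type and the helper functions are hypothetical, and the sketch only assumes that some code path indexes the first meta server without checking the length of the list.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// NodeInfo mirrors the two fields visible in the node.json diff above.
// The type and the helpers below are hypothetical, not InfluxDB's own code.
type NodeInfo struct {
	ID          uint64   `json:"ID"`
	MetaServers []string `json:"MetaServers,omitempty"`
}

// parseNode decodes the contents of a node.json file.
func parseNode(data []byte) (NodeInfo, error) {
	var n NodeInfo
	err := json.Unmarshal(data, &n)
	return n, err
}

// firstMetaServerUnchecked indexes the slice without a length check, so it
// panics with an index-out-of-range runtime error when MetaServers is absent.
func firstMetaServerUnchecked(n NodeInfo) string {
	return n.MetaServers[0]
}

// firstMetaServer is the guarded variant: it reports an error instead.
func firstMetaServer(n NodeInfo) (string, error) {
	if len(n.MetaServers) == 0 {
		return "", errors.New("node.json has no MetaServers entry")
	}
	return n.MetaServers[0], nil
}

func main() {
	// The file contents as left behind after running 0.11.0.
	n, err := parseNode([]byte(`{"ID":1}`))
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	if _, err := firstMetaServer(n); err != nil {
		fmt.Println("guarded lookup:", err)
	}
}
```

With the guarded variant, a downgraded server could fail with a readable error rather than a stack trace.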

The workaround was to restore the deleted property to node.json and restart the server again.
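
The workaround above could be scripted along these lines. This is a sketch under assumptions: the restoreMetaServers helper and the command-line shape are made up for illustration, the address to restore (localhost:8091 in the diff above) has to be supplied by the operator, and the server should be stopped before node.json is edited.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// restoreMetaServers returns node.json content with a MetaServers entry added
// when it is missing or empty. All other fields are preserved unchanged.
// The second result reports whether the content was modified.
func restoreMetaServers(data []byte, addr string) ([]byte, bool, error) {
	var node map[string]json.RawMessage
	if err := json.Unmarshal(data, &node); err != nil {
		return nil, false, err
	}
	var servers []string
	if raw, ok := node["MetaServers"]; ok {
		if err := json.Unmarshal(raw, &servers); err != nil {
			return nil, false, err
		}
	}
	if len(servers) > 0 {
		return data, false, nil // property already present, nothing to do
	}
	raw, err := json.Marshal([]string{addr})
	if err != nil {
		return nil, false, err
	}
	node["MetaServers"] = raw
	out, err := json.Marshal(node)
	if err != nil {
		return nil, false, err
	}
	return out, true, nil
}

func main() {
	if len(os.Args) != 3 {
		fmt.Fprintln(os.Stderr, "usage: fixnode <path/to/node.json> <meta-addr>")
		os.Exit(2)
	}
	path, addr := os.Args[1], os.Args[2]
	data, err := os.ReadFile(path)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	out, changed, err := restoreMetaServers(data, addr)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	if !changed {
		fmt.Println("MetaServers already present; file left untouched")
		return
	}
	// Keep a copy of the original file before overwriting it.
	if err := os.WriteFile(path+".bak", data, 0o644); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	if err := os.WriteFile(path, out, 0o644); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("restored MetaServers in", path)
}
```

Saved as, say, fixnode.go (a hypothetical name), it could be run as: go run fixnode.go ~/.influxdb/meta/node.json localhost:8091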

I can see this might cause issues for people migrating from 0.10.0 to 0.11.0 if they switch back to 0.10.0 for some reason - their server will no longer start and they will have to find this issue in order to discover the workaround.

Left open for review by @jwilder. Feel free to close it once you have considered the issue.

@zstyblik

@jonseymour in that case, I'd suggest restoring from backup. But perhaps it'd be worth documenting/noting that there is a breaking change between v0.10 and v0.11.

@jonseymour
Contributor Author

@zstyblik, agreed.

@joelegasse
Contributor

/cc @beckettsean @pauldix This sounds like something that should be in the docs/announcement for 0.11. I don't think downgrading was ever officially supported, but with support for b1 and bz1 shards disappearing, and lingering questions about upgrade paths from 0.9, this might be something to investigate so that we can give a clear message about it in the release notes/announcement.

@pauldix
Member

pauldix commented Mar 21, 2016

People do generally have the expectation that they'll be able to downgrade if for some reason the upgrade doesn't work well for them. Although maybe it'll be enough to document and tell people to fully back up if they want to be able to downgrade?

@jwilder
Contributor

jwilder commented Mar 21, 2016

This should be fixed via #5957. There shouldn't be an issue rolling back to 0.10 from 0.11 now.

@jwilder jwilder closed this as completed Mar 21, 2016
@jwilder jwilder added this to the 0.11.0 milestone Mar 21, 2016