Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix panic when reporting shard diagnostics #2430

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 5 additions & 1 deletion cmd/influxd/server_integration_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -1462,9 +1462,13 @@ func TestSingleServerDiags(t *testing.T) {
config.Monitoring.Enabled = true
config.Monitoring.WriteInterval = main.Duration(100 * time.Millisecond)
nodes := createCombinedNodeCluster(t, testName, dir, 1, config)
createDatabase(t, testName, nodes, "mydb")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are these changes here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, OK, to repro the error.

Well, let me run your code, and I'll see if I can get answers to my questions below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because we need to to actually generate some shard data for the diag module to process it. Before this change the server diagnostic was just processing some basic information but not shard data

createRetentionPolicy(t, testName, nodes, "mydb", "myrp", len(nodes))
defer nodes.Close()

time.Sleep(1 * time.Second)
// Write some data to create shards
mergeMany(t, nodes[0], "mydb", "myrp")

}

func TestSingleServer(t *testing.T) {
Expand Down
10 changes: 8 additions & 2 deletions server.go
Original file line number Diff line number Diff line change
Expand Up @@ -3335,6 +3335,7 @@ func (s *Server) DiagnosticsAsRows() []*influxql.Row {
shardsRow.Name = "shards_diag"
shardsRow.Columns = append(shardsRow.Columns, "time", "id", "dataNodes", "index")
shardsRow.Tags = tags
var hasPath bool
for _, sh := range s.shards {
var nodes []string
for _, n := range sh.DataNodeIDs {
Expand All @@ -3345,9 +3346,14 @@ func (s *Server) DiagnosticsAsRows() []*influxql.Row {

// Shard may not be local to this node.
if sh.store != nil {
shardsRow.Columns = append(shardsRow.Columns, "path")
shardsRow.Values[0] = append(shardsRow.Values[0], sh.store.Path())
if !hasPath {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is necessary. If a sh.store is non-nil, then it must have a path.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@otoolep but you iterate through each shard. The hasPath variable controls if if the path column has been added before or we need to add it as it's the first time it appears. Otherwise the path column get's appended multiple times unnecessarily

shardsRow.Columns = append(shardsRow.Columns, "path")
hasPath = true
}
index := len(shardsRow.Values) - 1 // Current shardsRow.Values index
shardsRow.Values[index] = append(shardsRow.Values[index], sh.store.Path())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this is right, but I might be missing something. shardsRow.Values[0] is a slice -- the code wants to append to this slice, not add another distinct entry to shardsRow.Values.

Can you show some raw data to explain what you are trying to fix?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, before my change the server was doing something like this:

[2015-04-25 01:15:03.956810841 +0000 UTC 1 1 550 /home/marcos/.influxdb/data/shards/1 /home/marcos/.influxdb/data/shards/2] map[id:1 dataNodes:2 index:3 path:5]
[2015-04-25 01:15:03.956810841 +0000 UTC 2 1 221] map[index:3 path:5 id:1 dataNodes:2]

And it panicked because the second row didn't contain a path.

After my change is doing this

[2015-04-25 01:15:03.956810841 +0000 UTC 1 1 550 /home/marcos/.influxdb/data/shards/1] map[id:1 dataNodes:2 index:3 path:4]
[2015-04-25 01:15:03.956810841 +0000 UTC 2 1 221 /home/marcos/.influxdb/data/shards/2] map[index:3 path:4 id:1 dataNodes:2]

My understanding by looking at the code is that the shard_diag row should contain one row per shard, and previously it was appending all the shards in the first row

}

}

return []*influxql.Row{
Expand Down