
Multi-node cluster replication is racy (testing) #1556

Closed
otoolep opened this issue Feb 10, 2015 · 3 comments
otoolep commented Feb 10, 2015

Replication to 3-node and 5-node clusters appears to be racy, so those tests are currently disabled in the integration testing. The 3-node case does not appear to be racy locally, but both are racy on Travis.

https://github.com/influxdb/influxdb/blob/master/cmd/influxd/server_integration_test.go

This was referenced Feb 10, 2015
otoolep added a commit that referenced this issue Feb 26, 2015
Because these messages were being sent to replicas (data nodes), data
nodes were sometimes receiving messages out-of-order from the broker,
where order is defined strictly by the Raft log index. This causes our
tests to fail, because the new "wait" endpoint on the broker assumes this
can never happen. When told to wait for, say, index 23, but it sees it is
at 25, it responds with 200, even though the message with index 23 has
yet to arrive.

This change also means that if unknown messages are seen by data nodes,
they will instead panic, which should flag this issue much sooner.

Fixes issue #1556
@otoolep otoolep self-assigned this Feb 26, 2015
otoolep commented Feb 26, 2015

@benbjohnson @corylanou

So here is a clearer example of what is going on. A 5-node cluster is created, and all nodes join. A large batch write is then sent to the node listening on port 8590. When the test finishes, the node listening on port 8591 does not have the data, even though its index implies that it should.

I patched the code with a simple debug print, as follows:

--- a/server.go
+++ b/server.go
@@ -2606,6 +2606,8 @@ func (s *Server) processor(client MessagingClient, done chan struct{}) {
                        }
                }

+               fmt.Printf("Server %s processing message type 0x%x, index %d->%d\n", s.path, m.Type, s.index, m.Index)
+
                // Handle write series separately so we don't lock server during shard writes.
                if m.Type == writeRawSeriesMessageType {
                        // Write series to shard without lock.

And see this output on 8591:

Server /tmp/data-integration-test/8591 processing message type 0x8010, index 5->6
Server /tmp/data-integration-test/8591 processing message type 0x8000, index 6->7
Server /tmp/data-integration-test/8591 processing message type 0x0, index 7->8
Server /tmp/data-integration-test/8591 processing message type 0x8010, index 8->9
Server /tmp/data-integration-test/8591 processing message type 0x8000, index 9->10
Server /tmp/data-integration-test/8591 processing message type 0x0, index 10->11
Server /tmp/data-integration-test/8591 processing message type 0x8010, index 11->12
Server /tmp/data-integration-test/8591 processing message type 0x8000, index 12->13
Server /tmp/data-integration-test/8591 processing message type 0x0, index 13->14
Server /tmp/data-integration-test/8591 processing message type 0x8010, index 14->15
Server /tmp/data-integration-test/8591 processing message type 0x10, index 15->16
Server /tmp/data-integration-test/8591 processing message type 0x20, index 16->17
Server /tmp/data-integration-test/8591 processing message type 0x23, index 17->18
Server /tmp/data-integration-test/8591 processing message type 0x60, index 18->19
Server /tmp/data-integration-test/8591 processing message type 0x40, index 19->20
Server /tmp/data-integration-test/8591 processing message type 0x8020, index 20->21
Server /tmp/data-integration-test/8591 processing message type 0x8020, index 21->22
Server /tmp/data-integration-test/8591 processing message type 0x8020, index 22->23
Server /tmp/data-integration-test/8591 processing message type 0x8020, index 23->25
Server /tmp/data-integration-test/8591 processing message type 0x80, index 25->24

Note how the last message that 8591 receives is out of order, and is preceded by two broker-only messages of SubscribeMessageType. Message type 0x80 is writeRawSeriesMessageType.

So this is what is happening, as far as I can see.

  • node 8590 receives the write.
  • 8590 creates the measurements in a sync'ed message to the brokers.
  • 8590 creates the shard groups in a sync'ed message to the brokers.
  • 8590 sends the series data to the broker. This is not sync'ed.
  • 8591 gets the measurements message and processes it.
  • 8591 gets the shard groups message, processes it, and as a result issues a subscribe command to the broker, for the relevant topic. Note that 8591 has not received any series data yet since it is not subscribed to any.
  • Before 8591's subscription completes the series data is sent to the replicas currently subscribed to that topic. At this time 8591 is not subscribed, and doesn't get the data.
  • 8591 does receive messages in the Raft log that come after the write-series message.
  • 8591 completes its subscription, and is then sent the write-series data, which has an index lower than the messages it received in the step immediately above.

Because the implementation of the wait endpoint assumes messages never arrive out-of-order, wait breaks and gives the wrong answer. This is definitely an edge case, but it's exactly the scenario this testing hits. If we had some way to ensure that nodes are fully subscribed to their topics before data is sent, it would help this testing.

otoolep commented Feb 26, 2015

@benbjohnson has confirmed this is a real issue, that needs some thought.

otoolep commented Mar 27, 2015

All fixed!

@otoolep otoolep closed this as completed Mar 27, 2015
mark-rushakoff pushed a commit that referenced this issue Jan 11, 2019
docs(http): update authorization definition