memberlist: Use separate queue for CAS messages and only wait for CAS messages queue to be empty when stopping #539
Make wait timeout on shutdown configurable.
@@ -1163,32 +1193,25 @@ func (m *KV) processValueUpdate(workerCh <-chan valueUpdate, key string) {
}
}

func (m *KV) queueBroadcast(key string, content []string, version uint, message []byte) {
inlined into broadcastNewValue
…t, to make it less flaky,
@@ -718,7 +718,7 @@ func testMultipleClientsWithConfigGenerator(t *testing.T, members int, configGen

startTime := time.Now()
firstKv := clients[0]
- ctx, cancel := context.WithTimeout(context.Background(), casInterval*3/2) // Watch for 1.5 cas intervals.
+ ctx, cancel := context.WithTimeout(context.Background(), casInterval*3) // Watch for 3x cas intervals.
This change is unrelated, only done to avoid test flakiness.
LGTM. A nice improvement.
@@ -1335,7 +1358,7 @@ func (m *KV) MergeRemoteState(data []byte, _ bool) {
level.Error(m.logger).Log("msg", "failed to store received value", "key", kvPair.Key, "err", err)
} else if newver > 0 {
m.notifyWatchers(kvPair.Key)
- m.broadcastNewValue(kvPair.Key, change, newver, codec)
+ m.broadcastNewValue(kvPair.Key, change, newver, codec, false)
These messages we're persisting to the local KV that come from external broadcasts are non-CAS by definition? (Only the local mutations are called CAS?)
Correct. Currently only the CAS operation can modify the KV store. There is also a Delete operation in the KV client interface, but the memberlist implementation doesn't support it yet. We should implement it eventually, so perhaps it would be better to call these "local updates" instead of "cas updates". WDYT? I think I'll rename it.
Sounds good! CAS is an implementation detail and doesn't really explain that it's a local mod.
I've done this change in 979c031. I've also updated the memberlist_client_messages_in_broadcast_queue metric to report values for both queues individually.
Split messages_in_broadcast_queue metric into two values, one for each queue.
Nice, LGTM.
What this PR does:
This PR changes the memberlist client to send out CAS updates faster, and to reduce the chance of dropping CAS updates when memberlist is stopping.
Using two separate queues forgoes an optimization performed by the queue, in which subsequent updates affecting the same "change" (e.g. instances in the case of the instance ring) can be merged together. However, CAS updates typically modify different parts of the key (i.e. different instances in the instance ring) than incoming gossip messages, so this doesn't look like a problem.
Checklist
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]