kvserver: improve handling for removal of a replica, when multiple replicas already exist on the same node #60546
Conversation
but @aayushshah15 should also sign off. Thanks for the quick patch.
How far back will we need to backport this? Also, if we're going to then we should add a release note.
Reviewed 2 of 2 files at r1.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @aayushshah15)
IIUC the bug should only affect 20.2.
I think the backport to 20.2 may or may not be clean; my apologies for that in advance.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @lunevalex)
pkg/kv/kvserver/replica_command.go, line 1153 at r1 (raw file):
}

func validateReplicationChangesMultipleReplicasOnTheSameStore(
s/OnTheSameStore/OnTheSameNode
pkg/kv/kvserver/replica_command.go, line 1208 at r1 (raw file):
}

// Then, check that we're not adding a second replica on nodes that already
should this be s/nodes/stores?
pkg/kv/kvserver/replica_command.go, line 1211 at r1 (raw file):
// have one, or "re-add" an existing replica. We delete from byNodeAndStoreID so that
// after this loop, it contains only Nodes that we haven't seen in desc.
for nodeID, descs := range descriptorsByNodeID {
should we rename descs to something like rDescsForNode, just to be very explicit?
pkg/kv/kvserver/replica_command.go, line 1217 at r1 (raw file):
}
delete(byNodeAndStoreID, nodeID)

// The only valid thing to do when we already have multiple replicas on the
I'm having a bit of a hard time convincing myself that this is always true. When there's a (VOTER, LEARNER) pair on the same node, you're right that we'd expect the allocator to always try to remove that LEARNER, but I'm not sure if we want to make that assumption here?
For instance, the patch below produces an error, should it? I haven't tested this out, but the situation in the patch should be possible if the user intentionally / inadvertently constrains, say, 3 replicas to a node that has 3 stores.
Index: pkg/kv/kvserver/replica_command_test.go
===================================================================
diff --git a/pkg/kv/kvserver/replica_command_test.go b/pkg/kv/kvserver/replica_command_test.go
--- a/pkg/kv/kvserver/replica_command_test.go (revision 688740b898700cd8ea5cd9db3ff0def457b83505)
+++ b/pkg/kv/kvserver/replica_command_test.go (date 1613169923000)
@@ -198,7 +198,7 @@
InternalReplicas: []roachpb.ReplicaDescriptor{
{NodeID: 1, StoreID: 1},
{NodeID: 2, StoreID: 2},
- {NodeID: 1, StoreID: 2, Type: &learnerType},
+ {NodeID: 1, StoreID: 3, Type: &learnerType},
},
}
err = validateReplicationChanges(descRebalancing, roachpb.ReplicationChanges{
@@ -208,7 +208,12 @@
// Test Case 15: same as 14 but remove the second node
err = validateReplicationChanges(descRebalancing, roachpb.ReplicationChanges{
- {ChangeType: roachpb.REMOVE_VOTER, Target: roachpb.ReplicationTarget{NodeID: 1, StoreID: 2}},
+ {ChangeType: roachpb.REMOVE_VOTER, Target: roachpb.ReplicationTarget{NodeID: 1, StoreID: 3}},
+ })
+ require.NoError(t, err)
+
+ err = validateReplicationChanges(descRebalancing, roachpb.ReplicationChanges{
+ {roachpb.ADD_VOTER, roachpb.ReplicationTarget{NodeID: 1, StoreID: 4}},
})
require.NoError(t, err)
pkg/kv/kvserver/replica_command_test.go, line 201 at r1 (raw file):
{NodeID: 1, StoreID: 1},
{NodeID: 2, StoreID: 2},
{NodeID: 1, StoreID: 2, Type: &learnerType},
nit: we should probably have 3 distinct store ids here.
That's right, only 20.2. Good call on adding the release note.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @aayushshah15)
pkg/kv/kvserver/replica_command.go, line 1153 at r1 (raw file):
Previously, aayushshah15 (Aayush Shah) wrote…
s/OnTheSameStore/OnTheSameNode
ack
pkg/kv/kvserver/replica_command.go, line 1208 at r1 (raw file):
Previously, aayushshah15 (Aayush Shah) wrote…
should this be s/nodes/stores?
we check for both and return a slightly different error message in each case
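To make the node-versus-store distinction concrete, here is a minimal sketch of that two-level check. The types and function names are illustrative stand-ins, not the actual kvserver API:

```go
package main

import "fmt"

// replica is a simplified stand-in for roachpb.ReplicaDescriptor.
type replica struct {
	NodeID, StoreID int
}

// groupByNode buckets every replica under its node, mirroring the idea
// behind the descriptorsByNodeID map in the snippet quoted above.
func groupByNode(replicas []replica) map[int][]replica {
	byNode := make(map[int][]replica)
	for _, r := range replicas {
		byNode[r.NodeID] = append(byNode[r.NodeID], r)
	}
	return byNode
}

// rejectSecondReplica sketches the validation being discussed: an ADD is
// rejected with one error if the exact store already holds a replica, and
// with a slightly different error if the node holds one on another store.
func rejectSecondReplica(byNode map[int][]replica, target replica) error {
	existing, ok := byNode[target.NodeID]
	if !ok {
		return nil // node is empty: the add is fine
	}
	for _, r := range existing {
		if r.StoreID == target.StoreID {
			return fmt.Errorf("store s%d on node n%d already has a replica", target.StoreID, target.NodeID)
		}
	}
	return fmt.Errorf("node n%d already has a replica on another store", target.NodeID)
}

func main() {
	byNode := groupByNode([]replica{{NodeID: 1, StoreID: 1}, {NodeID: 2, StoreID: 2}})
	fmt.Println(rejectSecondReplica(byNode, replica{NodeID: 1, StoreID: 1})) // store-level error
	fmt.Println(rejectSecondReplica(byNode, replica{NodeID: 1, StoreID: 3})) // node-level error
	fmt.Println(rejectSecondReplica(byNode, replica{NodeID: 3, StoreID: 3})) // <nil>
}
```

The two error paths are why the comment keeps "nodes" in the prose: the store-level case is caught first, and the remaining case is genuinely about the node.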
pkg/kv/kvserver/replica_command.go, line 1211 at r1 (raw file):
Previously, aayushshah15 (Aayush Shah) wrote…
should we rename descs to something like rDescsForNode, just to be very explicit?
ack
pkg/kv/kvserver/replica_command.go, line 1217 at r1 (raw file):
Previously, aayushshah15 (Aayush Shah) wrote…
I'm having a bit of a hard time convincing myself that this is always true. When there's a (VOTER, LEARNER) pair on the same node, you're right that we'd expect the allocator to always try to remove that LEARNER, but I'm not sure if we want to make that assumption here?

For instance, the patch in my comment above produces an error, should it? I haven't tested this out, but the situation in the patch should be possible if the user intentionally / inadvertently constrains, say, 3 replicas to a node that has 3 stores.

[quoted patch elided; see the diff in the original comment]
We don't make that assumption here, Test Case 14 tests removing the voter and Test Case 15 tests removing the learner, either is allowed.
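That symmetry between Test Cases 14 and 15 can be sketched as follows. The types are illustrative stand-ins, not the real kvserver code: the point is only that the validator accepts a removal targeting either replica of a voter/learner pair on one node, without preferring the learner:

```go
package main

import "fmt"

// replica is a simplified stand-in for roachpb.ReplicaDescriptor.
type replica struct {
	NodeID, StoreID int
	Learner         bool
}

// validateRemoval sketches the behavior described above: when a node
// carries multiple replicas, a removal is valid as long as it names a
// (node, store) pair that actually holds a replica. No assumption is made
// about whether the voter or the learner should go first.
func validateRemoval(desc []replica, nodeID, storeID int) error {
	for _, r := range desc {
		if r.NodeID == nodeID && r.StoreID == storeID {
			return nil
		}
	}
	return fmt.Errorf("removing n%d,s%d which is not in the descriptor", nodeID, storeID)
}

func main() {
	desc := []replica{
		{NodeID: 1, StoreID: 1},                // voter
		{NodeID: 2, StoreID: 2},                // voter
		{NodeID: 1, StoreID: 3, Learner: true}, // learner sharing node 1
	}
	fmt.Println(validateRemoval(desc, 1, 1)) // removing the voter: allowed
	fmt.Println(validateRemoval(desc, 1, 3)) // removing the learner: allowed
}
```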
pkg/kv/kvserver/replica_command_test.go, line 201 at r1 (raw file):
Previously, aayushshah15 (Aayush Shah) wrote…
nit: we should probably have 3 distinct store ids here.
ack
Force-pushed from 688740b to 3b0a29b.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @aayushshah15, @lunevalex, and @nvanbenschoten)
pkg/kv/kvserver/replica_command.go, line 1217 at r1 (raw file):
Previously, lunevalex wrote…
We don't make that assumption here, Test Case 14 tests removing the voter and Test Case 15 tests removing the learner, either is allowed.
I was under the wrong impression that nodes with multiple stores are allowed to carry multiple replicas of a range in steady state. This all makes sense now. The scenario in my patch above shouldn't be possible.
Force-pushed from 7226b9a to 40c0524.
…plicas already exist on the same node

Fixes cockroachdb#60545

The allocator in some cases allows a range to have a replica on multiple stores of the same node. If that happens, it should be able to fix the situation by removing one of the offending replicas. This was only half working due to an ordering problem in how the replicas appeared in the descriptor: it could remove the first replica, but not the second one.

Release note (bug fix): 20.2 introduced the ability to rebalance replicas between multiple stores on the same node. This change fixes a problem with that feature where occasionally an intra-node rebalance would fail and the range would get stuck, permanently under-replicated.
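The ordering problem the commit message describes can be illustrated with a small sketch (illustrative types, not the real kvserver code): a validator that only consults the first replica found for a node can accept removing that replica while rejecting its sibling on another store, whereas keying the lookup by (node, store) is order-independent:

```go
package main

import "fmt"

// replica is a simplified stand-in for roachpb.ReplicaDescriptor.
type replica struct{ NodeID, StoreID int }

// canRemoveBuggy models the pre-fix behavior: only the first replica seen
// for a node is recorded, so whether a removal validates depends on the
// order replicas appear in the descriptor.
func canRemoveBuggy(desc []replica, nodeID, storeID int) bool {
	firstByNode := make(map[int]replica)
	for _, r := range desc {
		if _, ok := firstByNode[r.NodeID]; !ok {
			firstByNode[r.NodeID] = r
		}
	}
	r, ok := firstByNode[nodeID]
	return ok && r.StoreID == storeID
}

// canRemoveFixed keys the lookup by (node, store), so every replica on the
// node is visible and descriptor order no longer matters.
func canRemoveFixed(desc []replica, nodeID, storeID int) bool {
	byNodeAndStore := make(map[[2]int]bool)
	for _, r := range desc {
		byNodeAndStore[[2]int{r.NodeID, r.StoreID}] = true
	}
	return byNodeAndStore[[2]int{nodeID, storeID}]
}

func main() {
	desc := []replica{{1, 1}, {2, 2}, {1, 2}} // two replicas on node 1
	fmt.Println(canRemoveBuggy(desc, 1, 1), canRemoveBuggy(desc, 1, 2)) // true false
	fmt.Println(canRemoveFixed(desc, 1, 1), canRemoveFixed(desc, 1, 2)) // true true
}
```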
Force-pushed from 40c0524 to 306d2e9.
bors r+
Build succeeded.
60633: release-20.2: kvserver: improve handling for removal of a replica, when multiple replicas already exist on the same node r=aayushshah15 a=lunevalex

Backport 1/1 commits from #60546.

/cc @cockroachdb/release

---

Fixes #60545

The allocator in some cases allows a range to have a replica on multiple stores of the same node. If that happens, it should be able to fix the situation by removing one of the offending replicas. This was only half working due to an ordering problem in how the replicas appeared in the descriptor: it could remove the first replica, but not the second one.

Release note: None

Co-authored-by: Alex Lunev <alexl@cockroachlabs.com>