Add max wait count for transfer leader operator. #147
@@ -21,6 +21,12 @@ import (
 	"github.com/pingcap/kvproto/pkg/raftpb"
 )

+const (
+	// TODO: we can make this as a config flag.
+	// maxWaitCount is the heartbeat count when we check whether the operator is successful.
+	maxWaitCount = 3
+)
+
 // Operator is the interface to do some operations.
 type Operator interface {
 	// Do does the operator, if finished then return true.
@@ -176,16 +182,19 @@ func (co *ChangePeerOperator) Do(region *metapb.Region, leader *metapb.Peer) (bo

 // TransferLeaderOperator is used to do leader transfer.
 type TransferLeaderOperator struct {
-	mustSuccess bool
+	count        int
+	maxWaitCount int

 	oldLeader *metapb.Peer
 	newLeader *metapb.Peer
 }

-func newTransferLeaderOperator(oldLeader, newLeader *metapb.Peer) *TransferLeaderOperator {
+func newTransferLeaderOperator(oldLeader, newLeader *metapb.Peer, waitCount int) *TransferLeaderOperator {
 	return &TransferLeaderOperator{
-		oldLeader: oldLeader,
-		newLeader: newLeader,
+		oldLeader:    oldLeader,
+		newLeader:    newLeader,
+		count:        0,
+		maxWaitCount: waitCount,
 	}
 }

@@ -220,16 +229,21 @@ func (lto *TransferLeaderOperator) Do(region *metapb.Region, leader *metapb.Peer
 		return true, nil, nil
 	}

-	// If lto.mustSuccess is true, then lto.check should always be true.
-	if lto.mustSuccess {
-		return false, nil, errors.Errorf("transfer leader operator called twice - %v", lto)
+	// If lto.count is greater than 0, then we should check whether it exceeds the lto.maxWaitCount.
+	if lto.count > 0 {
If someone sets `maxWaitCount` to zero, you will get an infinite loop, so it is better to add a check in `newTransferLeaderOperator` or here.
Yep, later we will move it to a config flag.
+		if lto.count >= lto.maxWaitCount {
+			return false, nil, errors.Errorf("transfer leader operator called %d times but still be unsuccessful - %v", lto.count, lto)
+		}
+
+		lto.count++
+		return false, nil, nil
 	}

 	res := &pdpb.RegionHeartbeatResponse{
 		TransferLeader: &pdpb.TransferLeader{
 			Peer: lto.newLeader,
 		},
 	}
-	lto.mustSuccess = true
+	lto.count++
 	return false, res, nil
 }
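The counting logic in this hunk, together with the constructor check suggested in the review, could look roughly like the sketch below. It is not the PR's code: `waitOperator`, `newWaitOperator`, and `errStillUnsuccessful` are hypothetical names, the region and leader arguments are dropped, and the early "transfer already succeeded" return is omitted so that only the wait counter remains.

```go
package main

import (
	"errors"
	"fmt"
)

// waitOperator is a hypothetical stand-in for TransferLeaderOperator.
// It keeps only the counting fields from the diff.
type waitOperator struct {
	count        int // number of times Do has advanced so far
	maxWaitCount int // heartbeats to wait before reporting failure
}

// errStillUnsuccessful is an assumed sentinel error for this sketch.
var errStillUnsuccessful = errors.New("operator still unsuccessful after waiting")

// newWaitOperator rejects non-positive wait counts up front, which is
// one possible way to add the check asked for in the review.
func newWaitOperator(waitCount int) (*waitOperator, error) {
	if waitCount <= 0 {
		return nil, fmt.Errorf("waitCount must be positive, got %d", waitCount)
	}
	return &waitOperator{maxWaitCount: waitCount}, nil
}

// Do mirrors the control flow of the hunk: the first call asks for the
// transfer (send is true), later calls wait, and once count reaches
// maxWaitCount the operator gives up with an error.
func (op *waitOperator) Do() (send bool, err error) {
	if op.count > 0 {
		if op.count >= op.maxWaitCount {
			return false, errStillUnsuccessful
		}
		op.count++
		return false, nil
	}
	op.count++
	return true, nil
}
```

With a wait count of 3, this sketch requests the transfer on the first call, waits on the second and third, and returns the error on the fourth.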
The default value of count is already zero.
Yes, it is a variable like an index; we should update it in the `Do` function.
`count` and `maxWaitCount` confuse me. You can initialize `count` as `waitCount` and then use `lto.count < 0` and `lto.count--` to check.
It seems not. We must store the original `maxWaitCount` to make sure that we return the transfer leader response the first time. If we only use `count`, we cannot distinguish that case.
Got it.
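The trade-off in this thread can be illustrated with a small sketch. A pure countdown cannot tell the first call from a later one, so the hypothetical variant below adds a `sent` flag. That is the same kind of extra state the PR keeps by storing `maxWaitCount` next to `count`. All names here (`countdownOperator`, `newCountdownOperator`, `errGaveUp`) are assumptions for illustration and do not appear in the PR.

```go
package main

import "errors"

// countdownOperator is a hypothetical countdown-style variant,
// not code from the PR.
type countdownOperator struct {
	remaining int  // heartbeats left before giving up
	sent      bool // whether the transfer request was already returned
}

// errGaveUp is an assumed sentinel error for this sketch.
var errGaveUp = errors.New("transfer leader still unsuccessful")

func newCountdownOperator(waitCount int) *countdownOperator {
	return &countdownOperator{remaining: waitCount}
}

// Do returns send == true exactly once, on the first call. Without the
// sent flag, a counter that only decrements could not identify that call.
func (op *countdownOperator) Do() (send bool, err error) {
	if !op.sent {
		op.sent = true
		op.remaining--
		return true, nil
	}
	if op.remaining <= 0 {
		return false, errGaveUp
	}
	op.remaining--
	return false, nil
}
```

Either design needs two pieces of state, which is why the thread settles on keeping both fields.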