The etcd client does not use auto-sync and fails when the PD cluster address changes #8812
Labels: affects-6.1, affects-6.5, affects-7.1, area/ticdc (Issues or PRs related to TiCDC), severity/major, type/bug (The issue is confirmed as a bug)
TiCDC may shut down or get stuck in this scenario:

```go
// campaign to be an owner.
func (c *captureImpl) campaign(ctx context.Context) error {
	failpoint.Inject("capture-campaign-compacted-error", func() {
		failpoint.Return(errors.Trace(mvcc.ErrCompacted))
	})
	// TODO: `Campaign` can get stuck when SIGSTOP is sent to the PD leader.
	// When SIGSTOP is sent to the PD leader, cdc may call `cancel`
	// (caused by the `processor routine` exiting). Inside `Campaign`, the routine
	// returns from `waitDeletes` (https://github.com/etcd-io/etcd/blob/main/client/v3/concurrency/election.go#L93),
	// then calls `Resign` (note: it uses `client.Ctx`) against the etcd server. But the etcd server
	// (the one the client connects to) has entered the STOP state, which means that
	// the server cannot process the request but will still keep the gRPC
	// connection open. So the routine blocks in `Resign`.
	return cerror.WrapError(cerror.ErrCaptureCampaignOwner, c.election.campaign(ctx, c.info.ID))
}
```
ref: pingcap/tidb#42643
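As a hedged sketch of the direction the issue title points to: the etcd Go client (go.etcd.io/etcd/client/v3) exposes an `AutoSyncInterval` option that periodically refreshes the client's endpoint list from the cluster. The endpoints, interval, and timeouts below are illustrative assumptions, not the values TiCDC actually uses.

```go
package main

import (
	"context"
	"fmt"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	// Illustrative endpoints; in TiCDC these would be the PD addresses.
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"http://pd-1:2379", "http://pd-2:2379", "http://pd-3:2379"},
		DialTimeout: 5 * time.Second,
		// AutoSyncInterval makes the client periodically refresh its endpoint
		// list from the cluster, so it can follow membership changes such as
		// the scale-out/scale-in described in the reproduce steps below.
		AutoSyncInterval: 30 * time.Second,
	})
	if err != nil {
		panic(err)
	}
	defer cli.Close()

	// Sync can also be called explicitly to refresh endpoints on demand.
	ctx, cancel := context.WithTimeout(context.Background(), 3*time.Second)
	defer cancel()
	if err := cli.Sync(ctx); err != nil {
		panic(err)
	}
	fmt.Println("current endpoints:", cli.Endpoints())
}
```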
Reproduce steps:

1. Start a TiDB cluster with 3 PDs (① ② ③) and a TiCDC node connected to it.
2. Scale out 3 more PDs (④ ⑤ ⑥).
3. Wait 31 seconds.
4. Scale in the original PDs ① ② ③, so the client's initial endpoints all disappear (a sketch of the endpoint re-sync the client would need follows this list).
5. Result: TiCDC is restarted.
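To make the failure mode concrete, here is a minimal hand-rolled sketch (not TiCDC code) of keeping a client's endpoint list in sync with the cluster by calling `Sync` periodically; without something like this, or `AutoSyncInterval`, the client only knows the original PDs ① ② ③ and starts failing once they are scaled in. The package name, function, and interval are assumptions for illustration.

```go
// Package pdsync is a hypothetical helper used only to illustrate the bug.
package pdsync

import (
	"context"
	"log"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

// periodicSync keeps an etcd client's endpoint list fresh by calling Sync on a
// fixed interval, a hand-rolled equivalent of AutoSyncInterval.
func periodicSync(ctx context.Context, cli *clientv3.Client, interval time.Duration) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			syncCtx, cancel := context.WithTimeout(ctx, 3*time.Second)
			// Sync queries the cluster for its current member list and replaces
			// the client's endpoints, so requests keep working after the
			// original PDs ① ② ③ are removed.
			if err := cli.Sync(syncCtx); err != nil {
				log.Printf("endpoint sync failed: %v", err)
			}
			cancel()
			log.Printf("current endpoints: %v", cli.Endpoints())
		}
	}
}
```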