
[dr-autosync]v6.5 changefeed checkpoint ts lag up to 20min after down backup dc #9532

Closed
mayjiang0203 opened this issue Aug 9, 2023 · 7 comments

Comments


mayjiang0203 commented Aug 9, 2023

What did you do?

1. Deploy one cluster in dr-autosync mode, with 3 PD, 3 TiKV, and 1 TiCDC in the primary datacenter, and 2 PD, 3 TiKV, and 1 TiCDC in the backup datacenter.
2. Take down the backup DC, including its 2 PD, 1 TiCDC, and 3 TiKV nodes.

What did you expect to see?

Changefeeds should continue to function properly, unaffected by the backup DC outage.

What did you see instead?

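[Four screenshots were attached here and are not preserved. Per the issue title, the changefeed checkpoint ts lag rose to as much as 20 min after the backup DC went down.]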

Versions of the cluster

Upstream TiDB cluster version (execute SELECT tidb_version(); in a MySQL client):

(not provided)

Upstream TiKV version (execute tikv-server --version):


[root@tikv3-0 ~]# /tiup/deploy/tikv-20160/bin/tikv-server -V
TiKV
Release Version: 6.5.3
Edition: Community
Git Commit Hash: cb0dab3e4e0f3081966e871057272a40aa7ffc2d
Git Commit Branch: heads/refs/tags/v6.5.3-flashback
UTC Build Time: 2023-08-01 05:02:02
Rust Version: rustc 1.67.0-nightly (96ddd32c4 2022-11-14)
Enable Features: pprof-fp jemalloc mem-profiling portable sse test-engine-kv-rocksdb test-engine-raft-raft-engine cloud-aws cloud-gcp cloud-azure
Profile: dist_release


TiCDC version (execute `cdc version`):

```console
[root@ticdc1-0 ~]# /tiup/deploy/cdc-8300/bin/cdc version
Release Version: v6.5.3-drautosync
Git Commit Hash: aa97333
Git Branch: heads/refs/tags/v6.5.3-drautosync
UTC Build Time: 2023-07-24 06:27:18
Go Version: go version go1.19.11 linux/amd64
Failpoint Build: false
```

Downstream version

/ # /tikv-server -V
TiKV 
Release Version:   6.5.3
Edition:           Community
Git Commit Hash:   fd5f88a7fdda1bf70dcb0d239f60137110c54d46
Git Commit Branch: heads/refs/tags/v6.5.3
UTC Build Time:    2023-06-09 10:55:15
Rust Version:      rustc 1.67.0-nightly (96ddd32c4 2022-11-14)
Enable Features:   pprof-fp jemalloc mem-profiling portable sse test-engine-kv-rocksdb test-engine-raft-raft-engine cloud-aws cloud-gcp cloud-azure
Profile:           dist_release

mayjiang0203 added the area/ticdc and type/bug labels Aug 9, 2023

fubinzh commented Aug 10, 2023

/severity major

nongfushanquan (Contributor) commented

What does "dr-autosync" mode mean? If the downstream crashes, why would the lag not be affected?


fubinzh commented Aug 11, 2023

@nongfushanquan For dr-autosync, see https://docs.pingcap.com/tidb/stable/two-data-centers-in-one-city-deployment. The backup DC is not the downstream; I believe it is the "DR DC" in the dr-autosync topology.

mayjiang0203 (Author) commented

/assign @asddongmen

mayjiang0203 (Author) commented


This did not reproduce in recent v6.5.0 testing: after the backup DC had been down for more than 5 hours, the lag was still under 10s.
So I am closing this issue.
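
For anyone checking whether this recurs, here is a minimal sketch of how checkpoint lag can be derived from a changefeed's checkpoint ts. It assumes the standard TiDB TSO layout (physical milliseconds since the Unix epoch in the high bits, an 18-bit logical counter in the low bits). The function names are hypothetical and not part of TiCDC, and the checkpoint value has to be obtained separately from the changefeed status.

```python
from datetime import datetime, timedelta, timezone

# Assumption: TiDB TSO layout, where the low 18 bits are a logical counter
# and the remaining high bits are physical milliseconds since the Unix epoch.
TSO_LOGICAL_BITS = 18
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def compose_tso(physical_ms: int, logical: int = 0) -> int:
    """Build a TSO from a physical millisecond timestamp (illustrative helper)."""
    return (physical_ms << TSO_LOGICAL_BITS) | logical


def tso_to_datetime(tso: int) -> datetime:
    """Extract the physical part of a TSO as a UTC datetime."""
    physical_ms = tso >> TSO_LOGICAL_BITS
    return EPOCH + timedelta(milliseconds=physical_ms)


def checkpoint_lag_seconds(checkpoint_tso: int, now: datetime) -> float:
    """Lag between the supplied wall-clock time and the checkpoint's physical time."""
    return (now - tso_to_datetime(checkpoint_tso)).total_seconds()


if __name__ == "__main__":
    # Hypothetical values: a checkpoint that is 20 minutes behind "now".
    now = datetime(2023, 8, 9, 12, 0, 0, tzinfo=timezone.utc)
    behind = now - timedelta(minutes=20)
    physical_ms = (behind - EPOCH) // timedelta(milliseconds=1)
    checkpoint = compose_tso(physical_ms)
    print(checkpoint_lag_seconds(checkpoint, now))
```

Passing `now` explicitly keeps the calculation independent of the local clock, which matters when comparing against timestamps taken from another machine.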

mayjiang0203 (Author) commented Aug 18, 2023

Reopening, since no related fix has landed recently. Setting severity to moderate and waiting to see whether it happens again.
/severity moderate

asddongmen (Contributor) commented

Closing this since it's stale. If you see it again, feel free to reopen it and please contact me.
