TiCDC replication is interrupted when multiple TiKVs crash or restart ungracefully #3288
Labels
affects-4.0
affects-5.0
affects-5.1
affects-5.2
area/ticdc
Issues or PRs related to TiCDC.
component/kv-client
TiKV kv log client component.
severity/major
type/bug
The issue is confirmed as a bug.
What did you do?
We can reproduce it with the following steps:
1. Restart all TiKV nodes, for example with the command
tiup cluster restart <cluster-name> -R tikv
2. Compare the regions that TiCDC has initialized with all regions of the table, which can be listed from TiDB with
select region_id from information_schema.tikv_region_status where db_name = 'xx' and table_name = 'yy'
Some regions turn out to be lost; a sketch of this comparison follows the steps.
3. Search the TiCDC log for the lost region IDs: the regions were disconnected and never reconnected.
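As a rough illustration of the comparison in step 2, the sketch below (not part of TiCDC) queries TiDB for all region IDs of the table and diffs them against the region IDs that TiCDC reported as initialized. The DSN, the 'xx'/'yy' database and table names, and the initializedRegions set (for example collected by grepping the TiCDC log) are placeholders.

```go
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/go-sql-driver/mysql" // MySQL protocol driver, also works for TiDB
)

func main() {
	// Placeholder DSN: TiDB listens on port 4000 by default.
	db, err := sql.Open("mysql", "root:@tcp(127.0.0.1:4000)/")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	rows, err := db.Query(
		`SELECT region_id FROM information_schema.tikv_region_status
		 WHERE db_name = ? AND table_name = ?`, "xx", "yy")
	if err != nil {
		log.Fatal(err)
	}
	defer rows.Close()

	// Region IDs that TiCDC reported as initialized, e.g. extracted from its log.
	initializedRegions := map[uint64]bool{ /* ... */ }

	for rows.Next() {
		var id uint64
		if err := rows.Scan(&id); err != nil {
			log.Fatal(err)
		}
		if !initializedRegions[id] {
			fmt.Printf("region %d exists in TiKV but was never initialized by TiCDC\n", id)
		}
	}
	if err := rows.Err(); err != nil {
		log.Fatal(err)
	}
}
```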
The root cause is that the kv client must recycle all failed regions, so the root context of the kv client should be used to call
onRegionFail
This bug tends to happen when multiple TiKVs crash or are forcibly restarted; based on existing tests, a single TiKV crash or restart does not trigger it. The more regions there are, the higher the probability.
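The following is a minimal, hypothetical sketch (not the actual tiflow code) of why the context passed to onRegionFail matters. Only the name onRegionFail comes from this issue; the regionInfo type, the retry channel, and the surrounding setup are illustrative assumptions. With a context tied to a single TiKV stream, the hand-off of the failed region is skipped once that stream's context is canceled, so the region is never reconnected; with the kv client's root context it is always recycled.

```go
package main

import (
	"context"
	"fmt"
)

// regionInfo stands in for the per-region state the kv client keeps.
type regionInfo struct{ id uint64 }

// onRegionFail hands a failed region back for re-scheduling. If the supplied
// context is already canceled, the hand-off is skipped and the region is
// silently lost -- the failure mode described in this issue when a
// per-stream context is used instead of the kv client's root context.
func onRegionFail(ctx context.Context, retryCh chan<- regionInfo, r regionInfo) {
	if ctx.Err() != nil {
		fmt.Printf("region %d dropped, never reconnected: %v\n", r.id, ctx.Err())
		return
	}
	retryCh <- r
	fmt.Printf("region %d queued for retry\n", r.id)
}

func main() {
	retryCh := make(chan regionInfo, 16)

	rootCtx := context.Background()                  // lives as long as the kv client
	streamCtx, cancel := context.WithCancel(rootCtx) // tied to one TiKV stream
	cancel()                                         // the stream dies when TiKV crashes or restarts

	onRegionFail(streamCtx, retryCh, regionInfo{id: 1}) // buggy: region lost
	onRegionFail(rootCtx, retryCh, regionInfo{id: 2})   // fixed: region recycled
}
```

Running the sketch prints that region 1 is dropped while region 2 is queued for retry, mirroring the "disconnected without reconnect" behavior observed in the TiCDC log.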
What did you expect to see?
CDC runs normally.
What did you see instead?
Some regions are missing and replication is interrupted.
Versions of the cluster
Upstream TiDB cluster version (execute SELECT tidb_version(); in a MySQL client): v5.2.1
TiCDC version (execute cdc version): master@pingcap/ticdc@37bac66