CDC cloud: row lost when scale-out and scale-in #2244

Closed
Tammyxia opened this issue Jul 8, 2021 · 1 comment
Labels: area/ticdc, severity/critical, type/bug


Tammyxia commented Jul 8, 2021

Bug Report

Please answer these questions before submitting your issue. Thanks!

  1. What did you do? If possible, provide a recipe for reproducing the error.
  • Load data into the upstream TiDB: $ bin/go-ycsb run mysql -P workloads/betting -p operationcount=5000000 -p mysql.host=tidb.88d9efdd.50c5d2e.us-west-2.shared.aws.tidbcloud.com -p mysql.port=4000 --threads 200 -p dbnameprefix=testcc -p databaseproportions=1.0 -p unitnameprefix=unit1 -p unitscount=1 -p tablecount=100 -p mysql.password=12345678

  • While data is loading, scale cdc in and then out:
    1. Scale in cdc: $ kubectl edit tc xxx -> change ticdc replicas from 3 to 1; the cdc owner switches during this step.
    2. Scale out cdc: $ kubectl edit tc xxx -> change ticdc replicas from 1 to 3.

  • Stop data loading and wait for the cdc sync task to complete.

  • Check data consistency with sync-diff, using this config.toml:
    log-level = "info"
    chunk-size = 1000000
    check-thread-count = 40
    sample-percent = 100
    use-checksum = true
    only-use-checksum = false
    use-checkpoint = true
    ignore-data-check = false
    ignore-struct-check = false
    fix-sql-file = "fix.sql"

    [[check-tables]]
    schema = "testcc0"
    tables = ["~^"]

    [[source-db]]
    host = "xxx"
    port = 4000
    user = "root"
    password = "xxx"
    instance-id = "source-1"

    [target-db]
    host = "xxx"
    port = 4000
    user = "root"
    password = "xxx"
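
The scale-in and scale-out step of this recipe is done by hand with kubectl edit. A minimal sketch of scripting it is shown here, assuming the TiCDC replica count lives at spec.ticdc.replicas of the TidbCluster object and that a merge patch is acceptable in place of an interactive edit. The function name build_scale_command and the namespace "my-namespace" are hypothetical; the cluster name "xxx" is the placeholder used in this report. The script only prints the commands, it does not run them.

```python
import json
import shlex
from typing import List


def build_scale_command(cluster: str, namespace: str, replicas: int) -> List[str]:
    """Return a kubectl argv that sets the TiCDC replica count on a TidbCluster.

    Assumption: the replica count is stored at spec.ticdc.replicas.
    """
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    # Merge patch that touches only the TiCDC replica count.
    patch = json.dumps({"spec": {"ticdc": {"replicas": replicas}}})
    return [
        "kubectl", "patch", "tc", cluster,
        "-n", namespace,
        "--type", "merge",
        "-p", patch,
    ]


if __name__ == "__main__":
    # Scale in from 3 to 1, then back out to 3, as in the recipe.
    # "my-namespace" is a hypothetical namespace name.
    for count in (1, 3):
        print(shlex.join(build_scale_command("xxx", "my-namespace", count)))
```

Printing the commands instead of executing them keeps the sketch safe to run anywhere; the scale-out should only be issued once the scale-in has taken effect, which this sketch does not check.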

  2. What did you expect to see?
  • Upstream and downstream data are consistent.
  3. What did you see instead?
  • sync_diff logs many messages about the target lacking data and about differing rows. For example, row counts compared between upstream and downstream:
    [three screenshots, not preserved]
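
A per-table row count comparison like the one described here can be sketched as follows. This is an illustration only, not the check sync_diff performs: it assumes the counts were already collected on each side (for example with SELECT COUNT(*) per table) into plain dictionaries. The function name diff_row_counts, the table names, and the numbers are hypothetical.

```python
from typing import Dict, List, Tuple


def diff_row_counts(
    upstream: Dict[str, int], downstream: Dict[str, int]
) -> List[Tuple[str, int, int]]:
    """Return (table, upstream_count, downstream_count) for tables whose counts differ.

    A table missing on one side is treated as having 0 rows there.
    """
    mismatches = []
    for table in sorted(set(upstream) | set(downstream)):
        up = upstream.get(table, 0)
        down = downstream.get(table, 0)
        if up != down:
            mismatches.append((table, up, down))
    return mismatches


if __name__ == "__main__":
    # Made-up counts for illustration; they are not taken from this issue.
    upstream_counts = {"table_a": 1000, "table_b": 2000}
    downstream_counts = {"table_a": 1000, "table_b": 1990}
    for table, up, down in diff_row_counts(upstream_counts, downstream_counts):
        print(f"{table}: upstream={up} downstream={down} diff={up - down}")
```

Equal row counts do not prove the data is consistent, so a check like this complements the checksum comparison rather than replacing it.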
  4. Versions of the cluster

    • Upstream TiDB cluster version (execute SELECT tidb_version(); in a MySQL client):

      5.0.2
      
    • TiCDC version (execute cdc version):

      ["Welcome to Change Data Capture (CDC)"] [release-version=v5.0.0-dev] [git-hash=9300b05ceec2c4811198416a5d21e9f4910deddd] [git-branch=cloud-cdc-5.0] [utc-build-time="2021-06-26 14:16:57"] [go-version="go version go1.16.3 linux/amd64"] [failpoint-build=false]
      
amyangfei (Contributor) commented Jul 12, 2021

Verify this issue after #2230 is fixed, since

cyliu0 closed this as completed Aug 12, 2021
AkiraXie added the area/ticdc label Mar 9, 2022

4 participants