Storage sink (csv protocol) data inconsistency after CDC scale and owner switch #8805

Closed
fubinzh opened this issue Apr 18, 2023 · 8 comments

fubinzh commented Apr 18, 2023

What did you do?

  1. Create a storage sink changefeed that uses the csv protocol
bash-5.1# cat /tmp/changefeed.toml
[sink]
protocol='csv'
[sink.csv]
include-commit-ts=true

# cdc  cli  changefeed  create "--server=127.0.0.1:8301" "--sink-uri=s3://tmp/ticdc-cloud-storage-test/storage-cdc-scale-sync-cgv1vrqi0jpfd0985tr0?access-key=minioadmin&secret-access-key=minioadmin&endpoint=http://minio.pingcap.net:9001&force-path-style=true&enable-tidb-extension=true" "--changefeed-id=storage-cdc-scale-sync-cgv1vrqi0jpfd0985tr0" "--config=/tmp/changefeed.toml"
  2. Run the storage consumer to replicate the data to the downstream
  3. Run sysbench prepare
sysbench --db-driver=mysql --mysql-host=`nslookup upstream-tidb.cdc-testbed-tps-1687429-1-63 | awk -F: '{print $2}' | awk 'NR==5' | sed s/[[:space:]]//g`  --mysql-port=4000 --mysql-user=root --mysql-db=workload --tables=32 --table-size=100000 --create_secondary=off --debug=true --threads=32 --mysql-ignore-errors=2013,1213,1105,1205,8022,8027,8028,9004,9007,1062 oltp_write_only prepare
  4. Run sysbench and, at the same time, scale CDC and switch the owner (scale CDC from 6 -> 1 -> 5 -> 2)
sysbench --db-driver=mysql --mysql-host=`nslookup upstream-tidb.cdc-testbed-tps-1687429-1-63 | awk -F: '{print $2}' | awk 'NR==5' | sed s/[[:space:]]//g`  --mysql-port=4000 --mysql-user=root --mysql-db=workload --tables=32 --table-size=100000 --create_secondary=off --time=1800 --debug=true --threads=32 --mysql-ignore-errors=2013,1213,1105,1205,8022,8027,8028,9004,9007,1062 oltp_write_only run
  5. Wait for the data to sync to the consumer's target, then run a data consistency check
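
The steps above do not say which tool ran the consistency check. As a rough illustration of what such a check compares, here is a minimal Python sketch. It assumes the rows of each table have already been fetched from the upstream and the downstream as tuples. `table_checksum` and `diff_tables` are hypothetical helper names, not part of TiCDC or sysbench.

```python
import hashlib
from typing import Dict, Iterable, List, Sequence, Tuple


def table_checksum(rows: Iterable[Sequence[object]]) -> Tuple[int, int]:
    """Return (row_count, checksum) for a table, independent of row order."""
    count = 0
    acc = 0
    for row in rows:
        # Hash each row on its own, then add the digests modulo 2**64,
        # so the result does not depend on the order rows were read in.
        encoded = "\x1f".join(repr(v) for v in row).encode("utf-8")
        digest = hashlib.sha256(encoded).digest()
        acc = (acc + int.from_bytes(digest[:8], "big")) % (1 << 64)
        count += 1
    return count, acc


def diff_tables(
    upstream: Dict[str, List[tuple]],
    downstream: Dict[str, List[tuple]],
) -> List[str]:
    """Return the names of tables whose row count or checksum differ."""
    mismatched = []
    for name in sorted(set(upstream) | set(downstream)):
        up = table_checksum(upstream.get(name, []))
        down = table_checksum(downstream.get(name, []))
        if up != down:
            mismatched.append(name)
    return mismatched


# Made-up rows standing in for query results (not data from this issue).
upstream_rows = {
    "sbtest1": [(1, 10, "a"), (2, 20, "b")],
    "sbtest2": [(1, 5, "x")],
}
downstream_rows = {
    "sbtest1": [(2, 20, "b"), (1, 10, "a")],  # same rows, different order
    "sbtest2": [(1, 6, "x")],                 # one column value differs
}
print(diff_tables(upstream_rows, downstream_rows))  # prints ['sbtest2']
```

A per-table checksum only tells you which tables differ. Finding the differing rows needs a row-level comparison on the mismatched tables.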

What did you expect to see?

The downstream data should be consistent with the upstream.

What did you see instead?

The downstream data is inconsistent with the upstream.
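
One possible way to narrow down an inconsistency like this is to scan the CSV files written by the storage sink for rows whose commit-ts goes backwards for the same key. The sketch below is an illustration only, not a diagnosis of this issue. It assumes the column order is operation, table, schema, commit-ts, then the data columns (the layout assumed here for `include-commit-ts=true`), and that the first data column is the primary key, as with the sysbench `id` column. `find_commit_ts_regressions` is a hypothetical helper, and the sample rows are made up.

```python
import csv
import io
from typing import Dict, List, Tuple

Key = Tuple[str, str, str]  # (schema, table, primary-key value)


def find_commit_ts_regressions(text: str) -> List[Tuple[int, Key, int, int]]:
    """Return (record_number, key, previous_ts, commit_ts) for every row whose
    commit-ts is lower than one already seen for the same key."""
    last_ts: Dict[Key, int] = {}
    regressions = []
    for recno, rec in enumerate(csv.reader(io.StringIO(text)), start=1):
        if len(rec) < 5:
            continue  # skip blank or incomplete records
        # ASSUMPTION: columns are op, table, schema, commit-ts, data columns...
        table, schema, commit_ts = rec[1], rec[2], int(rec[3])
        # ASSUMPTION: the first data column is the primary key.
        key = (schema, table, rec[4])
        prev = last_ts.get(key)
        if prev is not None and commit_ts < prev:
            regressions.append((recno, key, prev, commit_ts))
        else:
            last_ts[key] = commit_ts
    return regressions


# Made-up sample rows, not taken from the logs attached to this issue.
SAMPLE = "\n".join([
    "I,sbtest1,workload,100,1,10,c,p",
    "U,sbtest1,workload,105,1,11,c,p",
    "U,sbtest1,workload,103,1,12,c,p",  # older than the previous row for id=1
    "I,sbtest2,workload,101,1,5,c,p",
])
print(find_commit_ts_regressions(SAMPLE))
```

For the sample above, only the third record is reported, because its commit-ts (103) is lower than the 105 already seen for the same key.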

[Attached image: l8jtkeZMLk]

Versions of the cluster

bash-5.1# /cdc version
Release Version: v6.5.2
Git Commit Hash: 1f3151f933c2243af3832177c1356ca842a4c3a7
Git Branch: heads/refs/tags/v6.5.2
UTC Build Time: 2023-04-12 13:56:59
Go Version: go version go1.19.8 linux/amd64
Failpoint Build: false

fubinzh added the area/ticdc and type/bug labels Apr 18, 2023
fubinzh changed the title from "Storage sink (csv protocal) data inconsistency after CDC scale and owner switch" to "Storage sink (csv protocol) data inconsistency after CDC scale and owner switch" Apr 18, 2023

fubinzh commented Apr 18, 2023

cdc log:
cdc-1687429-0.log
cdc-1687429-1.log


fubinzh commented Apr 18, 2023

/found automation

ti-chi-bot added the found/automation label Apr 18, 2023

fubinzh commented Apr 18, 2023

/severity major

nongfushanquan commented

/label affects-6.5


fubinzh commented Apr 19, 2023

/remove-severity major


fubinzh commented Apr 19, 2023

/severity critical

nongfushanquan commented

/close

ti-chi-bot commented

@nongfushanquan: Closing this issue.

In response to this:

/close
