sink to external storage (cdc) depends on local clock to generate file/dir name is not reliable for distributed system #10374
Labels: type/enhancement (the issue or PR belongs to an enhancement)
Comments
zhangjinpeng87 added the type/enhancement label on Dec 27, 2023
#10351 uses the PD clock as a short-term fix.
This was referenced Jan 22, 2024
CharlesCheung96 added commits to ti-chi-bot/tiflow that referenced this issue on Jan 24, 2024 (four commits), Feb 7, 2024, and Feb 9, 2024
Description
Sink to external storage outputs DDL and DML events to files and then uploads them to external storage such as S3. TiCDC's path generator (https://github.com/pingcap/tiflow/blob/master/pkg/sink/cloudstorage/path.go) uses the local monotonic clock's time for dir/file names, which is not reliable in some cases. For example, suppose a changefeed is handled by CDC node1, node1 crashes or restarts, and the changefeed is rescheduled to CDC node2. If there is clock drift between node1 and node2, then from the consumer's perspective time may appear to rewind in the generated names, which may cause the consumer to miss some data.
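To make the rewind concrete, here is a minimal sketch in Go. It is not TiCDC's code: the name layout, `dirName`, and `consumerSkips` are hypothetical, and the consumer is assumed to read only names that sort after its last checkpoint.

```go
package main

import (
	"fmt"
	"time"
)

// dirName turns a node's local clock reading (Unix milliseconds) into a
// lexicographically sortable name. The layout is hypothetical and is not
// the scheme used by TiCDC's path generator.
func dirName(localUnixMilli int64) string {
	return time.UnixMilli(localUnixMilli).UTC().Format("2006-01-02T15-04-05.000")
}

// consumerSkips models a consumer that only reads names sorting strictly
// after its checkpoint; it reports whether `next` would be skipped.
func consumerSkips(checkpoint, next string) bool {
	return next <= checkpoint
}

func main() {
	// An arbitrary instant in Unix milliseconds, standing in for real time.
	const base int64 = 1_704_110_400_000

	// node1's clock runs 2s ahead of real time when it writes its last name.
	node1Dir := dirName(base + 2000)

	// node1 crashes; node2, whose clock is accurate, takes over 500ms later.
	node2Dir := dirName(base + 500)

	fmt.Println("node1:", node1Dir)
	fmt.Println("node2:", node2Dir)
	// node2's name sorts before node1's, so the consumer skips it.
	fmt.Println("skipped:", consumerSkips(node1Dir, node2Dir))
}
```

In this sketch node2's output is written later in real time but sorts earlier, so a consumer that advances by name never reads it.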
Enhancement
As a distributed system, TiCDC should use a reliable source, such as a globally monotonic timestamp, to generate file/dir names. TiCDC would then work as expected in cross-region deployments, when there is clock drift between nodes, and in other extreme cases.
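One way to picture this enhancement is a path generator that takes its timestamps from a shared source instead of the node's own clock. The sketch below is illustrative only: `GlobalClock`, `fakeGlobalClock`, and `PathGenerator` are hypothetical names, and the in-process counter merely stands in for a real cluster-wide timestamp service.

```go
package main

import (
	"fmt"
	"sync"
)

// GlobalClock is a hypothetical abstraction over a cluster-wide,
// strictly increasing timestamp source.
type GlobalClock interface {
	Now() uint64
}

// fakeGlobalClock stands in for a real timestamp service in this sketch.
// Every call returns a value larger than all previous ones.
type fakeGlobalClock struct {
	mu   sync.Mutex
	last uint64
}

func (c *fakeGlobalClock) Now() uint64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.last++
	return c.last
}

// PathGenerator (hypothetical) names directories from the shared clock,
// so the result does not depend on which node is running it.
type PathGenerator struct {
	node  string
	clock GlobalClock
}

// NextDir returns a zero-padded name so that string order matches
// timestamp order.
func (g *PathGenerator) NextDir() string {
	return fmt.Sprintf("%020d", g.clock.Now())
}

func main() {
	clock := &fakeGlobalClock{}
	node1 := &PathGenerator{node: "node1", clock: clock}
	node2 := &PathGenerator{node: "node2", clock: clock}

	before := node1.NextDir() // written before the changefeed moves
	after := node2.NextDir()  // written after rescheduling to node2

	fmt.Println("order preserved:", before < after)
}
```

Because both generators draw from the same source, a name produced after rescheduling always sorts after the names produced before it, regardless of local clock drift.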
Alternatives
Enable NTP to keep the clock drift between nodes under some threshold, such as 500ms, and make sure the changefeed waits for a safe time range after it is rescheduled to another node. This makes TiCDC strongly dependent on these preparations, which is not a good design for a distributed system.
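For comparison, the waiting rule in this alternative could look roughly like the sketch below. `canStartWriting` is a hypothetical helper, and it assumes NTP really does keep the pairwise drift under the configured bound and that the clocks do not drift further during the wait.

```go
package main

import "fmt"

// canStartWriting (hypothetical) reports whether a node that took over a
// changefeed at takeoverLocalMs may begin generating names at nowLocalMs.
// Both values are readings of the new node's own clock in milliseconds.
// If the previous node's clock was at most maxDriftMs ahead, waiting longer
// than maxDriftMs ensures new names sort after the previous node's names.
func canStartWriting(takeoverLocalMs, nowLocalMs, maxDriftMs int64) bool {
	return nowLocalMs-takeoverLocalMs > maxDriftMs
}

func main() {
	const maxDriftMs = 500 // the example threshold from the text

	fmt.Println(canStartWriting(1000, 1400, maxDriftMs)) // still inside the unsafe window
	fmt.Println(canStartWriting(1000, 1501, maxDriftMs)) // safe window has passed
}
```

The check is only as good as the drift bound behind it, which is the operational dependency this alternative introduces.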