[Bug] Duplicate Split which cause duplicate data when open scanNewlyAddedTableEnabled #2095

Closed
2 tasks done
EMsnap opened this issue Apr 20, 2023 · 0 comments · Fixed by #2096
Labels
bug Something isn't working

EMsnap commented Apr 20, 2023

Search before asking

  • I searched in the issues and found nothing similar.

Flink version

Flink 1.13 to Flink 1.16 (all versions)

Flink CDC version

Latest

Database and its version

MySQL (all versions)

Minimal reproduce step

1. Enable scan.newly-added-table.enabled on the MySQL source.
2. Run the job and let it enter the snapshot phase.
3. Stop the job and restart it while the snapshot phase is still in progress.


The chunks that were generated before the stop are generated again after the restart, so their data is read twice and ends up duplicated.
This is unacceptable when a table contains a large amount of data, because re-reading the duplicated chunks costs a lot of extra time.
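
To make the restart behaviour described above concrete, here is a minimal, self-contained sketch. It is a toy model, not the Flink CDC implementation: the class `DuplicateSplitSketch`, the `AssignerState` type, the `dedupe` flag, and the table name `orders` are all hypothetical names chosen for illustration. It assumes that the assigner's restored state records which chunks were already generated, and compares an assigner that ignores that state with one that consults it.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy model of a snapshot split assigner (hypothetical, not Flink CDC code). */
public class DuplicateSplitSketch {

    /** Hypothetical assigner state that would survive a stop and restart. */
    static final class AssignerState {
        final Set<String> generatedSplits = new LinkedHashSet<>();
    }

    /** Splits a table into numbered chunks such as "orders:0". */
    static List<String> chunksOf(String table, int chunkCount) {
        List<String> chunks = new ArrayList<>();
        for (int i = 0; i < chunkCount; i++) {
            chunks.add(table + ":" + i);
        }
        return chunks;
    }

    /**
     * Emits at most {@code limit} chunks in one run of the job.
     * With dedupe == false the restored state is ignored, which models
     * the behaviour reported in this issue.
     */
    static List<String> run(AssignerState state, String table,
                            int chunkCount, int limit, boolean dedupe) {
        List<String> emitted = new ArrayList<>();
        for (String chunk : chunksOf(table, chunkCount)) {
            if (emitted.size() == limit) {
                break; // the job is stopped at this point
            }
            if (dedupe && state.generatedSplits.contains(chunk)) {
                continue; // already generated before the restart
            }
            state.generatedSplits.add(chunk);
            emitted.add(chunk);
        }
        return emitted;
    }

    /** First run stops after 2 of 4 chunks; the second run is the restart. */
    static List<String> simulateRestart(boolean dedupe) {
        AssignerState state = new AssignerState();
        List<String> all = new ArrayList<>(run(state, "orders", 4, 2, dedupe));
        all.addAll(run(state, "orders", 4, 4, dedupe));
        return all;
    }

    public static void main(String[] args) {
        System.out.println("state ignored:   " + simulateRestart(false));
        System.out.println("state consulted: " + simulateRestart(true));
    }
}
```

In this model, ignoring the state emits the first two chunks twice, while consulting it emits each of the four chunks exactly once. The actual fix in the connector may work differently.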

What did you expect to see?

No duplicate splits when the job restarts.

What did you see instead?

Duplicate splits when the job restarts.

Anything else?

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!