sink(ticdc): change the directory of storage sink only when ddl event occurs #8881

Merged: 5 commits, May 9, 2023

Conversation

CharlesCheung96
Contributor

@CharlesCheung96 CharlesCheung96 commented May 4, 2023

What problem does this PR solve?

Issue Number: close #8890, close #8891

What is changed and how it works?

  1. Change the directory of storage sink only when a DDL event occurs.
  2. Make storage sink support database-level DDL.
  • File path changes:
  1. Table schema file:
    from:
    {scheme}://{prefix}/{schema}/{table}/{table-version-separator}/schema.json
    to:
    {scheme}://{prefix}/{schema}/{table}/meta/schema_{tso}_{schema-crc32-hash}.json
  2. Database schema file:
    {scheme}://{prefix}/{schema}/meta/schema_{tso}_{schema-crc32-hash}.json
  3. Index file:
    from:
    {scheme}://{prefix}/{schema}/{table}/{table-version-separator}/{partition-separator}/{date-separator}/CDC.index
    to:
    {scheme}://{prefix}/{schema}/{table}/{table-version-separator}/{partition-separator}/{date-separator}/meta/CDC.index

Check List

Tests

  • Unit test
  • Integration test

Questions

Will it cause performance regression or break compatibility?
  • It breaks compatibility: the layout of schema-related paths written by storage sink has changed.
Do you need to update user documentation, design documentation or monitoring documentation?
  • Yes. The user documentation and design documentation need to be updated.

Release note

`Fix the issue that the table version directory written by storage sink may change when TiCDC restarts or tables are rescheduled.`
`Support database-level DDL in storage sink.`

@ti-chi-bot
Contributor

ti-chi-bot bot commented May 4, 2023

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • hi-rustin
  • nongfushanquan

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot bot added do-not-merge/needs-linked-issue release-note Denotes a PR that will be considered when it comes time to generate release notes. labels May 4, 2023
@ti-chi-bot
Contributor

ti-chi-bot bot commented May 4, 2023

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@ti-chi-bot ti-chi-bot bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels May 4, 2023
@CharlesCheung96 CharlesCheung96 force-pushed the refactor_storage_sink branch 2 times, most recently from 7b0bf31 to 787d6f8 Compare May 5, 2023 09:10
@ti-chi-bot ti-chi-bot bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels May 5, 2023
@CharlesCheung96
Contributor Author

/test all

@CharlesCheung96
Contributor Author

/test all

@CharlesCheung96
Contributor Author

/test cdc-integration-storage-test

@CharlesCheung96 CharlesCheung96 marked this pull request as ready for review May 6, 2023 05:25
@ti-chi-bot ti-chi-bot bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 6, 2023
@CharlesCheung96
Contributor Author

/test all

@CharlesCheung96
Contributor Author

/test cdc-integration-storage-test

@CharlesCheung96
Contributor Author

/test all

@CharlesCheung96
Contributor Author

/test cdc-integration-storage-test

@CharlesCheung96
Contributor Author

/test cdc-integration-storage-test

Contributor

@nongfushanquan nongfushanquan left a comment

Please add comments describing the layout of the data and schema files.

@ti-chi-bot ti-chi-bot bot added the status/LGT1 Indicates that a PR has LGTM 1. label May 6, 2023
@ti-chi-bot ti-chi-bot bot added the status/can-merge Indicates a PR has been approved by a committer. label May 8, 2023
@CharlesCheung96
Contributor Author

/test all

@CharlesCheung96
Contributor Author

/test all

@CharlesCheung96
Contributor Author

/retest

@CharlesCheung96
Contributor Author

/test dm-compatibility-test

@ti-chi-bot
Contributor

ti-chi-bot bot commented May 8, 2023

@CharlesCheung96: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

  • /test cdc-integration-kafka-test
  • /test cdc-integration-mysql-test
  • /test cdc-integration-storage-test
  • /test dm-compatibility-test
  • /test dm-integration-test
  • /test engine-integration-test
  • /test verify

Use /test all to run the following jobs that were automatically triggered:

  • pingcap/tiflow/ghpr_verify

In response to this:

/test dm-compatibility-test

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@CharlesCheung96
Contributor Author

/test dm-compatibility-test

@ti-chi-bot
Contributor

ti-chi-bot bot commented May 8, 2023

@CharlesCheung96: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

  • /test cdc-integration-kafka-test
  • /test cdc-integration-mysql-test
  • /test cdc-integration-storage-test
  • /test dm-compatibility-test
  • /test dm-integration-test
  • /test engine-integration-test
  • /test verify

Use /test all to run the following jobs that were automatically triggered:

  • pingcap/tiflow/ghpr_verify

In response to this:

/test dm-compatibility-test

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@CharlesCheung96
Contributor Author

/retest

@CharlesCheung96
Contributor Author

/retest

@CharlesCheung96
Contributor Author

/retest

@ti-chi-bot
Contributor

ti-chi-bot bot commented May 9, 2023

@CharlesCheung96: Your PR was out of date, I have automatically updated it for you.

At the same time I will also trigger all tests for you:

/run-all-tests

This triggers some heavy tests that do not always run when a PR is updated.

If the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@CharlesCheung96
Contributor Author

/test dm-integration-test

@ti-chi-bot
Contributor

ti-chi-bot bot commented May 9, 2023

@CharlesCheung96: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name: wip/pull-cdc-integration-storage-test
Commit: d002b27
Details: link
Required: true
Rerun command: /test cdc-integration-storage-test

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@CharlesCheung96
Contributor Author

/test dm-integration-test

@ti-chi-bot ti-chi-bot bot merged commit e114b15 into pingcap:master May 9, 2023
@CharlesCheung96 CharlesCheung96 added needs-cherry-pick-release-7.1 Should cherry pick this PR to release-7.1 branch. needs-cherry-pick-release-6.5 Should cherry pick this PR to release-6.5 branch. labels May 9, 2023
@ti-chi-bot
Member

In response to a cherrypick label: new pull request created to branch release-7.1: #8920.

@ti-chi-bot
Member

In response to a cherrypick label: new pull request created to branch release-6.5: #8921.

ti-chi-bot pushed a commit to ti-chi-bot/tiflow that referenced this pull request May 9, 2023
Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
ti-chi-bot bot pushed a commit that referenced this pull request May 9, 2023
CharlesCheung96 added a commit to ti-chi-bot/tiflow that referenced this pull request May 17, 2023
CharlesCheung96 added a commit to ti-chi-bot/tiflow that referenced this pull request May 17, 2023
ti-chi-bot bot pushed a commit that referenced this pull request May 17, 2023
Labels
  • needs-cherry-pick-release-6.5: Should cherry pick this PR to release-6.5 branch.
  • needs-cherry-pick-release-7.1: Should cherry pick this PR to release-7.1 branch.
  • release-note: Denotes a PR that will be considered when it comes time to generate release notes.
  • size/XXL: Denotes a PR that changes 1000+ lines, ignoring generated files.
  • status/can-merge: Indicates a PR has been approved by a committer.
  • status/LGT2: Indicates that a PR has LGTM 2.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

  • Make storage sink support database level DDL
  • Change the directory of storage sink only when ddl event occurs
4 participants