Re-sync in case of stale offset #48578

Merged

rodireich
Contributor

@rodireich rodireich commented Nov 20, 2024

What

With this change, the MySQL source connector correctly performs a re-sync when a saved CDC offset is found to be stale and the connector is configured to re-sync rather than fail.

How

CdcPartitionsCreator goes through the following flow:

  1. Round 1: All streams are rolled back to their initial state so the next sync attempt starts over.
  2. Round 2: A transient exception is thrown so the platform kicks off another sync attempt.
  3. On the next attempt, all streams are rebuilt from scratch.
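
The flow above can be sketched roughly as follows. This is a simplified illustration, not the connector's code: `InMemoryStateManager`, `PartitionsCreatorSketch`, the `offsetIsStale` parameter, and the string return values are hypothetical names made up for the example, and `TransientErrorException` is redefined locally as a stand-in for the CDK exception.

```kotlin
// Hypothetical stand-in for the CDK's transient exception type.
class TransientErrorException(message: String) : Exception(message)

// Hypothetical state holder. A null state means "no saved state", so the stream starts over.
class InMemoryStateManager(streams: List<String>) {
    private val states: MutableMap<String, String?> = mutableMapOf()

    init {
        for (name in streams) states[name] = "synced"
    }

    // Rolls every stream back to its initial (empty) state.
    fun resetAll() {
        for (name in states.keys.toList()) states[name] = null
    }

    fun stateOf(stream: String): String? = states[stream]
}

class PartitionsCreatorSketch(private val stateManager: InMemoryStateManager) {
    private var cdcNeedsRestart = false

    // Called once per round. offsetIsStale is assumed to come from offset validation.
    fun run(offsetIsStale: Boolean): String {
        if (cdcNeedsRestart) {
            // Round 2: fail the attempt with a transient error so the platform retries.
            throw TransientErrorException("Saved offset is stale; restarting the sync from scratch.")
        }
        if (offsetIsStale) {
            // Round 1: roll all streams back and remember that a restart is pending.
            stateManager.resetAll()
            cdcNeedsRestart = true
            return "reset"
        }
        return "incremental"
    }
}

fun main() {
    val stateManager = InMemoryStateManager(listOf("users", "orders"))
    val creator = PartitionsCreatorSketch(stateManager)

    println(creator.run(offsetIsStale = true))   // round 1
    println(stateManager.stateOf("users"))       // state was cleared
    try {
        creator.run(offsetIsStale = true)        // round 2
    } catch (e: TransientErrorException) {
        println("transient: ${e.message}")
    }
}
```

Splitting the work into two rounds means the reset state exists before the attempt is failed, so the retried attempt finds no saved state and rebuilds every stream.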

Review guide

  1. CdcPartitionsCreator - new handling of the re-sync flow in 2 rounds.
  2. StateManager - adds a way to reset stream state so streams can be rolled back and start over.
  3. MySqlDebeziumOperations - identifies a stale offset when the connector is configured to re-sync, and triggers the new flow.


@octavia-squidington-iii octavia-squidington-iii added the area/connectors (Connector related issues), CDK (Connector Development Kit), and connectors/source/mysql labels Nov 20, 2024

when (streamFeedBootstrap.currentState?.isEmpty) {
    false -> streamFeedBootstrap.currentState!!
    else -> return coldStart(streamState)
}
Contributor Author

This fixes a bug:
When a sync includes several feeds and some of them haven't started yet, those feeds are checkpointed with a stream state of {}.
When the sync is stopped and restarted, those feeds need to be treated as a cold start. Before this fix, those {} feeds didn't do a snapshot.
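
As a rough illustration of the rule described here, the sketch below treats both a missing state and an empty {} state as a cold start. It models the state as a plain map rather than the JSON node the connector uses, and `StartMode`, `ColdStart`, `Resume`, and `chooseStartMode` are hypothetical names for this example only.

```kotlin
// Hypothetical outcome types; not part of the connector.
sealed class StartMode
object ColdStart : StartMode()
data class Resume(val state: Map<String, String>) : StartMode()

// A null state (never checkpointed) and an empty state ({}) both mean the feed
// has not started yet, so it must begin with a snapshot.
fun chooseStartMode(currentState: Map<String, String>?): StartMode =
    if (currentState.isNullOrEmpty()) ColdStart else Resume(currentState)

fun main() {
    println(chooseStartMode(null) is ColdStart)                       // no saved state
    println(chooseStartMode(emptyMap()) is ColdStart)                 // saved state is {}
    println(chooseStartMode(mapOf("cursor" to "42")) is ColdStart)    // real saved state
}
```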

Contributor Author

rodireich commented Nov 21, 2024

While verifying this PR I noticed another bug that is not directly related to re-sync but has to do with how empty tables save their state.

I'm fixing it, but please take a look in the meantime.

Issue fixed in #48593, branched out of this PR.

@rodireich rodireich marked this pull request as ready for review November 21, 2024 01:32
@rodireich rodireich requested a review from a team as a code owner November 21, 2024 01:32
@rodireich rodireich changed the title from "10757 mysql beta verify flow for stale offset with re sync data" to "Re-sync in case of stale offset" Nov 21, 2024

if (CDCNeedsRestart) {
    globalLockResource.markCdcAsComplete()
    throw TransientErrorException(
        "Saved offset no longer present on the server, Airbyte is going to trigger a sync from scratch."
Contributor

nit: "presents"

@rodireich rodireich merged commit e193fb3 into master Nov 21, 2024
33 checks passed
@rodireich rodireich deleted the 10757-mysql-beta-verify-flow-for-stale-offset-with-re-sync-data branch November 21, 2024 17:27
tryangul pushed a commit that referenced this pull request Nov 21, 2024
frifriSF59 pushed a commit that referenced this pull request Nov 21, 2024
matteogp pushed a commit that referenced this pull request Nov 22, 2024
Labels
area/connectors (Connector related issues), CDK (Connector Development Kit), connectors/source/mysql