Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CDC partial reset acceptance test #14551

Merged
merged 16 commits into from
Jul 15, 2022

Conversation

lmossman
Copy link
Contributor

@lmossman lmossman commented Jul 9, 2022

What

Add acceptance test for partial reset of a CDC postgres source.

This PR also adds a second table to the incremental CDC acceptance test since it was needed for the new test anyway.

How

Describe the solution

@github-actions github-actions bot added area/platform issues related to the platform area/server labels Jul 9, 2022
@lmossman lmossman changed the base branch from master to anne/remove-with-refreshed-catalog July 9, 2022 01:44
@lmossman lmossman temporarily deployed to more-secrets July 9, 2022 01:45 Inactive

// We do not check that the source and the dest are in sync here because removing a stream doesn't
// delete its data in the destination
assertGlobalStateContainsStreams(connectionId, List.of(idAndNameStreamDescriptor));
Copy link
Contributor Author

@lmossman lmossman Jul 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I based this test pretty closely on Benoit's test here: https://github.com/airbytehq/airbyte/pull/14406/files#diff-50c027dd934aeff3adcaa28a849ae7c845d18961bedd7a4441e8bd82cef517f9R1056

Stopped here for now. Adding back a table to the CDC source is a bit more complex than in the non-CDC case since some extra SQL is required, and it may require removing and re-creating some CDC-specific resources. If we think this is important to add, we can look into that more next week

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added another test that turns a stream off and back on to test the partial reset functionality around adding a stream, so I think that should cover that case and is less complicated than trying to add a table back with the correct CDC sql configurations

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lmossman if the challenge that you are facing in adding a new table in CDC is around publication then you can the publication creation logic in the init script postgres_init_cdc.sql to include all tables

CREATE
    PUBLICATION airbyte_publication FOR ALL TABLES

@lmossman lmossman changed the title Lmossman/cdc partial reset acceptance test CDC partial reset acceptance test Jul 11, 2022
@lmossman lmossman marked this pull request as ready for review July 11, 2022 21:51
@lmossman lmossman requested a review from subodh1810 July 11, 2022 21:53
@lmossman lmossman temporarily deployed to more-secrets July 11, 2022 21:54 Inactive
@lmossman
Copy link
Contributor Author

lmossman commented Jul 11, 2022

@benmoriceau @jdpgrailsdev @subodh1810 This PR is ready for review now. I had to use my locally built dev postgres source for the last test I added, because it relies on the changes that subodh made to the postgres source which haven't been released yet (so I don't expect CI to pass for this PR yet, but the tests are passing locally!). But from the results of that test, it seems that CDC partial reset logic is working as expected, so we can probably move forward with releasing the postgres source with those changes.

But please lmk if there is anything else that we should be testing here that I missed!

@lmossman lmossman temporarily deployed to more-secrets July 11, 2022 23:22 Inactive
@alovew alovew force-pushed the anne/remove-with-refreshed-catalog branch from cb7780b to b24ae1c Compare July 11, 2022 23:38
@subodh1810
Copy link
Contributor

@lmossman can you pull in the latest master in your branch and re-run the tests without the dev image logic? We did publish a new version of postgres here which does have CDC changes

@lmossman
Copy link
Contributor Author

@lmossman can you pull in the latest master in your branch and re-run the tests without the dev image logic? We did publish a new version of postgres here which does have CDC changes

Will do!

@lmossman lmossman force-pushed the lmossman/cdc-partial-reset-acceptance-test branch from def7f0c to 6e03cba Compare July 12, 2022 17:29
@lmossman lmossman requested a review from a team as a code owner July 12, 2022 17:29
@github-actions github-actions bot added area/connectors Connector related issues area/documentation Improvements or additions to documentation CDK Connector Development Kit kubernetes labels Jul 12, 2022
@lmossman lmossman temporarily deployed to more-secrets July 12, 2022 17:32 Inactive
@lmossman lmossman force-pushed the lmossman/cdc-partial-reset-acceptance-test branch from 6e03cba to befd390 Compare July 12, 2022 17:33
@lmossman lmossman temporarily deployed to more-secrets July 12, 2022 17:36 Inactive
@lmossman lmossman changed the base branch from anne/remove-with-refreshed-catalog to lmossman/update-endpoint July 12, 2022 17:36
@github-actions github-actions bot removed area/connectors Connector related issues area/documentation Improvements or additions to documentation kubernetes labels Jul 12, 2022
Copy link
Contributor

@benmoriceau benmoriceau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM


@Test
public void testPartialResetFromStreamSelection() throws Exception {
LOGGER.info("Starting partial reset cdc test");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: you can pass a TestInfo test param to access the test name and log it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

}

@Test
public void testPartialResetFromStreamSelection() throws Exception {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree but the main issue here is that if we break down this test, we will need to duplicate the first sync as many times as there is tests. It is very time consuming.

assertNull(state.getGlobalState());
}

private WebBackendConnectionUpdate getUpdateInput(final ConnectionRead connection, final AirbyteCatalog catalog, final OperationRead operation) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will need to factorize that in my review, will probably move it to the test harness

@lmossman lmossman force-pushed the lmossman/cdc-partial-reset-acceptance-test branch from f04fc4e to ce315b5 Compare July 15, 2022 00:01
@lmossman lmossman temporarily deployed to more-secrets July 15, 2022 00:04 Inactive
@lmossman lmossman temporarily deployed to more-secrets July 15, 2022 01:25 Inactive
@lmossman lmossman temporarily deployed to more-secrets July 15, 2022 01:27 Inactive
@lmossman lmossman temporarily deployed to more-secrets July 15, 2022 18:04 Inactive
@lmossman lmossman force-pushed the lmossman/cdc-partial-reset-acceptance-test branch from 0a6b1f1 to f8fbefe Compare July 15, 2022 21:18
@lmossman lmossman temporarily deployed to more-secrets July 15, 2022 21:20 Inactive
@lmossman lmossman merged commit 6005a9a into master Jul 15, 2022
@lmossman lmossman deleted the lmossman/cdc-partial-reset-acceptance-test branch July 15, 2022 23:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/platform issues related to the platform area/worker Related to worker
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants