CDC implementation for Citus using Logical Replication #6623

Merged
rajeshkt78 merged 57 commits into main from cdc_with_logical_replication_new
Mar 28, 2023

rajeshkt78
Contributor

@rajeshkt78 rajeshkt78 commented Jan 13, 2023

DESCRIPTION: Implements CDC for Citus using logical replication. To avoid re-publishing the same events multiple times, the operations below set up a replication origin session, which adds "DoNotReplicateId" to every WAL entry they generate:

  • shard splits
  • shard moves
  • create distributed table
  • undistribute table
  • alter distributed tables (for some cases)
  • reference table operations

The Citus decoder, which decodes WAL events for CDC clients,
ignores any WAL entry whose replication origin is not zero.
It also maps shard names to distributed table names.
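
The two decoder behaviours described above can be sketched in plain C. This is a minimal illustration under stated assumptions, not the code from this PR: `OriginId`, `ObjectId`, `ShardMapEntry`, `ShouldSkipChange`, and `LookupDistributedTableId` are hypothetical names, the origin is modeled as a 16-bit integer where 0 means "no origin", and the mapping is keyed on numeric ids and scanned linearly for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; not the actual PostgreSQL or Citus types. */
typedef uint16_t OriginId;
typedef uint32_t ObjectId;

#define NO_ORIGIN ((OriginId) 0)
#define INVALID_OBJECT_ID ((ObjectId) 0)

/* One shard-to-distributed-table pair. */
typedef struct
{
    ObjectId shardId;
    ObjectId distributedTableId;
} ShardMapEntry;

/*
 * A WAL entry with a non-zero origin is assumed to come from an internal
 * operation (for example a shard move), so it is hidden from CDC clients.
 */
static bool
ShouldSkipChange(OriginId origin)
{
    return origin != NO_ORIGIN;
}

/*
 * Returns the distributed table id for a shard, or INVALID_OBJECT_ID when
 * the shard id is not present in the map.
 */
static ObjectId
LookupDistributedTableId(const ShardMapEntry *map, size_t count,
                         ObjectId shardId)
{
    assert(map != NULL || count == 0);

    for (size_t i = 0; i < count; i++)
    {
        if (map[i].shardId == shardId)
        {
            return map[i].distributedTableId;
        }
    }
    return INVALID_OBJECT_ID;
}
```

In this sketch a decoder callback would first call `ShouldSkipChange` with the entry's origin and return early when it is true, then use `LookupDistributedTableId` to report the change against the distributed table instead of the shard.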

@rajeshkt78 rajeshkt78 self-assigned this Jan 13, 2023
@codecov

codecov bot commented Jan 17, 2023

Codecov Report

Merging #6623 (4f2fede) into main (9bab819) will increase coverage by 0.01%.
The diff coverage is 92.13%.

❗ Current head 4f2fede differs from pull request most recent head 55abb26. Consider uploading reports for the commit 55abb26 to get more accurate results

@@            Coverage Diff             @@
##             main    #6623      +/-   ##
==========================================
+ Coverage   93.14%   93.16%   +0.01%     
==========================================
  Files         260      262       +2     
  Lines       56140    56327     +187     
==========================================
+ Hits        52293    52478     +185     
- Misses       3847     3849       +2     

gurkanindibay added a commit that referenced this pull request Jan 19, 2023
Previously the trigger type was pull_request and the events were
opened, reopened and synchronize.
In PR #6623, the packaging pipeline was not triggered and the PR was blocked.
It may be a bug on the GH Actions side.
To make sure the pipeline executes, I changed the pipeline
execution type to 'branches'.
@rajeshkt78 rajeshkt78 force-pushed the cdc_with_logical_replication_new branch from 70a88d8 to a9b2a59 Compare January 20, 2023 14:19
Member

@marcocitus marcocitus left a comment

The test approach makes sense to me. For the basic test, it would be good to also include update/delete.

@marcocitus
Member

I'm getting this error a lot when something fails and presumably the decoder restarts:

2023-03-21 15:38:38.521 CET [14916] ERROR:  could not receive data from WAL stream: ERROR:  could not attach to dynamic shared memory segment corresponding to handle:0
        CONTEXT:  slot "citus_cdc", output plugin "citus", in the change callback, associated LSN 2/74627410
2023-03-21 15:38:38.524 CET [19342] LOG:  background worker "logical replication worker" (PID 14916) exited with exit code 1

This is mainly related to splits. Can we avoid it somehow?

typedef struct
{
Oid shardId;
Oid distributedTableId;
Member

@marcocitus marcocitus Mar 21, 2023

Even after combining with #6776 we're no longer seeing inserts after an alter_distributed_table, not sure why, but we can look at it later.

Contributor

Maybe SetupReplicationOriginLocalSession needs to go into ConvertTable itself so that we cover all the different conversions?

Contributor Author

alter_distributed_table changes the Oid of the distributed table, so the CDC client's mapping is no longer valid, and any inserts/updates/deletes after that cannot be applied by the CDC subscriber. The only current option is for the subscriber to drop the subscription and recreate it. This will be documented as a known issue.

@onderkalaci
Contributor

> Implemented replication origin session using DoNotReplicateId

I guess we can improve the title of the PR

@rajeshkt78 rajeshkt78 changed the title from "Implemented replication origin session using DoNotReplicateId" to "CDC implementation for Citus using Logical Replication" Mar 22, 2023
@rajeshkt78
Contributor Author

> > Implemented replication origin session using DoNotReplicateId
>
> I guess we can improve the title of the PR

I have changed the title now, please check if it is OK.

Member

@marcocitus marcocitus left a comment

This looks pretty good to me. Huge leap forward in terms of setting up CDC for Citus!

I think we can do the "unbundling" of the decoder plugin as a separate PR.

"_PG_output_plugin_init",
false, NULL);
(LogicalOutputPluginInit) (void *)
load_external_function("pgoutput",
Member

maybe make it a compile-time constant for now (ideally one that can be overridden via -D option)
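
One way to read this suggestion, as a sketch: guard the plugin name with `#ifndef` so that it defaults to "pgoutput" but can be replaced at build time with a `-D` flag. `DECODER_PLUGIN_NAME` and `GetDecoderPluginName` are hypothetical names chosen for illustration; the conversation does not show what the constant was eventually called.

```c
#include <assert.h>

/*
 * Hypothetical macro name. The default can be overridden at build time,
 * for example: cc -DDECODER_PLUGIN_NAME='"my_plugin"' ...
 */
#ifndef DECODER_PLUGIN_NAME
#define DECODER_PLUGIN_NAME "pgoutput"
#endif

/* Reject an empty override at compile time (C11). */
static_assert(sizeof(DECODER_PLUGIN_NAME) > 1,
              "DECODER_PLUGIN_NAME must not be empty");

/* Returns the name of the output plugin the decoder delegates to. */
static const char *
GetDecoderPluginName(void)
{
    return DECODER_PLUGIN_NAME;
}
```

With this shape, the string literal passed to `load_external_function` would be replaced by the macro, and packagers could point the decoder at a different plugin without touching the source.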

@rajeshkt78 rajeshkt78 enabled auto-merge (squash) March 28, 2023 10:11
@rajeshkt78 rajeshkt78 merged commit 85b8a2c into main Mar 28, 2023
@rajeshkt78 rajeshkt78 deleted the cdc_with_logical_replication_new branch March 28, 2023 10:30
rajeshkt78 added a commit to citusdata/the-process that referenced this pull request Apr 6, 2023
Added 3 library dependencies for CDC TAP tests:
libdbi-perl
libdbd-pg-perl
postgresql-${PG_MAJOR}-wal2json

This is needed for dependencies added for these PRs:
citusdata/citus#6827
citusdata/citus#6623
6 participants