[FLINK-34688][cdc-connector] CDC framework split snapshot chunks asynchronously #3510

loserwang1024 · 2024-08-06T07:11:28Z

As described in https://issues.apache.org/jira/browse/FLINK-34688:
In MySQL CDC, MySqlSnapshotSplitAssigner already splits snapshot chunks asynchronously (#931), but the CDC base framework does not.
If a table is too large to split quickly, the enumerator gets stuck and checkpointing is affected (checkpoints sometimes time out).
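
To illustrate the idea, here is a minimal sketch of splitting on a background thread, so that the caller never blocks while chunks are being computed. It is not the code in this PR: the class AsyncChunkSplitSketch, its methods, and the string chunk format are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: chunks are produced on a background thread and handed over via a queue. */
class AsyncChunkSplitSketch {
    private final BlockingQueue<String> readyChunks = new LinkedBlockingQueue<>();
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    /** Starts splitting in the background and returns immediately. */
    void startSplitting(String table, int chunkCount) {
        executor.submit(() -> {
            for (int i = 0; i < chunkCount; i++) {
                // Stand-in for an expensive query that computes one chunk boundary.
                readyChunks.add(table + ":chunk-" + i);
            }
        });
    }

    /** Non-blocking: returns the chunks that are ready so far (possibly none). */
    List<String> pollReadyChunks() {
        List<String> out = new ArrayList<>();
        readyChunks.drainTo(out);
        return out;
    }

    /** Stops accepting work and waits briefly for the running split to finish. */
    void close() {
        executor.shutdown();
        try {
            executor.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Because pollReadyChunks never waits for the splitter, the calling thread stays free for other work, such as handling a checkpoint, between polls.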

loserwang1024 · 2024-08-06T07:14:49Z

@leonardBang , @ruanhang1993 , @Jiabao-Sun , @GOODBOY008 , CC

leonardBang · 2024-08-06T07:24:22Z

Thanks @loserwang1024 for the improvement. @GOODBOY008, would you like to help review this PR when you have time?

Jiabao-Sun · 2024-08-06T07:56:37Z

...c/main/java/org/apache/flink/cdc/connectors/base/source/assigner/splitter/ChunkSplitter.java

 public interface ChunkSplitter {

+    /**
+     * Called to open the chunk splitter to acquire any resources, like threads or jdbc connections.
+     */
+    void open();
+
    /** Generates all snapshot splits (chunks) for the given data collection. */
-    Collection<SnapshotSplit> generateSplits(TableId tableId);
+    Collection<SnapshotSplit> generateSplits(TableId tableId) throws Exception;
+
+    /** Get whether the splitter has more chunks for current table. */
+    boolean hasNextChunk();
+
+    /**
+     * Creates a snapshot of the state of this chunk splitter, to be stored in a checkpoint.
+     *
+     * <p>This method takes the ID of the checkpoint for which the state is snapshotted. Most
+     * implementations should be able to ignore this parameter, because for the contents of the
+     * snapshot, it doesn't matter for which checkpoint it gets created. This parameter can be
+     * interesting for source connectors with external systems where those systems are themselves
+     * aware of checkpoints; for example in cases where the enumerator notifies that system about a
+     * specific checkpoint being triggered.
+     *
+     * @param checkpointId The ID of the checkpoint for which the snapshot is created.
+     * @return an object containing the state of the chunk splitter.
+     */
+    ChunkSplitterState snapshotState(long checkpointId);
+
+    TableId getCurrentSplittingTableId();
+
+    /**
+     * Called to close the chunk splitter to release any resources, like threads or jdbc connections.
+     */
+    void close() throws Exception;


Thanks @loserwang1024 for this great work.

I have a small suggestion for the ChunkSplitter interface.
The methods open, hasNextChunk, getCurrentSplittingTableId, and close resemble those of an Iterator or Cursor. Perhaps ChunkSplitter.generateSplits could be designed to open a Cursor, which would unify the iteration logic and support both one-time and partial splitting. With cursors, we might also be able to support splitting multiple tables simultaneously.

However, maintaining the state might become complex, as we need to keep track of the state of all open cursors.
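
The cursor idea could look roughly like the sketch below. This is only an illustration of the suggestion, not an API from flink-cdc: ChunkCursorSketch, CursorSplitterSketch, and the integer row ranges are hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical cursor: iterates over the chunks of one table and can be closed. */
interface ChunkCursorSketch extends Iterator<String>, AutoCloseable {
    @Override
    void close();
}

/** Hypothetical splitter whose only job is to open one cursor per table. */
class CursorSplitterSketch {
    ChunkCursorSketch openCursor(String table, int rowCount, int chunkSize) {
        return new ChunkCursorSketch() {
            private int nextStart = 0;

            @Override
            public boolean hasNext() {
                return nextStart < rowCount;
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                int end = Math.min(nextStart + chunkSize, rowCount);
                String chunk = table + "[" + nextStart + "," + end + ")";
                nextStart = end;
                return chunk;
            }

            @Override
            public void close() {
                // A real cursor would release threads or connections here.
            }
        };
    }

    /** One-time splitting is just draining a cursor to the end. */
    List<String> splitAll(String table, int rowCount, int chunkSize) {
        List<String> chunks = new ArrayList<>();
        try (ChunkCursorSketch cursor = openCursor(table, rowCount, chunkSize)) {
            while (cursor.hasNext()) {
                chunks.add(cursor.next());
            }
        }
        return chunks;
    }
}
```

Partial splitting corresponds to calling next() a few times and keeping the cursor open. Splitting several tables at once would mean holding several open cursors, each with its own progress to checkpoint.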

Yes, it seems more flexible to support simultaneous splitting of multiple tables. However, we would have to maintain each table's splitter progress in state, which may be rather heavy.
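
For a sense of what per-table progress means here, the sketch below shows a splitter that can snapshot its position and resume from it. The names SplitProgressSketch and ResumableSplitterSketch and their fields are assumptions made for illustration. They are not the actual ChunkSplitterState from this PR.

```java
/** Hypothetical checkpointable progress for one table (field names are assumptions). */
final class SplitProgressSketch {
    final String tableId;
    final int nextChunkStart;

    SplitProgressSketch(String tableId, int nextChunkStart) {
        this.tableId = tableId;
        this.nextChunkStart = nextChunkStart;
    }
}

/** Hypothetical splitter that resumes from restored progress instead of starting over. */
class ResumableSplitterSketch {
    private final String tableId;
    private final int rowCount;
    private final int chunkSize;
    private int nextChunkStart;

    ResumableSplitterSketch(
            String tableId, int rowCount, int chunkSize, SplitProgressSketch restored) {
        this.tableId = tableId;
        this.rowCount = rowCount;
        this.chunkSize = chunkSize;
        // Start from zero unless a checkpoint recorded how far splitting had progressed.
        this.nextChunkStart = restored == null ? 0 : restored.nextChunkStart;
    }

    boolean hasNextChunk() {
        return nextChunkStart < rowCount;
    }

    String nextChunk() {
        int end = Math.min(nextChunkStart + chunkSize, rowCount);
        String chunk = tableId + "[" + nextChunkStart + "," + end + ")";
        nextChunkStart = end;
        return chunk;
    }

    /** Captures the current position so that splitting can resume after a restore. */
    SplitProgressSketch snapshotState() {
        return new SplitProgressSketch(tableId, nextChunkStart);
    }
}
```

With one table in flight, the checkpoint carries a single progress object like this. With several tables split concurrently, it would carry one per open table, which is the extra state weight discussed above.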

GOODBOY008 · 2024-08-06T11:14:42Z

@loserwang1024 Please rebase to master and resolve conflicts.

loserwang1024 · 2024-10-25T05:45:45Z

@loserwang1024 Please rebase to master and resolve conflicts.

@GOODBOY008 Done, and all the tests pass.

leonardBang · 2024-11-20T11:54:54Z

@loserwang1024 Could you kindly rebase your PR to latest master branch to resolve potential conflicts?

loserwang1024 · 2024-11-22T05:51:02Z

@loserwang1024 Could you kindly rebase your PR to latest master branch to resolve potential conflicts?

Done.

...st/java/org/apache/flink/cdc/connectors/postgres/source/fetch/PostgresScanFetchTaskTest.java

...org/apache/flink/cdc/connectors/base/source/assigner/state/PendingSplitsStateSerializer.java

...a/org/apache/flink/cdc/connectors/base/source/assigner/splitter/JdbcSourceChunkSplitter.java

...rc/main/java/org/apache/flink/cdc/connectors/base/source/assigner/SnapshotSplitAssigner.java

liuxiao2shf · 2024-12-13T07:53:59Z

...rc/main/java/org/apache/flink/cdc/connectors/base/source/assigner/SnapshotSplitAssigner.java

+            } else if (!remainingTables.isEmpty()) {
+                try {
+                    // wait for the asynchronous split to complete
+                    lock.wait();


This lock is already guarded by the synchronized keyword, so wait/notify doesn't seem necessary.

We need to call wait() to release the lock while waiting, so that another thread can acquire it.
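
The point about wait() can be shown with a small standalone sketch. It is not the SnapshotSplitAssigner code: WaitNotifySketch and its fields are hypothetical. Object.wait() releases the monitor while the caller is parked, which is what lets the producer enter its own synchronized block.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical sketch of a consumer waiting for splits produced by another thread. */
class WaitNotifySketch {
    private final Object lock = new Object();
    private final Queue<String> remainingSplits = new ArrayDeque<>();
    private boolean splittingFinished = false;

    /** Called by the splitting thread; it can enter because wait() released the monitor. */
    void addSplit(String split, boolean last) {
        synchronized (lock) {
            remainingSplits.add(split);
            splittingFinished = last;
            lock.notifyAll();
        }
    }

    /** Called by the assigner thread; returns null once splitting is done and drained. */
    String getNext() {
        synchronized (lock) {
            while (remainingSplits.isEmpty() && !splittingFinished) {
                try {
                    // Releases the monitor while waiting and reacquires it before returning.
                    lock.wait();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return null;
                }
            }
            return remainingSplits.poll();
        }
    }
}
```

If getNext() looped inside the synchronized block without calling wait(), it would hold the monitor for the whole loop and addSplit() could never run.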

.../org/apache/flink/cdc/connectors/sqlserver/source/read/fetch/SqlServerScanFetchTaskTest.java

ruanhang1993

LGTM

github-actions bot added mongodb-cdc-connector base oracle-cdc-connector postgres-cdc-connector sqlserver-cdc-connector db2-cdc-connector labels Aug 6, 2024

leonardBang requested a review from GOODBOY008 August 6, 2024 07:23

Jiabao-Sun reviewed Aug 6, 2024

loserwang1024 force-pushed the aync-split branch 2 times, most recently from 7589535 to eb0ecc0 Compare August 7, 2024 11:32

loserwang1024 force-pushed the aync-split branch from cb94500 to 39265b8 Compare August 29, 2024 08:03

loserwang1024 force-pushed the aync-split branch from 39265b8 to 0e7edd5 Compare October 22, 2024 08:02

github-actions bot added the e2e-tests label Oct 22, 2024

loserwang1024 requested a review from Jiabao-Sun October 25, 2024 05:44

Meroksa approved these changes Nov 18, 2024

github-actions bot added the reviewed label Nov 18, 2024

loserwang1024 force-pushed the aync-split branch from 8424716 to 2a6c946 Compare November 22, 2024 04:16

loserwang1024 force-pushed the aync-split branch from 2a6c946 to 5d44e11 Compare December 12, 2024 09:36

[FLINK-34688][cdc-connector] CDC framework split snapshot chunks asynchronously

dba2648

loserwang1024 force-pushed the aync-split branch from 5d44e11 to dba2648 Compare December 13, 2024 04:08

ruanhang1993 reviewed Dec 13, 2024

fix merge test

aedc44b

liuxiao2shf reviewed Dec 13, 2024

modify based on CR

700dffc

ruanhang1993 reviewed Dec 13, 2024

.../org/apache/flink/cdc/connectors/sqlserver/source/read/fetch/SqlServerScanFetchTaskTest.java (outdated)

modify based on CR

3bdb052

ruanhang1993 approved these changes Dec 17, 2024

github-actions bot added the approved label Dec 17, 2024

ruanhang1993 merged commit 12cf22f into apache:master Dec 17, 2024
24 checks passed

ChaomingZhangCN pushed a commit to ChaomingZhangCN/flink-cdc that referenced this pull request Jan 13, 2025

[FLINK-34688][cdc-connector][base] CDC framework split snapshot chunks asynchronously (apache#3510)

5fefe31
