Conversation

@taroface (Contributor) commented Oct 1, 2025

DOC-13338
DOC-14748

This PR is still WIP.

Notes for reviewers:

| Page | Please review | Notes |
| --- | --- | --- |
| Load and Replicate | Entire flow, but focus on Replicator setup, usage, troubleshooting | Fetch content is pre-existing |
| Migration Failback | Entire flow | This was completely rewritten |
| Resume Replication | Replicator usage, any missing context/caveats about resuming | Structure is still rough |
| MOLT Replicator | Whole page | Structure is WIP. Usage section is still barebones. I need to think about a good way to present the flags per dialect. |
| MOLT Fetch | Check for content that should be removed/moved to Replicator | I think I caught everything, but may not understand something |


github-actions bot commented Oct 1, 2025

Files changed:


netlify bot commented Oct 1, 2025

Deploy Preview for cockroachdb-api-docs canceled.

| Name | Link |
| --- | --- |
| 🔨 Latest commit | bc7fe84 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/cockroachdb-api-docs/deploys/68f2db849cf7220008a51602 |


netlify bot commented Oct 1, 2025

Deploy Preview for cockroachdb-interactivetutorials-docs canceled.

| Name | Link |
| --- | --- |
| 🔨 Latest commit | bc7fe84 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/cockroachdb-interactivetutorials-docs/deploys/68f2db841e35cc00082b7b02 |


netlify bot commented Oct 1, 2025

Deploy Preview for cockroachdb-docs failed.

| Name | Link |
| --- | --- |
| 🔨 Latest commit | bc7fe84 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/cockroachdb-docs/deploys/68f2db8446ae7a0007406a58 |

@taroface taroface changed the title [wip] MOLT Replicator draft docs [WIP] MOLT Replicator draft docs Oct 1, 2025

@ryanluu12345 left a comment


Excellent work, @taroface. Not an easy doc to write, but you made it understandable and clean! Let's bottom out on some of these discussions and ensure the deprecation effort from @tuansydau reflects the reality of what we are documenting.


{% include molt/molt-setup.md %}

## Start Fetch


So an important note here: as part of the deprecation of the wrapper, we're mainly removing the invocations of Replicator from MOLT. However, there is some source-database replication setup that we'll still need to perform for PostgreSQL specifically. We have to do this because we need to create the slot at the time we actually do the snapshot export, so we don't have gaps in data.

So that means we still need to document the behavior when we set certain `pg-*` flags for setting publications and slots, and the relevant drop/recreate behavior. I think we'll need to discuss this a bit more in the next team meeting to clearly lay out what the behavior still is. CC @tuansydau @Jeremyyang920


@taroface resolved here: #20465 (comment)

We resolved it during the call and Tuan's comment above should capture the behavior.

## Prepare the CockroachDB cluster

{{site.data.alerts.callout_success}}
For details on enabling CockroachDB changefeeds, refer to [Create and Configure Changefeeds]({% link {{ site.current_cloud_version }}/create-and-configure-changefeeds.md %}).


We also need to ensure that the license and organization are set:

```sql
SET CLUSTER SETTING cluster.organization = 'organization';
SET CLUSTER SETTING enterprise.license = '$LICENSE';
```

MOLT Fetch replication modes will be deprecated in favor of a separate replication workflow in an upcoming release. This includes the `data-load-and-replication`, `replication-only`, and `failback` modes.
{{site.data.alerts.end}}

Use `data-load-and-replication` mode to perform a one-time bulk load of source data and start continuous replication in a single command.


@taroface just want to note that, given we remove this specific mode in MOLT, I want to make sure we still document what exactly these modes look like if people want to run them manually. I'll go ahead and describe what replication-only and data-load-and-replication look like. Can you please take an action to move this content into the proper section for this new doc? I trust you with figuring out the appropriate location.

CRDB data load and replication:

1. Before the data load, get the latest MVCC timestamp so you have the consistent point:

   ```sql
   SELECT cluster_logical_timestamp();
   ```

   ```
     cluster_logical_timestamp
   ----------------------------------
     1759848027465101000.0000000000
   ```

2. Create your changefeed with `cursor` set to the value from above. The changefeed will then send data starting from that MVCC timestamp.

For replication-only, you can just create the changefeed and it will start sending data from "now". However, if you want to send data from a previous time, you can pass in the proper MVCC timestamp, which has the format shown above.

Important: make sure that the GC TTL is set appropriately so the data from the cursor you're using is still valid: https://www.cockroachlabs.com/docs/stable/protect-changefeed-data
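The changefeed-creation step above might look like the following sketch, assuming a webhook sink into Replicator; the table name and sink URI are placeholders, and the cursor value is the `cluster_logical_timestamp()` captured before the data load:

```sql
-- Hypothetical sketch: start the changefeed from the MVCC timestamp captured
-- before the data load. Table name and sink URI are placeholders.
CREATE CHANGEFEED FOR TABLE molt.public.employees
  INTO 'webhook-https://replicator-host:30004/molt?insecure_tls_skip_verify=true'
  WITH updated, resolved = '1s',
       cursor = '1759848027465101000.0000000000';
```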

@noelcrl (Contributor) commented Oct 7, 2025


To add to the GC detail: it's important to ensure that the changes from the point in time where the cursor sits are still valid and can be consumed by a changefeed.

Configure GC TTL for a data export or migration

Before starting a data export or migration with MOLT, make sure the GC TTL for the source database is long enough to cover the full duration of the process (for example, the total time it takes for the initial data load).
This ensures that historical data remains available from the changefeed when replication begins.

```sql
-- Increase GC TTL to 24 hours (example)
ALTER DATABASE <database_name> CONFIGURE ZONE USING gc.ttlseconds = 86400;
```

Once the changefeed or replication has started successfully (which automatically protects its own data range), you can safely lower the TTL again if necessary to resume normal garbage collection:

```sql
-- Restore GC TTL to 5 minutes
ALTER DATABASE <database_name> CONFIGURE ZONE USING gc.ttlseconds = 300;
```

Note: the TTL in seconds will depend on the user's expected time for the initial data load, and it must be higher than that duration.
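As a worked example of that sizing (assuming an initial load expected to take about 8 hours; the database name is a placeholder):

```sql
-- 8 hours * 3600 s/h = 28800 s minimum coverage;
-- a 2x safety margin gives 57600 s.
ALTER DATABASE <database_name> CONFIGURE ZONE USING gc.ttlseconds = 57600;
```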

@taroface (Contributor, Author) commented Oct 16, 2025


@ryanluu12345 @noelcrl Sorry, I'm very confused about what this is referring to. We removed migrate-data-load-and-replication because the mode to automatically start replication after the initial load is being removed from Fetch. Are you saying this is possible to do manually, and we need to document it? Currently, we only document starting Replicator separately after Fetch (migrate-load-replicate.md).

For replication-only, is that not what migrate-resume-replication.md is describing?


Ahh I see the confusion here. I'm mainly calling attention to how someone would run C2X replication from a cursor. They would run this directly with Replicator in this new flow.

This actually ties to failback so it should be related to this point I made: https://github.com/cockroachdb/docs/pull/20465/files#r2440836804


I see that you already documented it. Sorry, my miss on this. The only remaining thing is Noel's point of the GC TTL to make sure the data at the cursor is still valid.

@noelcrl (Contributor) commented Oct 14, 2025

@taroface
For this section: https://www.cockroachlabs.com/docs/molt/migrate-data-load-replicate-only?filters=oracle#oracle-instant-client

We should also make the user grab their own copy of Instant Client from Oracle. For the Linux instructions, we should replace:

```shell
sudo apt-get install -yqq --no-install-recommends libaio1t64
sudo ln -s /usr/lib/x86_64-linux-gnu/libaio.so.1t64 /usr/lib/x86_64-linux-gnu/libaio.so.1
curl -o /tmp/ora-libs.zip https://replicator.cockroachdb.com/third_party/instantclient-basiclite-linux-amd64.zip
unzip -d /tmp /tmp/ora-libs.zip
sudo mv /tmp/instantclient_21_13/* /usr/lib
export LD_LIBRARY_PATH=/usr/lib
```

With:

```shell
sudo apt-get install -yqq --no-install-recommends libaio1t64
sudo ln -s /usr/lib/x86_64-linux-gnu/libaio.so.1t64 /usr/lib/x86_64-linux-gnu/libaio.so.1
# Download the Oracle Instant Client libraries from Oracle
# (https://www.oracle.com/ca-en/database/technologies/instant-client.html)
# into /tmp/instantclient-basiclite-linux-amd64.zip, for example
unzip -d /tmp /tmp/instantclient-basiclite-linux-amd64.zip
sudo mv /tmp/instantclient_21_13/* /usr/lib
export LD_LIBRARY_PATH=/usr/lib
```

Let me know if you have questions on this; it should be updated for each instance of the Oracle Instant Client instructions throughout the docs.

The actual links for Linux binaries:
Download from the Official Oracle site here (linux amd64) or here (linux x86).


MOLT Replicator offers three consistency modes for balancing throughput and transactional guarantees:

1. Consistent (default for CockroachDB sources): Preserves per-row order and source transaction atomicity. Concurrent transactions are controlled by `--parallelism`.


I know this is implied, since the Immediate section implies that immediate is the default for PostgreSQL, MySQL, and Oracle, but should we call out that Consistent and Best-effort only apply to CRDB sources or failback mode?

@crash-overdrive commented:

Can we link the Replicator dashboard in the docs?

- Replicator Grafana dashboard
- Replicator Grafana dashboard for Oracle

{% include_cached copy-clipboard.html %}
~~~ sql
-- Query the current SCN from Oracle
SELECT CURRENT_SCN FROM V$DATABASE;
~~~
@ryanluu12345 commented Oct 17, 2025


@noelcrl, can you please check the Fetch cursor logging to see if we end up printing the relevant SCN that folks should start from? I think right now this is a good step for folks who are using Replicator directly (querying the SCN themselves), but I'm wondering whether we need to include it for folks doing this via Fetch, since Fetch should be able to access the current SCN and log it out.

CC @tuansydau

@noelcrl (Contributor) commented Oct 17, 2025


Yes, it is logged in `dataexport/oracle.go:NewOracleSource()` when starting Fetch:

```go
logger.Info().Msgf(fmt.Sprintf("replication-only mode should include the following "+
    "replicator flags: --backfillFromSCN %s --scn %s", replicationBackfillSCN, scn))
```

The message should look like:

```
replication-only mode should include the following replicator flags: --backfillFromSCN 26685444 --scn 26685786
```

So if Fetch is being used for a bulk load, and this logic in Fetch isn't being removed, these SCNs can be used instead of running all of these queries to find them manually.

If only replication is necessary (no bulk data load), the user can grab the current SCN and use it for both `--backfillFromSCN` and `--scn`.
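For that replication-only case, a minimal sketch (the SCN query is the same one shown on the page; passing the single returned value to both flags follows the comment above):

```sql
-- Replication-only (no bulk load): capture the current SCN once from Oracle...
SELECT CURRENT_SCN FROM V$DATABASE;
-- ...then start Replicator with both --backfillFromSCN <value> and
-- --scn <value> set to the returned SCN.
```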

@ryanluu12345 commented Oct 17, 2025


I think we may need to change the logging here. Maybe we put it as part of the cursor-specific logging like we do for MySQL and Postgres. CC @tuansydau. Tuan, can you work with Noel to figure out the best way to do this?


Yeah sure, I'll work with Noel on this
