Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

streamingccl: physical replication stream from given timestamp #111905

Merged
merged 1 commit into from
Oct 9, 2023

Conversation

stevendanna
Copy link
Collaborator

This adds a RESUME TIMESTAMP option to CREATE VIRTUAL CLUSTER FROM
REPLICATION. When provided, we allow the user to start a replication
stream into an existing virtual cluster.

When the resume timestamp is provided, the replication stream will be
started from that timestamp, with no initial scan. To facilitate this,
we add a new argument to crdb_internal.start_replication_stream that
allows us to pass a destination-choosen start timestamp.

To avoid various catastrophic mistakes, we only allow this when:

  • The destination tenant must be in service mode None

  • The provided resume timestamp equals the last recorded "revert
    timestamp" of the destination tenant. The revert timestamp is set when
    the tenant has been forcibly reverted to a particular timestamp and is
    cleared when the tenant is modified in a way that may invalidate a
    resumption from that timestamp.

  • If the source tenant has a PreviousSourceTenant set, the new
    destination must match that previous source tenant field.

WARNING: Using this correctly requires that the stream is resumed
before garbage collection has progressed past the given resume
timestamp. During normal operation, the replication stream maintains a
protected timestamp to ensure this is the case. However, when
resuming using this new feature, we have no such guarantee.

Epic: none

Release note: None

@blathers-crl
Copy link

blathers-crl bot commented Oct 6, 2023

It looks like your PR touches SQL parser code but doesn't add or edit parser tests. Please make sure you add or edit parser tests if you edit the parser.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@stevendanna stevendanna changed the title WIP: streamingccl: physical replication stream from given timestamp streamingccl: physical replication stream from given timestamp Oct 6, 2023
@stevendanna stevendanna marked this pull request as ready for review October 6, 2023 17:24
@stevendanna stevendanna requested review from a team as code owners October 6, 2023 17:24
@stevendanna stevendanna requested review from adityamaru, smg260, renatolabs and dt and removed request for a team October 6, 2023 17:24
@stevendanna
Copy link
Collaborator Author

stevendanna commented Oct 6, 2023

I need to write some more tests and clean up a bit, but wanted to get this out to get review on the overall shape of it.

Just the last commit is what to look at.

@stevendanna stevendanna removed request for a team, smg260 and renatolabs October 6, 2023 17:25
@stevendanna stevendanna force-pushed the resume-timestamp branch 4 times, most recently from 0d978f4 to 83620fd Compare October 7, 2023 10:37
@@ -94,6 +94,14 @@ CREATE VIRTUAL CLUSTER ("destination-hyphen") FROM REPLICATION OF ("source-hyphe
CREATE VIRTUAL CLUSTER "destination-hyphen" FROM REPLICATION OF "source-hyphen" ON '_' WITH RETENTION = '_' -- literals removed
CREATE VIRTUAL CLUSTER _ FROM REPLICATION OF _ ON 'pgurl' WITH RETENTION = '36h' -- identifiers removed

parse
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WITH RESUME here?

)
}
if !tenantRecord.LastRevertTenantTimestamp.Equal(options.resumeTimestamp) {
return errors.Newf("cannot start replication for tenant %q (%d) with resume timestamp %s that doesn't match last resume timestamp %s",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: that doesn't match last s/resume/revert/g timestamp?



// ClusterID identifies the /requesting/ cluster. If set, this is
// compred to the ClusterID in the tenant record's
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/compred/compared/g

}
}
}
replicationStartTime = req.ReplicationStartTime
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if tenantRecord.PreviousSourceTenant == nil when req.ReplicationStartTime is not empty, we should errors.AssertionFailed ?

@@ -131,15 +148,19 @@ func alterReplicationJobHook(
"only the system tenant can alter tenant")
}

if alterTenantStmt.Options.ResumeTimestamp != nil {
return nil, nil, nil, false, pgerror.New(pgcode.InvalidParameterValue, "resume timetamp cannot be altered")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/timetamp/timestamp/g

Copy link
Contributor

@adityamaru adityamaru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have any blocking comments, this is really great. A roachtest or two for failback as future changes would be great but definitely a follow up.

Copy link
Contributor

@adityamaru adityamaru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have any blocking comments, this is really great. A roachtest or two for failback as future changes would be great but definitely a follow up.

@stevendanna stevendanna force-pushed the resume-timestamp branch 6 times, most recently from 56715bd to d8fb4b3 Compare October 9, 2023 04:44
This adds a RESUME TIMESTAMP option to CREATE VIRTUAL CLUSTER FROM
REPLICATION. When provided, we allow the user to start a replication
stream into an _existing_ virtual cluster.

When the resume timestamp is provided, the replication stream will be
started from that timestamp, with no initial scan. To facilitate this,
we add a new argument to crdb_internal.start_replication_stream that
allows us to pass a destination-choosen start timestamp.

To avoid various catastrophic mistakes, we only allow this when:

- The destination tenant must be in service mode None

- The provided resume timestamp equals the last recorded "revert
  timestamp" of the destination tenant. The revert timestamp is set when
  the tenant has been forcibly reverted to a particular timestamp and is
  cleared when the tenant is modified in a way that may invalidate a
  resumption from that timestamp.

- If the source tenant has a PreviousSourceTenant set, the new
  destination must match that previous source tenant field.

WARNING: Using this correctly requires that the stream is resumed
before garbage collection has progressed past the given resume
timestamp. During normal operation, the replication stream maintains a
protected timestamp to ensure this is the case. However, when
resuming using this new feature, we have no such guarantee.

Epic: none

Release note: None
@stevendanna
Copy link
Collaborator Author

bors r=adityamaru

@craig
Copy link
Contributor

craig bot commented Oct 9, 2023

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Oct 9, 2023

Build succeeded:

@craig craig bot merged commit 6de91e3 into cockroachdb:master Oct 9, 2023
6 of 8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants