
streaming: create a stream of encoded KV output for a span #57422

Closed
pbardea opened this issue Dec 3, 2020 · 2 comments
Labels
A-disaster-recovery C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-disaster-recovery

Comments


pbardea commented Dec 3, 2020

As a first step towards producing a stream for cluster-to-cluster streaming, a variant of core changefeeds needs to be created. The primary difference between core changefeeds and this stream is that for cluster streaming, we want the changed rows to be emitted as encoded key-values (roachpb.KeyValue).

@pbardea pbardea added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-disaster-recovery labels Dec 3, 2020

pbardea commented Jan 19, 2021

Concretely I think this looks like:

  1. Partitioning the span that needs to be watched with PartitionSpans (optional to start, we can make it all 1 partition).
  2. Creating a KVFeed for each of those partitions.
  3. Emitting the KVFeed events (roachpb.KeyValue and periodic checkpoint events with an hlc.Timestamp) to the SQL client as an initial version (à la core changefeeds).

The first implementation of the stream client can open a SQL connection and read the events over it. Note that the checkpoint events for each partition should indicate a resolved timestamp for the set of spans that partition is responsible for.

craig bot pushed a commit that referenced this issue Feb 18, 2021
60483: bulkio: Implement `CREATE REPLICATION STREAM` r=miretskiy a=miretskiy

Initial implementation of `CREATE REPLICATION STREAM`.

The implementation uses changefeed distflow processing, which has
been refactored to accommodate this new use case.

The replication stream expects to receive raw KVs.  This is
accomplished by implementing native encoding in changefeeds:
this encoder emits raw bytes representing keys and values.

The plan hook does a "core"-style changefeed -- that is, it
expects the client to be connected to receive changed rows.

Follow on work will implement replication stream resumer
as well as replication stream sinks.

The other commits in this PR add SQL grammar definitions, as well
as add minor tweaks to CDC code to enable configuration for
streaming use case.

Informs #57422

Release Notes: None

Co-authored-by: Yevgeniy Miretskiy <yevgeniy@cockroachlabs.com>
@blathers-crl blathers-crl bot added the T-cdc label Sep 29, 2021

amruss commented Oct 4, 2021

Closing this due to #60483

@amruss amruss closed this as completed Oct 4, 2021
@exalate-issue-sync exalate-issue-sync bot removed the T-cdc label Feb 1, 2022