Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add data stream support to CCR #61993

Merged
merged 21 commits into from
Nov 3, 2020

Conversation

martijnvg
Copy link
Member

This is a draft PR, so that the overall approach can be validated before adding more tests.

This commit adds support data stream support to CCR's auto following by making the following changes:

  • When the auto follow coordinator iterates over the candidate indices to follow,
    the auto follow coordinator also checks whether the index is part of a data stream and
    if the name of data stream also matches with the auto follow pattern then the index
    will be auto followed.
  • When following an index, the put follow api also checks whether that index is part
    of a data stream and if so then also replicates the data stream definition to the
    local cluster.
  • In order for the follow index api to determine whether an index is part of a data
    stream, the cluster state api was modified to also fetch the data stream definition
    of the cluster state if only the state is queried for specific indices.

When a data stream is auto followed, only new backing indices are auto followed.
This is in line with how time based indices patterns are replicated today. This
means that the data stream isn't copied 1 to 1 into the local cluster. The local
cluster's data stream definition contains the same name, timestamp field and
generation, but the list of backing indices may be different (depending on when
a data stream was auto followed).

Closes #56259

This commit adds support data stream support to CCR's auto following by making the following changes:
* When the auto follow coordinator iterates over the candidate indices to follow,
  the auto follow coordinator also checks whether the index is part of a data stream and
  if the name of data stream also matches with the auto follow pattern then the index
  will be auto followed.
* When following an index, the put follow api also checks whether that index is part
  of a data stream and if so then also replicates the data stream definition to the
  local cluster.
* In order for the follow index api to determine whether an index is part of a data
  stream, the cluster state api was modified to also fetch the data stream definition
  of the cluster state if only the state is queried for specific indices.

When a data stream is auto followed, only new backing indices are auto followed.
This is in line with how time based indices patterns are replicated today. This
means that the data stream isn't copied 1 to 1 into the local cluster. The local
cluster's data stream definition contains the same name, timestamp field and
generation, but the list of backing indices may be different (depending on when
a data stream was auto followed).

Closes elastic#56259
@martijnvg martijnvg added >enhancement :Distributed Indexing/CCR Issues around the Cross Cluster State Replication features v8.0.0 :Data Management/Data streams Data streams and their lifecycles v7.10.0 labels Sep 4, 2020
@dnhatn
Copy link
Member

dnhatn commented Sep 18, 2020

Thanks @martijnvg (and sorry for the delay. I was offline for the last several days). I think this approach should work. I will do a detailed review when this PR becomes formal. I think we can make some optimizations for replicating data stream indices, such as not creating follow-tasks for backing indices. But let's do it a follow-up after this PR.

@martijnvg
Copy link
Member Author

Thanks @dnhatn, I will make this PR ready for a detailed review.

@martijnvg martijnvg marked this pull request as ready for review September 21, 2020 13:11
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-features (:Core/Features/Data streams)

@elasticmachine elasticmachine added the Team:Data Management Meta label for data/management team label Sep 21, 2020
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (:Distributed/CCR)

@elasticmachine elasticmachine added the Team:Distributed Meta label for distributed team (obsolete) label Sep 21, 2020
@dnhatn
Copy link
Member

dnhatn commented Sep 23, 2020

@elasticmachine update branch

Copy link
Member

@dnhatn dnhatn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @martijnvg. I wonder if we should allow using data stream directly in a put-follow request instead of its backing index. We also need a better validation as an auto-follow pattern or a put-follow request with a different follower index does not work.

restoreService.restoreSnapshot(restoreRequest, delegatelistener);
} else {
String followerIndexName = request.getFollowerIndex();
BiConsumer<ClusterState, Metadata.Builder> updater = (currentState, mdBuilder) -> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make this change directly in RestoreService? A SnapshotInfo from CcrReposity should contain the data streams from the leader cluster?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me see if that works. If so, then I agree that is a cleaner change.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The RestoreService currently only restores the data stream metadata if a data stream needs to be restored or an entire snapshot. In the case that a specific backing index is restored the associated data stream isn't restored. The put follow only 'restores' a backing index at a time and if this logic would be included in RestoreService then I think we always need to load and check global metadata state in order to determine whether an index is a backing index of a data stream (when #61525 is merged then some backing indices may not follow the backing index naming scheme).

I think we would need to introduce additional parameter on RestoreSnapshotRequest to indicate that we intent to restore a backing index and therefor the associated data stream should be restored as well (or existing local data stream should be updated). Otherwise regular restore operations would always need to load the global metadata state in order to determine whether an index is a backing index of a data stream. Do you think that this is still an improvement over the current logic?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, let's keep the current implementation. We can look into the improvement once the datastream becomes mature.

@martijnvg
Copy link
Member Author

I wonder if we should allow using data stream directly in a put-follow request instead of its backing index

A data stream is a moving target. Indices get added to it and removed from it. From that perspective I think the put follow api shouldn't follow a data stream, since what this api does is centered around an index.

We also need a better validation as an auto-follow pattern or a put-follow request with a different follower index does not work.

I think I don't fully understand. What validation are you referring to specifically?

@dnhatn
Copy link
Member

dnhatn commented Oct 1, 2020

A data stream is a moving target. Indices get added to it and removed from it. From that perspective I think the put follow api shouldn't follow a data stream, since what this api does is centered around an index.

+1. We should have a better error message to tell users to use auto-follow when they use put-follow to replicate data streams.

We also need a better validation as an auto-follow pattern or a put-follow request with a different follower index does not work.

I think I don't fully understand. What validation are you referring to specifically?

If follow_index_pattern of an auto-follow pattern is specified, then CCR can create a follower index that does not match the data stream of its leader index.

@martijnvg
Copy link
Member Author

We should have a better error message to tell users to use auto-follow when they use put-follow to replicate data streams.

A put follow that targets a data stream should return a client error.

If follow_index_pattern of an auto-follow pattern is specified, then CCR can create a follower index that does not match the data stream of its leader index.

Gotcha. This shouldn't be possible.

@martijnvg martijnvg requested a review from dnhatn October 22, 2020 07:10
Copy link
Member

@dnhatn dnhatn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left two small questions; otherwise looking good. Thanks Martijn!

if (localDataStream == null) {
// The data stream and the backing indices have been created and validated in the remote cluster,
// just copying the data stream is in this case safe.
return new DataStream(remoteDataStream.getName(), remoteDataStream.getTimeStampField(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can't use the generation from the remoteDataStream as it will break the assertion in DataStream's constructor. For example, here we are following .ds-1 and the ds generation on the remote is 2 already.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That assertion will be removed: https://github.com/elastic/elasticsearch/pull/61525/files#diff-c1a5087d37ad81d4b2339167eadc38bcee234a4f7d646b7b4b193ae665075334L60

This to support migrating write alias with indices to a data stream. It was added as sanity check
initially, because backing indices were created by Elasticsearch exclusively.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok

// (string sorting works because of the naming backing index naming scheme)
backingIndices.sort(Comparator.comparing(Index::getName));

return new DataStream(localDataStream.getName(), localDataStream.getTimeStampField(), backingIndices,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the generation of the local datastream is higher than the remote?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The generation isn't really used yet, so there isn't a real consequence when the local data stream's generation is higher than the leader's data stream.

This would happen if the local data stream would be rollover independently. This is possible, but shouldn't really happen. Because that would create an index that doesn't follow anything and will remain empty and unused. The rollover should only occur in the remote cluster.

So I think we should try to prevent data streams from being rolled over when that data stream is actually just following another data stream. What I think we can do for this is add a flag to a data stream that it is replicated and a rollover would disallow rolling over a replicated data stream. When unfollowing that data stream, the replicate flag would be unset. What do you think about this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I think we should try to prevent data streams from being rolled over when that data stream is actually just following another data stream. What I think we can do for this is add a flag to a data stream that it is replicated and a rollover would disallow rolling over a replicated data stream. When unfollowing that data stream, the replicate flag would be unset. What do you think about this?

+1. This makes a lot of sense.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Martijn will implement this in a follow up.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, in a followup PR, I will add an additional field to DataStream class that indicates that it is a replicated data stream. Then in the rollover api validation will be added that prevents rolling over a replicated data stream.

Copy link
Member

@dnhatn dnhatn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @martijnvg.

@martijnvg martijnvg merged commit c4c3c8b into elastic:master Nov 3, 2020
martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request Nov 3, 2020
This commit adds support data stream support to CCR's auto following by making the following changes:
* When the auto follow coordinator iterates over the candidate indices to follow,
  the auto follow coordinator also checks whether the index is part of a data stream and
  if the name of data stream also matches with the auto follow pattern then the index
  will be auto followed.
* When following an index, the put follow api also checks whether that index is part
  of a data stream and if so then also replicates the data stream definition to the
  local cluster.
* In order for the follow index api to determine whether an index is part of a data
  stream, the cluster state api was modified to also fetch the data stream definition
  of the cluster state if only the state is queried for specific indices.

When a data stream is auto followed, only new backing indices are auto followed.
This is in line with how time based indices patterns are replicated today. This
means that the data stream isn't copied 1 to 1 into the local cluster. The local
cluster's data stream definition contains the same name, timestamp field and
generation, but the list of backing indices may be different (depending on when
a data stream was auto followed).

Closes elastic#56259
martijnvg added a commit that referenced this pull request Nov 3, 2020
Backport of #61993 to 7.x branch.

This commit adds support data stream support to CCR's auto following by making the following changes:
* When the auto follow coordinator iterates over the candidate indices to follow,
  the auto follow coordinator also checks whether the index is part of a data stream and
  if the name of data stream also matches with the auto follow pattern then the index
  will be auto followed.
* When following an index, the put follow api also checks whether that index is part
  of a data stream and if so then also replicates the data stream definition to the
  local cluster.
* In order for the follow index api to determine whether an index is part of a data
  stream, the cluster state api was modified to also fetch the data stream definition
  of the cluster state if only the state is queried for specific indices.

When a data stream is auto followed, only new backing indices are auto followed.
This is in line with how time based indices patterns are replicated today. This
means that the data stream isn't copied 1 to 1 into the local cluster. The local
cluster's data stream definition contains the same name, timestamp field and
generation, but the list of backing indices may be different (depending on when
a data stream was auto followed).

Closes #56259
martijnvg added a commit that referenced this pull request Dec 8, 2020
When a data stream is being auto followed then a rollover in a local cluster can break auto following,
if the local cluster performs a rollover then it creates a new write index and if then later the remote
cluster rolls over as well then that new write index can't be replicated, because it has the same name
as in the write index in the local cluster, which was created earlier.

If a data stream is managed by ccr, then the local cluster should not do a rollover for those data streams.
The data stream should be rolled over in the remote cluster and that change should replicate to the local
cluster. Performing a rollover in the local cluster is an operation that the data stream support in ccr should
perform.

To protect against rolling over a replicated data stream, this PR adds a replicate field to DataStream class.
The rollover api will fail with an error in case a data stream is being rolled over and the targeted data stream is
a replicated data stream. When the put follow api creates a data stream in the local cluster then the replicate flag
is set to true. There should be a way to turn a replicated data stream into a regular data stream when for example
during disaster recovery. The newly added api in this pr (promote data stream api) is doing that. After a replicated
data stream is promoted to a regular data stream then the local data stream can be rolled over, so that the new
write index is no longer a follower index. Also if the put follow api is attempting to update this data stream
(for example to attempt to resume auto following) then that with fail, because the data stream is no longer a
replicated data stream.

Today with time based indices behind an alias, the is_write_index property isn't replicated from remote cluster
to the local cluster, so when attempting to rollover the alias in the local cluster the rollover fails, because the
alias doesn't have a write index. The added replicated field in the DataStream class and added validation
achieve the same kind of protection, but in a more robust way.

A followup from #61993.
martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request Dec 8, 2020
Backporting elastic#64710 to the 7.x branch.

When a data stream is being auto followed then a rollover in a local cluster can break auto following,
if the local cluster performs a rollover then it creates a new write index and if then later the remote
cluster rolls over as well then that new write index can't be replicated, because it has the same name
as in the write index in the local cluster, which was created earlier.

If a data stream is managed by ccr, then the local cluster should not do a rollover for those data streams.
The data stream should be rolled over in the remote cluster and that change should replicate to the local
cluster. Performing a rollover in the local cluster is an operation that the data stream support in ccr should
perform.

To protect against rolling over a replicated data stream, this PR adds a replicate field to DataStream class.
The rollover api will fail with an error in case a data stream is being rolled over and the targeted data stream is
a replicated data stream. When the put follow api creates a data stream in the local cluster then the replicate flag
is set to true. There should be a way to turn a replicated data stream into a regular data stream when for example
during disaster recovery. The newly added api in this pr (promote data stream api) is doing that. After a replicated
data stream is promoted to a regular data stream then the local data stream can be rolled over, so that the new
write index is no longer a follower index. Also if the put follow api is attempting to update this data stream
(for example to attempt to resume auto following) then that with fail, because the data stream is no longer a
replicated data stream.

Today with time based indices behind an alias, the is_write_index property isn't replicated from remote cluster
to the local cluster, so when attempting to rollover the alias in the local cluster the rollover fails, because the
alias doesn't have a write index. The added replicated field in the DataStream class and added validation
achieve the same kind of protection, but in a more robust way.

A followup from elastic#61993.
martijnvg added a commit that referenced this pull request Dec 8, 2020
Backporting #64710 to the 7.x branch.

When a data stream is being auto followed then a rollover in a local cluster can break auto following,
if the local cluster performs a rollover then it creates a new write index and if then later the remote
cluster rolls over as well then that new write index can't be replicated, because it has the same name
as in the write index in the local cluster, which was created earlier.

If a data stream is managed by ccr, then the local cluster should not do a rollover for those data streams.
The data stream should be rolled over in the remote cluster and that change should replicate to the local
cluster. Performing a rollover in the local cluster is an operation that the data stream support in ccr should
perform.

To protect against rolling over a replicated data stream, this PR adds a replicate field to DataStream class.
The rollover api will fail with an error in case a data stream is being rolled over and the targeted data stream is
a replicated data stream. When the put follow api creates a data stream in the local cluster then the replicate flag
is set to true. There should be a way to turn a replicated data stream into a regular data stream when for example
during disaster recovery. The newly added api in this pr (promote data stream api) is doing that. After a replicated
data stream is promoted to a regular data stream then the local data stream can be rolled over, so that the new
write index is no longer a follower index. Also if the put follow api is attempting to update this data stream
(for example to attempt to resume auto following) then that with fail, because the data stream is no longer a
replicated data stream.

Today with time based indices behind an alias, the is_write_index property isn't replicated from remote cluster
to the local cluster, so when attempting to rollover the alias in the local cluster the rollover fails, because the
alias doesn't have a write index. The added replicated field in the DataStream class and added validation
achieve the same kind of protection, but in a more robust way.

A followup from #61993
dakrone added a commit to dakrone/elasticsearch that referenced this pull request Jul 29, 2022
This commit fixes the situation where a user wants to use CCR to replicate indices that are part of
a data stream while renaming the data stream. For example, assume a user has an auto-follow request
that looks like this:

```
PUT /_ccr/auto_follow/my-auto-follow-pattern
{
  "remote_cluster" : "other-cluster",
  "leader_index_patterns" : ["logs-*"],
  "follow_index_pattern" : "{{leader_index}}_copy"
}
```

And then the data stream `logs-mysql-error` was created, creating the backing index
`.ds-logs-mysql-error-2022-07-29-000001`.

Prior to this commit, replicating this data stream means that the backing index would be renamed to
`.ds-logs-mysql-error-2022-07-29-000001_copy` and the data stream would *not* be renamed. This
caused a check to trip in `TransportPutLifecycleAction` asserting that a backing index was not
renamed for a data stream during following.

After this commit, there are a couple of changes:

First, the data stream will also be renamed. This means that the `logs-mysql-error` becomes
`logs-mysql-error_copy` when created on the follower cluster. Because of the way that CCR works,
this means we need to support renaming a data stream for a regular "create follower" request, so a
new parameter has been added: `data_stream_name`. It works like this:

```
PUT /mynewindex/_ccr/follow
{
  "remote_cluster": "other-cluster",
  "leader_index": "myotherindex",
  "data_stream_name": "new_ds"
}
```

Second, the backing index for a data stream must be renamed in a way that does not break the parsing
of a data stream backing pattern, whereas previously the index
`.ds-logs-mysql-error-2022-07-29-000001` would be renamed to
`.ds-logs-mysql-error-2022-07-29-000001_copy` (an illegal name since it doesn't end with the
rollover digit), after this commit it will be renamed to
`.ds-logs-mysql-error_copy-2022-07-29-000001` to match the renamed data stream. This means that for
the given `follow_index_pattern` of `{{leader_index}}_copy` the index changes look like:

| Leader Cluster | Follower Cluster |
|--------------|-----------|
| `logs-mysql-error` (data stream) | `logs-mysql-error_copy` (data stream) |
| `.ds-logs-mysql-error-2022-07-29-000001`      | `.ds-logs-mysql-error_copy-2022-07-29-000001` |

Which internally means the auto-follow request turned into the create follower request of:

```
PUT /.ds-logs-mysql-error_copy-2022-07-29-000001/_ccr/follow
{
  "remote_cluster": "other-cluster",
  "leader_index": ".ds-logs-mysql-error-2022-07-29-000001",
  "data_stream_name": "logs-mysql-error_copy"
}
```

Relates to elastic#84940 (cherry-picked the commit for a test)
Relates to elastic#61993 (where data stream support was first introduced for CCR)
Resolves elastic#81751
dakrone added a commit that referenced this pull request Aug 1, 2022
This commit fixes the situation where a user wants to use CCR to replicate indices that are part of
a data stream while renaming the data stream. For example, assume a user has an auto-follow request
that looks like this:

```
PUT /_ccr/auto_follow/my-auto-follow-pattern
{
  "remote_cluster" : "other-cluster",
  "leader_index_patterns" : ["logs-*"],
  "follow_index_pattern" : "{{leader_index}}_copy"
}
```

And then the data stream `logs-mysql-error` was created, creating the backing index
`.ds-logs-mysql-error-2022-07-29-000001`.

Prior to this commit, replicating this data stream means that the backing index would be renamed to
`.ds-logs-mysql-error-2022-07-29-000001_copy` and the data stream would *not* be renamed. This
caused a check to trip in `TransportPutLifecycleAction` asserting that a backing index was not
renamed for a data stream during following.

After this commit, there are a couple of changes:

First, the data stream will also be renamed. This means that the `logs-mysql-error` becomes
`logs-mysql-error_copy` when created on the follower cluster. Because of the way that CCR works,
this means we need to support renaming a data stream for a regular "create follower" request, so a
new parameter has been added: `data_stream_name`. It works like this:

```
PUT /mynewindex/_ccr/follow
{
  "remote_cluster": "other-cluster",
  "leader_index": "myotherindex",
  "data_stream_name": "new_ds"
}
```

Second, the backing index for a data stream must be renamed in a way that does not break the parsing
of a data stream backing pattern, whereas previously the index
`.ds-logs-mysql-error-2022-07-29-000001` would be renamed to
`.ds-logs-mysql-error-2022-07-29-000001_copy` (an illegal name since it doesn't end with the
rollover digit), after this commit it will be renamed to
`.ds-logs-mysql-error_copy-2022-07-29-000001` to match the renamed data stream. This means that for
the given `follow_index_pattern` of `{{leader_index}}_copy` the index changes look like:

| Leader Cluster | Follower Cluster |
|--------------|-----------|
| `logs-mysql-error` (data stream) | `logs-mysql-error_copy` (data stream) |
| `.ds-logs-mysql-error-2022-07-29-000001`      | `.ds-logs-mysql-error_copy-2022-07-29-000001` |

Which internally means the auto-follow request turned into the create follower request of:

```
PUT /.ds-logs-mysql-error_copy-2022-07-29-000001/_ccr/follow
{
  "remote_cluster": "other-cluster",
  "leader_index": ".ds-logs-mysql-error-2022-07-29-000001",
  "data_stream_name": "logs-mysql-error_copy"
}
```

Relates to #84940 (cherry-picked the commit for a test)
Relates to #61993 (where data stream support was first introduced for CCR)
Resolves #81751
dakrone added a commit to dakrone/elasticsearch that referenced this pull request Aug 1, 2022
This commit fixes the situation where a user wants to use CCR to replicate indices that are part of
a data stream while renaming the data stream. For example, assume a user has an auto-follow request
that looks like this:

```
PUT /_ccr/auto_follow/my-auto-follow-pattern
{
  "remote_cluster" : "other-cluster",
  "leader_index_patterns" : ["logs-*"],
  "follow_index_pattern" : "{{leader_index}}_copy"
}
```

And then the data stream `logs-mysql-error` was created, creating the backing index
`.ds-logs-mysql-error-2022-07-29-000001`.

Prior to this commit, replicating this data stream means that the backing index would be renamed to
`.ds-logs-mysql-error-2022-07-29-000001_copy` and the data stream would *not* be renamed. This
caused a check to trip in `TransportPutLifecycleAction` asserting that a backing index was not
renamed for a data stream during following.

After this commit, there are a couple of changes:

First, the data stream will also be renamed. This means that the `logs-mysql-error` becomes
`logs-mysql-error_copy` when created on the follower cluster. Because of the way that CCR works,
this means we need to support renaming a data stream for a regular "create follower" request, so a
new parameter has been added: `data_stream_name`. It works like this:

```
PUT /mynewindex/_ccr/follow
{
  "remote_cluster": "other-cluster",
  "leader_index": "myotherindex",
  "data_stream_name": "new_ds"
}
```

Second, the backing index for a data stream must be renamed in a way that does not break the parsing
of a data stream backing pattern, whereas previously the index
`.ds-logs-mysql-error-2022-07-29-000001` would be renamed to
`.ds-logs-mysql-error-2022-07-29-000001_copy` (an illegal name since it doesn't end with the
rollover digit), after this commit it will be renamed to
`.ds-logs-mysql-error_copy-2022-07-29-000001` to match the renamed data stream. This means that for
the given `follow_index_pattern` of `{{leader_index}}_copy` the index changes look like:

| Leader Cluster | Follower Cluster |
|--------------|-----------|
| `logs-mysql-error` (data stream) | `logs-mysql-error_copy` (data stream) |
| `.ds-logs-mysql-error-2022-07-29-000001`      | `.ds-logs-mysql-error_copy-2022-07-29-000001` |

Which internally means the auto-follow request turned into the create follower request of:

```
PUT /.ds-logs-mysql-error_copy-2022-07-29-000001/_ccr/follow
{
  "remote_cluster": "other-cluster",
  "leader_index": ".ds-logs-mysql-error-2022-07-29-000001",
  "data_stream_name": "logs-mysql-error_copy"
}
```

Relates to elastic#84940 (cherry-picked the commit for a test)
Relates to elastic#61993 (where data stream support was first introduced for CCR)
Resolves elastic#81751

(cherry picked from commit 3420be0)
dakrone added a commit that referenced this pull request Aug 1, 2022
This commit fixes the situation where a user wants to use CCR to replicate indices that are part of
a data stream while renaming the data stream. For example, assume a user has an auto-follow request
that looks like this:

```
PUT /_ccr/auto_follow/my-auto-follow-pattern
{
  "remote_cluster" : "other-cluster",
  "leader_index_patterns" : ["logs-*"],
  "follow_index_pattern" : "{{leader_index}}_copy"
}
```

And then the data stream `logs-mysql-error` was created, creating the backing index
`.ds-logs-mysql-error-2022-07-29-000001`.

Prior to this commit, replicating this data stream means that the backing index would be renamed to
`.ds-logs-mysql-error-2022-07-29-000001_copy` and the data stream would *not* be renamed. This
caused a check to trip in `TransportPutLifecycleAction` asserting that a backing index was not
renamed for a data stream during following.

After this commit, there are a couple of changes:

First, the data stream will also be renamed. This means that the `logs-mysql-error` becomes
`logs-mysql-error_copy` when created on the follower cluster. Because of the way that CCR works,
this means we need to support renaming a data stream for a regular "create follower" request, so a
new parameter has been added: `data_stream_name`. It works like this:

```
PUT /mynewindex/_ccr/follow
{
  "remote_cluster": "other-cluster",
  "leader_index": "myotherindex",
  "data_stream_name": "new_ds"
}
```

Second, the backing index for a data stream must be renamed in a way that does not break the parsing
of a data stream backing pattern, whereas previously the index
`.ds-logs-mysql-error-2022-07-29-000001` would be renamed to
`.ds-logs-mysql-error-2022-07-29-000001_copy` (an illegal name since it doesn't end with the
rollover digit), after this commit it will be renamed to
`.ds-logs-mysql-error_copy-2022-07-29-000001` to match the renamed data stream. This means that for
the given `follow_index_pattern` of `{{leader_index}}_copy` the index changes look like:

| Leader Cluster | Follower Cluster |
|--------------|-----------|
| `logs-mysql-error` (data stream) | `logs-mysql-error_copy` (data stream) |
| `.ds-logs-mysql-error-2022-07-29-000001`      | `.ds-logs-mysql-error_copy-2022-07-29-000001` |

Which internally means the auto-follow request turned into the create follower request of:

```
PUT /.ds-logs-mysql-error_copy-2022-07-29-000001/_ccr/follow
{
  "remote_cluster": "other-cluster",
  "leader_index": ".ds-logs-mysql-error-2022-07-29-000001",
  "data_stream_name": "logs-mysql-error_copy"
}
```

Relates to #84940 (cherry-picked the commit for a test)
Relates to #61993 (where data stream support was first introduced for CCR)
Resolves #81751
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Data Management/Data streams Data streams and their lifecycles :Distributed Indexing/CCR Issues around the Cross Cluster State Replication features >docs General docs changes >enhancement release highlight Team:Data Management Meta label for data/management team Team:Distributed Meta label for distributed team (obsolete) Team:Docs Meta label for docs team v7.11.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Integrate data streams with CCR
6 participants