Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CCR] Added auto follow patterns feature #33118

Merged
merged 25 commits into from
Sep 6, 2018
Merged
Show file tree
Hide file tree
Changes from 5 commits
Commits
Show all changes
25 commits
Select commit Hold shift + click to select a range
75b77f8
[CCR] Added auto follow patterns feature
martijnvg Aug 21, 2018
e5160d5
Merge remote-tracking branch 'es/ccr' into ccr_auto_follow
martijnvg Aug 27, 2018
78c178a
iter
martijnvg Aug 27, 2018
80b517e
single line for each field in hashcode method
martijnvg Aug 27, 2018
f818a5d
fixed delete auto follow pattern and added unit tests for it
martijnvg Aug 27, 2018
1292150
Merge remote-tracking branch 'es/ccr' into ccr_auto_follow
martijnvg Aug 28, 2018
c7c2287
iter and moved logic to determine leader indices to follow and logic to
martijnvg Aug 28, 2018
3baa210
make handleClusterAlias() method more readable
martijnvg Aug 28, 2018
812ebb7
since the test does not explicitly execute create_and_follow we need
martijnvg Aug 28, 2018
1943f41
Merge remote-tracking branch 'es/ccr' into ccr_auto_follow
martijnvg Aug 29, 2018
5fcfb3e
added missing @Override
martijnvg Aug 29, 2018
d2b0643
rename
martijnvg Aug 29, 2018
1634592
Keep track of all updateError instances
martijnvg Aug 29, 2018
5edc457
Added a TODO and passed down follower cluster state so that in the fu…
martijnvg Aug 29, 2018
5045a62
pushed too fast
martijnvg Aug 29, 2018
df5fa78
Merge remote-tracking branch 'es/ccr' into ccr_auto_follow
martijnvg Aug 30, 2018
91dedcd
changed api path from /_ccr/_autofollow to /_ccr/_auto_follow
martijnvg Aug 30, 2018
cf792d9
iter
martijnvg Aug 30, 2018
4c4caee
simplified innerPut() method and added AutoFollowPattern.match() method
martijnvg Aug 30, 2018
01819ab
iter delete autofollow api
martijnvg Aug 30, 2018
6907ce9
Merge branch 'master' into ccr_auto_follow
jasontedor Sep 2, 2018
720e53e
Merge remote-tracking branch 'es/master' into ccr_auto_follow
martijnvg Sep 5, 2018
cf984b3
fixed yaml tests
martijnvg Sep 5, 2018
f8e9ffe
Delegate to simpleMatch, it will probably be inlined anyway
jasontedor Sep 6, 2018
7dea40c
Organize endpoints
jasontedor Sep 6, 2018
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -78,6 +78,35 @@ public void testFollowIndex() throws Exception {
}
}

public void testAutoFollowPatterns() throws Exception {
assumeFalse("Test should only run when both clusters are running", runningAgainstLeaderCluster);

Request request = new Request("PUT", "/_ccr/_autofollow/leader_cluster");
request.setJsonEntity("{\"leader_index_patterns\": [\"logs-*\"]}");
assertOK(client().performRequest(request));

try (RestClient leaderClient = buildLeaderClient()) {
Settings settings = Settings.builder()
.put("index.soft_deletes.enabled", true)
.build();
request = new Request("PUT", "/logs-20190101");
request.setJsonEntity("{\"settings\": " + Strings.toString(settings) +
", \"mappings\": {\"_doc\": {\"properties\": {\"field\": {\"type\": \"keyword\"}}}} }");
assertOK(leaderClient.performRequest(request));

for (int i = 0; i < 5; i++) {
String id = Integer.toString(i);
index(leaderClient, "logs-20190101", id, "field", i, "filtered_field", "true");
}
}

assertBusy(() -> {
Request indexExists = new Request("HEAD", "/logs-20190101");
assertOK(client().performRequest(indexExists));
verifyDocuments("logs-20190101", 5);
});
}

private static void index(RestClient client, String index, String id, Object... fields) throws IOException {
XContentBuilder document = jsonBuilder().startObject();
for (int i = 0; i < fields.length; i += 2) {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -39,21 +39,28 @@
import org.elasticsearch.threadpool.FixedExecutorBuilder;
import org.elasticsearch.threadpool.ThreadPool;
import org.elasticsearch.watcher.ResourceWatcherService;
import org.elasticsearch.xpack.ccr.action.AutoFollowCoordinator;
import org.elasticsearch.xpack.ccr.action.CcrStatsAction;
import org.elasticsearch.xpack.ccr.action.CreateAndFollowIndexAction;
import org.elasticsearch.xpack.ccr.action.DeleteAutoFollowPatternAction;
import org.elasticsearch.xpack.ccr.action.FollowIndexAction;
import org.elasticsearch.xpack.ccr.action.PutAutoFollowPatternAction;
import org.elasticsearch.xpack.ccr.action.ShardChangesAction;
import org.elasticsearch.xpack.ccr.action.ShardFollowNodeTask;
import org.elasticsearch.xpack.ccr.action.ShardFollowTask;
import org.elasticsearch.xpack.ccr.action.ShardFollowTasksExecutor;
import org.elasticsearch.xpack.ccr.action.TransportCcrStatsAction;
import org.elasticsearch.xpack.ccr.action.TransportDeleteAutoFollowPatternAction;
import org.elasticsearch.xpack.ccr.action.TransportPutAutoFollowPatternAction;
import org.elasticsearch.xpack.ccr.action.UnfollowIndexAction;
import org.elasticsearch.xpack.ccr.action.bulk.BulkShardOperationsAction;
import org.elasticsearch.xpack.ccr.action.bulk.TransportBulkShardOperationsAction;
import org.elasticsearch.xpack.ccr.index.engine.FollowingEngineFactory;
import org.elasticsearch.xpack.ccr.rest.RestCcrStatsAction;
import org.elasticsearch.xpack.ccr.rest.RestCreateAndFollowIndexAction;
import org.elasticsearch.xpack.ccr.rest.RestDeleteAutoFollowPatternAction;
import org.elasticsearch.xpack.ccr.rest.RestFollowIndexAction;
import org.elasticsearch.xpack.ccr.rest.RestPutAutoFollowPatternAction;
import org.elasticsearch.xpack.ccr.rest.RestUnfollowIndexAction;
import org.elasticsearch.xpack.core.XPackPlugin;

Expand Down Expand Up @@ -113,7 +120,14 @@ public Collection<Object> createComponents(
final Environment environment,
final NodeEnvironment nodeEnvironment,
final NamedWriteableRegistry namedWriteableRegistry) {
return Collections.singleton(ccrLicenseChecker);
if (enabled == false) {
return emptyList();
}

return Arrays.asList(
ccrLicenseChecker,
new AutoFollowCoordinator(settings, client, threadPool, clusterService)
);
}

@Override
Expand All @@ -133,7 +147,9 @@ public List<PersistentTasksExecutor<?>> getPersistentTasksExecutor(ClusterServic
new ActionHandler<>(CreateAndFollowIndexAction.INSTANCE, CreateAndFollowIndexAction.TransportAction.class),
new ActionHandler<>(FollowIndexAction.INSTANCE, FollowIndexAction.TransportAction.class),
new ActionHandler<>(ShardChangesAction.INSTANCE, ShardChangesAction.TransportAction.class),
new ActionHandler<>(UnfollowIndexAction.INSTANCE, UnfollowIndexAction.TransportAction.class));
new ActionHandler<>(UnfollowIndexAction.INSTANCE, UnfollowIndexAction.TransportAction.class),
new ActionHandler<>(PutAutoFollowPatternAction.INSTANCE, TransportPutAutoFollowPatternAction.class),
new ActionHandler<>(DeleteAutoFollowPatternAction.INSTANCE, TransportDeleteAutoFollowPatternAction.class));
}

public List<RestHandler> getRestHandlers(Settings settings, RestController restController, ClusterSettings clusterSettings,
Expand All @@ -144,7 +160,10 @@ public List<RestHandler> getRestHandlers(Settings settings, RestController restC
new RestCcrStatsAction(settings, restController),
new RestCreateAndFollowIndexAction(settings, restController),
new RestFollowIndexAction(settings, restController),
new RestUnfollowIndexAction(settings, restController));
new RestUnfollowIndexAction(settings, restController),
new RestPutAutoFollowPatternAction(settings, restController),
new RestDeleteAutoFollowPatternAction(settings, restController)
);
}

public List<NamedWriteableRegistry.Entry> getNamedWriteables() {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@

import org.elasticsearch.common.settings.Setting;
import org.elasticsearch.common.settings.Setting.Property;
import org.elasticsearch.common.unit.TimeValue;

import java.util.Arrays;
import java.util.List;
Expand All @@ -32,6 +33,12 @@ private CcrSettings() {
public static final Setting<Boolean> CCR_FOLLOWING_INDEX_SETTING =
Setting.boolSetting("index.xpack.ccr.following_index", false, Setting.Property.IndexScope);

/**
* Setting for controlling the interval in between polling leader clusters to check whether there are indices to follow
*/
public static final Setting<TimeValue> CCR_AUTO_FOLLOW_POLL_INTERVAL =
Setting.timeSetting("xpack.ccr.auto_follow.poll_interval", TimeValue.timeValueMillis(2500), Property.NodeScope);

/**
* The settings defined by CCR.
*
Expand All @@ -40,7 +47,8 @@ private CcrSettings() {
static List<Setting<?>> getSettings() {
return Arrays.asList(
CCR_ENABLED_SETTING,
CCR_FOLLOWING_INDEX_SETTING);
CCR_FOLLOWING_INDEX_SETTING,
CCR_AUTO_FOLLOW_POLL_INTERVAL);
}

}
Original file line number Diff line number Diff line change
@@ -0,0 +1,261 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License;
* you may not use this file except in compliance with the Elastic License.
*/
package org.elasticsearch.xpack.ccr.action;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.elasticsearch.action.ActionListener;
import org.elasticsearch.action.admin.cluster.state.ClusterStateRequest;
import org.elasticsearch.client.Client;
import org.elasticsearch.cluster.ClusterChangedEvent;
import org.elasticsearch.cluster.ClusterState;
import org.elasticsearch.cluster.ClusterStateApplier;
import org.elasticsearch.cluster.ClusterStateUpdateTask;
import org.elasticsearch.cluster.metadata.IndexMetaData;
import org.elasticsearch.cluster.metadata.MetaData;
import org.elasticsearch.cluster.service.ClusterService;
import org.elasticsearch.common.regex.Regex;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.unit.TimeValue;
import org.elasticsearch.threadpool.ThreadPool;
import org.elasticsearch.xpack.ccr.CcrSettings;
import org.elasticsearch.xpack.core.ccr.AutoFollowMetadata;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.BiConsumer;
import java.util.function.Consumer;
import java.util.function.Function;

/**
* A component that runs only on the elected master node and follows leader indices automatically
* if they match with a auto follow pattern that is defined in {@link AutoFollowMetadata}.
*/
public class AutoFollowCoordinator implements ClusterStateApplier {

private static final Logger LOGGER = LogManager.getLogger(AutoFollowCoordinator.class);

private final Client client;
private final TimeValue pollInterval;
private final ThreadPool threadPool;
private final ClusterService clusterService;

private volatile boolean localNodeMaster = false;

public AutoFollowCoordinator(Settings settings,
Client client,
ThreadPool threadPool,
ClusterService clusterService) {
this.client = client;
this.threadPool = threadPool;
this.clusterService = clusterService;

this.pollInterval = CcrSettings.CCR_AUTO_FOLLOW_POLL_INTERVAL.get(settings);
clusterService.addStateApplier(this);
}

void doAutoFollow() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can it be private?

if (localNodeMaster == false) {
return;
}
ClusterState localClusterState = clusterService.state();
AutoFollowMetadata autoFollowMetadata = localClusterState.getMetaData().custom(AutoFollowMetadata.TYPE);
if (autoFollowMetadata == null) {
threadPool.schedule(pollInterval, ThreadPool.Names.SAME, this::doAutoFollow);
return;
}

if (autoFollowMetadata.getPatterns().isEmpty()) {
threadPool.schedule(pollInterval, ThreadPool.Names.SAME, this::doAutoFollow);
return;
}

Consumer<Exception> handler = e -> {
if (e != null) {
LOGGER.error("Failure occurred during auto following indices", e);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be at the warn level.

}
threadPool.schedule(pollInterval, ThreadPool.Names.SAME, this::doAutoFollow);
};
AutoFollower operation = new AutoFollower(client, handler, autoFollowMetadata) {

void clusterStateApiCall(Client remoteClient, BiConsumer<ClusterState, Exception> handler) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let us call this parameter a name other than remoteClient since it might not be a remote client. How about leaderClient?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not a fan of the clusterStateApiCall name; how about getLeaderClusterState?

ClusterStateRequest request = new ClusterStateRequest();
request.clear();
request.metaData(true);
remoteClient.admin().cluster().state(request,
ActionListener.wrap(
r -> handler.accept(r.getState(), null),
e -> handler.accept(null, e)
)
);
}

void createAndFollowApiCall(FollowIndexAction.Request followRequest, Consumer<Exception> handler) {
client.execute(CreateAndFollowIndexAction.INSTANCE, new CreateAndFollowIndexAction.Request(followRequest),
ActionListener.wrap(r -> handler.accept(null), handler::accept));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be handler instead of handler::accept.

}

void updateAutoMetadata(Function<ClusterState, ClusterState> updateFunction, Consumer<Exception> handler) {
clusterService.submitStateUpdateTask("update_auto_follow_metadata", new ClusterStateUpdateTask() {

@Override
public ClusterState execute(ClusterState currentState) throws Exception {
return updateFunction.apply(currentState);
}

@Override
public void onFailure(String source, Exception e) {
handler.accept(e);
}

@Override
public void clusterStateProcessed(String source, ClusterState oldState, ClusterState newState) {
handler.accept(null);
}
});
}

};
operation.autoFollowIndices();
}

@Override
public void applyClusterState(ClusterChangedEvent event) {
final boolean beforeLocalMasterNode = localNodeMaster;
localNodeMaster = event.localNodeMaster();
if (beforeLocalMasterNode == false && localNodeMaster) {
threadPool.schedule(pollInterval, ThreadPool.Names.SAME, this::doAutoFollow);
}
}

abstract static class AutoFollower {

private final Client client;
private final Consumer<Exception> handler;
private final AutoFollowMetadata autoFollowMetadata;

private final AtomicInteger executedRequests = new AtomicInteger(0);
private final AtomicReference<Exception> errorHolder = new AtomicReference<>();

AutoFollower(Client client, Consumer<Exception> handler, AutoFollowMetadata autoFollowMetadata) {
this.client = client;
this.handler = handler;
this.autoFollowMetadata = autoFollowMetadata;
}

void autoFollowIndices() {
for (Map.Entry<String, AutoFollowMetadata.AutoFollowPattern> entry : autoFollowMetadata.getPatterns().entrySet()) {
String clusterAlias = entry.getKey();
AutoFollowMetadata.AutoFollowPattern autoFollowPattern = entry.getValue();
Client remoteClient = clusterAlias.equals("_local_") ? client : client.getRemoteClusterClient(clusterAlias);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name of this variable is misleading since it might not be a remote client.

List<String> followedIndices = autoFollowMetadata.getFollowedLeaderIndexUUIDS().get(clusterAlias);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lowercase s in UUIDS.


clusterStateApiCall(remoteClient, (remoteClusterState, e) -> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let us avoid the name remoteClusterState since it might be a local cluster state.

if (remoteClusterState != null) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let us assert that e is null?

handleClusterAlias(clusterAlias, autoFollowPattern, followedIndices, remoteClusterState);
} else {
finalise(e);
}
});
}
}

private void handleClusterAlias(String clusterAlias, AutoFollowMetadata.AutoFollowPattern autoFollowPattern,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let us break this method down a lot. We can chunk it into unit-testable pieces and it should help with readability.

List<String> followedIndexUUIDS, ClusterState remoteClusterState) {
List<IndexMetaData> leaderIndicesToFollow = new ArrayList<>();
String[] patterns = autoFollowPattern.getLeaderIndexPatterns().toArray(new String[0]);
for (IndexMetaData indexMetaData : remoteClusterState.getMetaData()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, this loop could be a method: getLeaderIndicesToFollow and now that's unit-testable. And if we complicate the patterns in the future beyond a simple match, we have an obvious place to change the code, and an obvious place to add very small unit tests.

if (Regex.simpleMatch(patterns, indexMetaData.getIndex().getName())) {
if (followedIndexUUIDS.contains(indexMetaData.getIndex().getUUID()) == false) {
leaderIndicesToFollow.add(indexMetaData);
}
}
}
if (leaderIndicesToFollow.isEmpty()) {
finalise(null);
} else {
AtomicInteger numRequests = new AtomicInteger();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this should be a CountDown too? And perhaps a clearer name than numRequests.

for (IndexMetaData indexToFollow : leaderIndicesToFollow) {
String leaderIndexName = indexToFollow.getIndex().getName();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this block of code can be a method getFollowerIndexName and again we have another unit-testable chunk. If we add patterns more complicated in the future, we again have an obvious place to make the change, and an obvious place to test.

String followIndexName = leaderIndexName;
if (autoFollowPattern.getFollowIndexPattern() != null) {
followIndexName = autoFollowPattern.getFollowIndexPattern().replace("{{leader_index}}", leaderIndexName);
}

String leaderIndexNameWithClusterAliasPrefix = clusterAlias.equals("_local_") ? leaderIndexName :
clusterAlias + ":" + leaderIndexName;
FollowIndexAction.Request followRequest =
new FollowIndexAction.Request(leaderIndexNameWithClusterAliasPrefix, followIndexName,
autoFollowPattern.getMaxBatchOperationCount(), autoFollowPattern.getMaxConcurrentReadBatches(),
autoFollowPattern.getMaxOperationSizeInBytes(), autoFollowPattern.getMaxConcurrentWriteBatches(),
autoFollowPattern.getMaxWriteBufferSize(), autoFollowPattern.getRetryTimeout(),
autoFollowPattern.getIdleShardRetryDelay());

// This runs on the elected master node, so we can update cluster state here:
Consumer<Exception> handler = followError -> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is difficult to read because it breaks the usual linear flow of reading code. So we are defining a callback, and then within that defining another callback and then it's really the updateAutoMetadata that executes first, and then there's another callback defined. I think it might be easier if we broke it down into onSuccess/onFailure instead of the Consumer<Exception> so that we can see the logic separately and think of it as "this is what happens if there is success, and this is what happens if there is failure".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because really what we are after here is:

  • try to create the follower index and follow the leader
  • if that succeeds, update the auto-follow metadata, then do some accounting
  • if that fails, do some accounting and track the failure

it would be nice if the code read a little closer to that.

if (followError != null) {
LOGGER.error("Failed to auto follow leader index [" + leaderIndexName + "]", followError);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be warn level.

if (numRequests.incrementAndGet() == leaderIndicesToFollow.size()) {
finalise(followError);
}
return;
}
Function<ClusterState, ClusterState> clusterStateUpdateFunction = currentState -> {
AutoFollowMetadata currentAutoFollowMetadata = currentState.metaData().custom(AutoFollowMetadata.TYPE);

Map<String, List<String>> newFollowedIndexUUIDS =
new HashMap<>(currentAutoFollowMetadata.getFollowedLeaderIndexUUIDS());
newFollowedIndexUUIDS.get(clusterAlias).add(indexToFollow.getIndexUUID());

ClusterState.Builder newState = ClusterState.builder(currentState);
AutoFollowMetadata newAutoFollowMetadata =
new AutoFollowMetadata(currentAutoFollowMetadata.getPatterns(), newFollowedIndexUUIDS);
newState.metaData(MetaData.builder(currentState.getMetaData())
.putCustom(AutoFollowMetadata.TYPE, newAutoFollowMetadata)
.build());
return newState.build();
};
updateAutoMetadata(clusterStateUpdateFunction, updateError -> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe updateAutoFollowMetadata?

if (updateError != null) {
LOGGER.error("Failed to mark leader index [" + leaderIndexName + "] as auto followed", updateError);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I am concerned about the error handling here. What happens if we have created the follower and actively started replication, but failed to add the index to the metadata? I think that it means we will retry on the next auto follower execution, which will then fail because the follower index already exists. And we will keep retrying. I think we need some protection here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that it means we will retry on the next auto follower execution, which will then fail because the follower index already exists. And we will keep retrying.

Agreed. that would be a bad situation. I suspect such a thing is possible when the current master node fails.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only think that I can think of that we can do here without making this code more complicated if to have another background check that verifies that there is an entry for all leader indices being followed. We could check persistent tasks to figure this out or use the leader index metadata that we're going to add as custom to IndexMetaData. What do you think about this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I am thinking something along those lines too. Like: if a leader index matches a pattern, and we are already following it, then we can skip create and follow and only need to get that index into the metadata. Your approach of using the follower index metadata about the leader index it is following sounds like a good approach to me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it then makes sense to do this in a follow up PR? (because the follower index metadata needs to be added first)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think so. I think we can put some skeleton code into place now, with a bunch of TODOs. Would you add this to the meta-issue?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added an entry to #33007

} else {
LOGGER.debug("Successfully marked leader index [{}] as auto followed", leaderIndexName);
}
if (numRequests.incrementAndGet() == leaderIndicesToFollow.size()) {
finalise(updateError);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I think that something like this means that we are losing errors. So we have:

  • a pattern metricbeat-*
  • let's say that matches two indices in the leader cluster metricbeat-2018-08-27 and metricbeat-2018-08-28
  • we have an outer CountDown AutoFollower#countDown initialized to one for the single pattern
  • we have an inner CountDown countDown in handleClusterAlias initialized to two for the two indices
  • the first cluster state update task for the first index fails
  • we count down handleClusterAlias#countDown and it decrements to one; we do nothing
  • the second cluster state update task for the second index succeeds, or fails; either way, we would count down handleClusterAlias#countDown again and the exception from the first failure would be lost

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I'll keep track of the errorHolder and suppress subsequent errors to the first error if multiple exceptions occur.

}
});
};
LOGGER.info("Auto following leader index [{}] as follow index [{}]", leaderIndexName, followIndexName);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the placement of this log message is misleading since we can still fail after this.

createAndFollowApiCall(followRequest, handler);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that createAndFollow would be a fine name.

}
}
}

private void finalise(Exception failure) {
if (errorHolder.compareAndSet(null, failure) == false) {
errorHolder.get().addSuppressed(failure);
}

if (executedRequests.incrementAndGet() == autoFollowMetadata.getPatterns().size()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use a CountDown since it gives us safety about never over counting down?

handler.accept(errorHolder.get());
}
}

// abstract methods to make unit testing possible:

abstract void clusterStateApiCall(Client remoteClient, BiConsumer<ClusterState, Exception> handler);

abstract void createAndFollowApiCall(FollowIndexAction.Request followRequest, Consumer<Exception> handler);

abstract void updateAutoMetadata(Function<ClusterState, ClusterState> updateFunction, Consumer<Exception> handler);

}
}
Loading