Adapt the Recovery API for closed indices #38421

Merged · 5 commits · Feb 25, 2019
@@ -75,3 +75,60 @@
\n
)+
$/

---
"Test cat recovery output for closed index":
- skip:
version: " - 7.99.99"
reason: closed indices are replicated starting version 8.0.0

- do:
indices.create:
index: index2
body:
settings:
index:
number_of_replicas: 0

- do:
indices.close:
index: index2
- is_true: acknowledged

- do:
cluster.health:
index: index2
wait_for_status: green

- do:
cat.recovery:
index: index2
h: i,s,t,ty,st,shost,thost,rep,snap,f,fr,fp,tf,b,br,bp,tb,to,tor,top

- match:
$body: |
/^
(
index2 \s+
\d \s+ # shard
(?:\d+ms|\d+(?:\.\d+)?s) \s+ # time in ms or seconds
existing_store \s+ # source type (always existing_store for closed indices)
done \s+ # stage
[-\w./]+ \s+ # source_host
[-\w./]+ \s+ # target_host
[-\w./]+ \s+ # repository
[-\w./]+ \s+ # snapshot
\d+ \s+ # files
\d+ \s+ # files_recovered
\d+\.\d+% \s+ # files_percent
\d+ \s+ # files_total
\d+ \s+ # bytes
\d+ \s+ # bytes_recovered
\d+\.\d+% \s+ # bytes_percent
\d+ \s+ # bytes_total
0 \s+ # translog_ops (always 0 for closed indices)
0 \s+ # translog_ops_recovered (always 0 for closed indices)
100\.0% # translog_ops_percent (always 100.0% for closed indices)
\n
)+
$/
@@ -40,6 +40,56 @@
- gte: { test_1.shards.0.verify_index.check_index_time_in_millis: 0 }
- gte: { test_1.shards.0.verify_index.total_time_in_millis: 0 }
---
"Indices recovery test for closed index":
- skip:
version: " - 7.99.99"
reason: closed indices are replicated starting version 8.0.0

- do:
indices.create:
index: test_2
body:
settings:
index:
number_of_replicas: 0

- do:
indices.close:
index: test_2
- is_true: acknowledged

- do:
cluster.health:
index: test_2
wait_for_status: green

- do:
indices.recovery:
index: [test_2]
human: true

- match: { test_2.shards.0.type: "EXISTING_STORE" }
Contributor:

As discussed on another channel, a user could find it odd that closing an index makes it go through recovery. Also, if cluster health is changed to also take closed indices into account by default, it could go red for a short period when closing an index. This is more of a discussion subject: is this external behaviour acceptable or not? Maybe @ywelsch has a view on this?

Member Author:

> As discussed on another channel, a user could find it odd that closing an index makes it go through recovery.

I can understand this point of view, but I also think that we could mitigate this with better documentation.

> Also, if cluster health is changed to also take closed indices into account by default, it could go red for a short period when closing an index. This is more of a discussion subject: is this external behaviour acceptable or not?

I forgot to mention during our discussion that we talked about this in a team meeting a few months ago and agreed that this situation is acceptable. The behavior is similar to what happens today when an index is reopened: the cluster health turns RED until the primary shards are assigned.

Contributor:

Note that if the red cluster health is too big of a concern, we can change that as well by making it only yellow during this transition (this can be done by adapting ClusterShardHealth#getInactivePrimaryHealth).

- match: { test_2.shards.0.stage: "DONE" }
- match: { test_2.shards.0.primary: true }
- match: { test_2.shards.0.start_time: /^2\d\d\d-.+/ }
- match: { test_2.shards.0.target.ip: /^\d+\.\d+\.\d+\.\d+$/ }
- gte: { test_2.shards.0.index.files.total: 0 }
- gte: { test_2.shards.0.index.files.reused: 0 }
- gte: { test_2.shards.0.index.files.recovered: 0 }
- match: { test_2.shards.0.index.files.percent: /^\d+\.\d\%$/ }
- gte: { test_2.shards.0.index.size.total_in_bytes: 0 }
- gte: { test_2.shards.0.index.size.reused_in_bytes: 0 }
- gte: { test_2.shards.0.index.size.recovered_in_bytes: 0 }
- match: { test_2.shards.0.index.size.percent: /^\d+\.\d\%$/ }
- gte: { test_2.shards.0.index.source_throttle_time_in_millis: 0 }
- gte: { test_2.shards.0.index.target_throttle_time_in_millis: 0 }
- gte: { test_2.shards.0.translog.recovered: 0 }
- gte: { test_2.shards.0.translog.total: 0 }
- gte: { test_2.shards.0.translog.total_on_start: 0 }
- gte: { test_2.shards.0.translog.total_time_in_millis: 0 }
- gte: { test_2.shards.0.verify_index.check_index_time_in_millis: 0 }
- gte: { test_2.shards.0.verify_index.total_time_in_millis: 0 }
---
"Indices recovery test index name not matching":

- do:
@@ -19,6 +19,7 @@

 package org.elasticsearch.action.admin.indices.recovery;
 
+import org.elasticsearch.action.support.IndicesOptions;
 import org.elasticsearch.action.support.broadcast.BroadcastRequest;
 import org.elasticsearch.common.Strings;
 import org.elasticsearch.common.io.stream.StreamInput;
@@ -47,7 +48,7 @@ public RecoveryRequest() {
      * @param indices Comma-separated list of indices about which to gather recovery information
      */
     public RecoveryRequest(String... indices) {
-        super(indices);
+        super(indices, IndicesOptions.STRICT_EXPAND_OPEN_CLOSED);
Member Author:

Note: the Recovery API reports information for shards that effectively recovered from something. In a cluster with mixed versions, indices are closed but not replicated: they have no index routing table and their shards have no recovery state, so they are simply not reported by the Recovery API.

Contributor:

I think this change means that in a mixed cluster, the result of the recovery API for a closed index will differ depending on which node you hit (a v7 or a v8). A v7 node will respond with a failure and a v8 node will respond with nothing. Is that correct?

This seems like a very small issue though?

Member Author:

This is a good remark and I also thought about it (hence my previous comment). Your understanding is right: in a mixed cluster, if an 8.0 node processes the recovery request it will resolve both closed and open indices, while a 7.x node will resolve open indices only.

But then the list of resolved indices is compared to the current cluster state routing table (see the TransportRecoveryAction#shards() method) and only indices that have a routing table are included in the scope of the computation. In a mixed cluster, closed indices are not replicated and their routing table is removed from the cluster state at closing time (see #38955), so the situation where an 8.0 node asks a 7.0 node for information about a shard of a closed index assigned to it should never happen (also because an old node cannot rejoin an upgraded cluster).

Contributor:

On 7.x, we use STRICT_EXPAND_OPEN_FORBID_CLOSED. I believe this means that if you specify anything with a wildcard, closed indices will be ignored. This will then be identical in a mixed cluster (all good).

But if you specify a specific index with no wildcards, the semantics of STRICT_EXPAND_OPEN_FORBID_CLOSED are to throw an exception if the index is closed.

On 8.0, this should obviously be changed like you did, i.e. to allow both open and closed indices.

But in a mixed cluster, a request specifying a specific closed index can hit either a 7.x node or an 8.x node (using REST or transport). If it hits a 7.x node, the request will fail; if it hits an 8.0 node, it will succeed with an empty response. This could seem a bit random to someone behind a load balancer.
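To make the resolution semantics under discussion concrete, here is a minimal sketch of the new default (hypothetical standalone usage, assuming an 8.0-era Elasticsearch classpath; the class name and main method are illustrative only, not part of this change):

import org.elasticsearch.action.admin.indices.recovery.RecoveryRequest;

public class RecoveryRequestOptionsSketch {
    public static void main(String[] args) {
        // After this change, a RecoveryRequest defaults to
        // IndicesOptions.STRICT_EXPAND_OPEN_CLOSED, so wildcard expressions
        // expand to closed indices as well as open ones.
        final RecoveryRequest request = new RecoveryRequest("index2");
        System.out.println(request.indicesOptions().expandWildcardsOpen());   // true
        System.out.println(request.indicesOptions().expandWildcardsClosed()); // true
        // "Strict" means that naming a missing index is still an error.
        System.out.println(request.indicesOptions().ignoreUnavailable());     // false
    }
}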

}

/**
@@ -112,11 +112,11 @@ protected ShardsIterator shards(ClusterState state, RecoveryRequest request, Str
 
     @Override
     protected ClusterBlockException checkGlobalBlock(ClusterState state, RecoveryRequest request) {
-        return state.blocks().globalBlockedException(ClusterBlockLevel.READ);
+        return state.blocks().globalBlockedException(ClusterBlockLevel.METADATA_READ);
     }
 
     @Override
     protected ClusterBlockException checkRequestBlock(ClusterState state, RecoveryRequest request, String[] concreteIndices) {
-        return state.blocks().indicesBlockedException(ClusterBlockLevel.READ, concreteIndices);
+        return state.blocks().indicesBlockedException(ClusterBlockLevel.METADATA_READ, concreteIndices);
     }
 }
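The switch from READ to METADATA_READ matters because a closed index carries an index-closed block that covers the READ and WRITE levels but not METADATA_READ. A minimal sketch of that distinction, assuming the 7.x-era MetaDataIndexStateService.INDEX_CLOSED_BLOCK constant (illustrative only, not part of this change):

import org.elasticsearch.cluster.block.ClusterBlockLevel;
import org.elasticsearch.cluster.metadata.MetaDataIndexStateService;

public class ClosedIndexBlockSketch {
    public static void main(String[] args) {
        // The block added when an index is closed blocks READ (and WRITE)...
        System.out.println(MetaDataIndexStateService.INDEX_CLOSED_BLOCK
            .contains(ClusterBlockLevel.READ));          // true
        // ...but not METADATA_READ, which is why checking blocks at the
        // METADATA_READ level lets the Recovery API answer for closed indices.
        System.out.println(MetaDataIndexStateService.INDEX_CLOSED_BLOCK
            .contains(ClusterBlockLevel.METADATA_READ)); // false
    }
}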
@@ -40,6 +40,11 @@ protected BroadcastRequest(String[] indices) {
         this.indices = indices;
     }
 
+    protected BroadcastRequest(String[] indices, IndicesOptions indicesOptions) {
+        this.indices = indices;
+        this.indicesOptions = indicesOptions;
+    }
+
     @Override
     public String[] indices() {
         return indices;
@@ -54,6 +54,7 @@
 import org.elasticsearch.plugins.Plugin;
 import org.elasticsearch.snapshots.Snapshot;
 import org.elasticsearch.snapshots.SnapshotState;
+import org.elasticsearch.test.BackgroundIndexer;
 import org.elasticsearch.test.ESIntegTestCase;
 import org.elasticsearch.test.ESIntegTestCase.ClusterScope;
 import org.elasticsearch.test.ESIntegTestCase.Scope;
@@ -209,24 +210,34 @@ public void testGatewayRecoveryTestActiveOnly() throws Exception {
     }
 
     public void testReplicaRecovery() throws Exception {
-        logger.info("--> start node A");
-        String nodeA = internalCluster().startNode();
+        final String nodeA = internalCluster().startNode();
+        createIndex(INDEX_NAME, Settings.builder()
+            .put(IndexMetaData.SETTING_NUMBER_OF_SHARDS, SHARD_COUNT)
+            .put(IndexMetaData.SETTING_NUMBER_OF_REPLICAS, REPLICA_COUNT)
+            .build());
+        ensureGreen(INDEX_NAME);
+
+        final int numOfDocs = scaledRandomIntBetween(0, 200);
+        try (BackgroundIndexer indexer = new BackgroundIndexer(INDEX_NAME, "_doc", client(), numOfDocs)) {
+            waitForDocs(numOfDocs, indexer);
+        }
 
-        logger.info("--> create index on node: {}", nodeA);
-        createAndPopulateIndex(INDEX_NAME, 1, SHARD_COUNT, REPLICA_COUNT);
+        refresh(INDEX_NAME);
+        assertHitCount(client().prepareSearch(INDEX_NAME).setSize(0).get(), numOfDocs);
 
-        logger.info("--> start node B");
-        String nodeB = internalCluster().startNode();
-        ensureGreen();
+        final boolean closedIndex = randomBoolean();
+        if (closedIndex) {
+            assertAcked(client().admin().indices().prepareClose(INDEX_NAME));
+            ensureGreen(INDEX_NAME);
+        }
 
         // force a shard recovery from nodeA to nodeB
-        logger.info("--> bump replica count");
-        client().admin().indices().prepareUpdateSettings(INDEX_NAME)
-            .setSettings(Settings.builder().put("number_of_replicas", 1)).execute().actionGet();
-        ensureGreen();
+        final String nodeB = internalCluster().startNode();
+        assertAcked(client().admin().indices().prepareUpdateSettings(INDEX_NAME)
+            .setSettings(Settings.builder().put(IndexMetaData.SETTING_NUMBER_OF_REPLICAS, 1)));
+        ensureGreen(INDEX_NAME);
 
         logger.info("--> request recoveries");
-        RecoveryResponse response = client().admin().indices().prepareRecoveries(INDEX_NAME).execute().actionGet();
+        final RecoveryResponse response = client().admin().indices().prepareRecoveries(INDEX_NAME).execute().actionGet();
 
         // we should now have two total shards, one primary and one replica
         List<RecoveryState> recoveryStates = response.shardRecoveryStates().get(INDEX_NAME);
@@ -238,14 +249,27 @@ public void testReplicaRecovery() throws Exception {
         assertThat(nodeBResponses.size(), equalTo(1));
 
         // validate node A recovery
-        RecoveryState nodeARecoveryState = nodeAResponses.get(0);
-        assertRecoveryState(nodeARecoveryState, 0, RecoverySource.EmptyStoreRecoverySource.INSTANCE, true, Stage.DONE, null, nodeA);
+        final RecoveryState nodeARecoveryState = nodeAResponses.get(0);
+        final RecoverySource expectedRecoverySource;
+        if (closedIndex == false) {
+            expectedRecoverySource = RecoverySource.EmptyStoreRecoverySource.INSTANCE;
+        } else {
+            expectedRecoverySource = RecoverySource.ExistingStoreRecoverySource.INSTANCE;
+        }
+        assertRecoveryState(nodeARecoveryState, 0, expectedRecoverySource, true, Stage.DONE, null, nodeA);
         validateIndexRecoveryState(nodeARecoveryState.getIndex());
 
         // validate node B recovery
-        RecoveryState nodeBRecoveryState = nodeBResponses.get(0);
+        final RecoveryState nodeBRecoveryState = nodeBResponses.get(0);
         assertRecoveryState(nodeBRecoveryState, 0, PeerRecoverySource.INSTANCE, false, Stage.DONE, nodeA, nodeB);
         validateIndexRecoveryState(nodeBRecoveryState.getIndex());
+
+        internalCluster().stopRandomNode(InternalTestCluster.nameFilter(nodeA));
+
+        if (closedIndex) {
+            assertAcked(client().admin().indices().prepareOpen(INDEX_NAME));
+        }
+        assertHitCount(client().prepareSearch(INDEX_NAME).setSize(0).get(), numOfDocs);
     }

@TestLogging(