Change certain replica failures not to fail the replica shard #22874

Merged

dakrone (Member) commented Jan 30, 2017

This changes the way that replica failures are handled such that not all
failures will cause the replica shard to be failed or marked as stale.

In some cases such as refresh operations, or global checkpoint syncs, it is
"okay" for the operation to fail without the shard being failed (because no data
is out of sync). In these cases, instead of failing the shard we should simply
fail the operation, and, in the event it is a user-facing operation, return a
5xx response code including the shard-specific failures.

This was accomplished by having two forms of the Replicas proxy, one that is
for non-write operations that does not fail the shard, and one that is for write
operations that will fail the shard when an operation fails.

Relates to #10708
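
The split described above can be sketched roughly as follows. This is a simplified illustration, not the code from this PR: the `Replicas` interface here is a cut-down, hypothetical stand-in for `ReplicationOperation.Replicas`, shards are plain strings, and the write proxy only records the shard instead of going through the shard state action.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class ReplicasProxySketch {

    /** Simplified, hypothetical stand-in for ReplicationOperation.Replicas. */
    interface Replicas {
        void failShardIfNeeded(String shard, String message, Exception cause,
                               Runnable onSuccess, Consumer<Exception> onIgnoredFailure);
    }

    /** Non-write operations (refresh, checkpoint sync): leave the shard alone. */
    static class ReplicasProxy implements Replicas {
        @Override
        public void failShardIfNeeded(String shard, String message, Exception cause,
                                      Runnable onSuccess, Consumer<Exception> onIgnoredFailure) {
            // No data is out of sync, so only the operation fails; complete the callback.
            onSuccess.run();
        }
    }

    /** Write operations: a failed replica may be out of sync, so fail the shard. */
    static class WriteActionReplicasProxy extends ReplicasProxy {
        // Assumption for this sketch: we only record the shard that would be failed.
        final List<String> failedShards = new ArrayList<>();

        @Override
        public void failShardIfNeeded(String shard, String message, Exception cause,
                                      Runnable onSuccess, Consumer<Exception> onIgnoredFailure) {
            failedShards.add(shard);
            onSuccess.run();
        }
    }

    public static void main(String[] args) {
        Exception cause = new RuntimeException("replica rejected the operation");
        WriteActionReplicasProxy writes = new WriteActionReplicasProxy();
        new ReplicasProxy().failShardIfNeeded("[idx][0]", "refresh failed", cause, () -> {}, e -> {});
        writes.failShardIfNeeded("[idx][0]", "index failed", cause, () -> {}, e -> {});
        System.out.println("shards failed by write proxy: " + writes.failedShards);
    }
}
```

In this sketch the caller does not need to know which kind of operation it is running; the choice of proxy decides whether a replica failure escalates to a shard failure.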

bleskes (Contributor) left a comment

Thx @dakrone. I left a bunch of initial comments. Maybe I'm missing something obvious, but where is the logic to respond with an error if one of the shard copies fails (except in TransportWriteAction)?

ReplicationOperation.this::onPrimaryDemoted,
throwable -> decPendingAndFinishIfNeeded()
);
final boolean shardWasFailed = replicasProxy.failShardIfNeeded(shard, replicaRequest.primaryTerm(), message,
Contributor

I'd rather not introduce more code paths. I would vote to always have the callback pattern. If people have nothing to do they can call onSuccess on the same thread.
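
A rough sketch of what a callback-only contract could look like from the caller's side. All names here (`FailureHandler`, `PendingOps`, `onReplicaFailure`) are hypothetical and only illustrate the idea; the point is that the caller always hands over its completion callbacks and never branches on a return value, and an implementation with nothing to do simply runs `onSuccess` on the calling thread.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

public class CallbackSketch {

    /** Hypothetical proxy contract: implementations must invoke exactly one callback. */
    interface FailureHandler {
        void failShardIfNeeded(String shard, Exception cause,
                               Runnable onSuccess, Consumer<Exception> onFailure);
    }

    /** Tracks outstanding replica operations and finishes when none remain. */
    static final class PendingOps {
        private final AtomicInteger pending;
        volatile boolean finished = false;

        PendingOps(int count) {
            this.pending = new AtomicInteger(count);
        }

        void decPendingAndFinishIfNeeded() {
            if (pending.decrementAndGet() == 0) {
                finished = true;
            }
        }
    }

    /** Single code path: the caller never checks whether the shard was actually failed. */
    static void onReplicaFailure(FailureHandler handler, PendingOps ops, String shard, Exception cause) {
        handler.failShardIfNeeded(shard, cause,
                ops::decPendingAndFinishIfNeeded,
                e -> ops.decPendingAndFinishIfNeeded());
    }

    public static void main(String[] args) {
        // A handler with nothing to do calls onSuccess on the same thread.
        FailureHandler nothingToDo = (shard, cause, onSuccess, onFailure) -> onSuccess.run();
        PendingOps ops = new PendingOps(2);
        onReplicaFailure(nothingToDo, ops, "[idx][0]", new RuntimeException("boom"));
        onReplicaFailure(nothingToDo, ops, "[idx][1]", new RuntimeException("boom"));
        System.out.println("finished: " + ops.finished);
    }
}
```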

Member Author

Okay, I pushed 3589d9c39d7030b9d7817764626520cb976ba9b2 that switches this to use the callback

@@ -209,14 +209,20 @@ public void onFailure(Exception replicaException) {
shardReplicaFailures.add(new ReplicationResponse.ShardInfo.Failure(
shard.shardId(), shard.currentNodeId(), replicaException, restStatus, false));
String message = String.format(Locale.ROOT, "failed to perform %s on replica %s", opType, shard);
// TODO: Boaz wanted to get rid of this warning, should we?
logger.warn(
Contributor

We plan to use this for background operations like the global checkpoint sync, where it's not a big deal if this happens. I don't want the logs to have warnings in them. Instead, implementations of failShardIfNeeded (i.e., TransportWriteAction) can log a warning if they are going to fail the shard.

Member Author

Okay, I pushed ccda991ee569490ce2ced3f5bff7763e867e20ac

* interface that performs the actual {@code ReplicaRequest} on the replica
* shards. It also encapsulates the logic required for failing the replica
* if deemed necessary as well as marking it as stale when needed.
*/
final class ReplicasProxy implements ReplicationOperation.Replicas<ReplicaRequest> {
Contributor

Why make this final? WriteReplicasProxy can share a lot of its logic with this one.

Member Author

I originally had them share logic through subclassing; however, ReplicasProxy uses parts of TransportReplicationAction, which kept WriteActionReplicasProxy from subclassing it cleanly.

Contributor

This seems to work fine for me?

diff --git a/core/src/main/java/org/elasticsearch/action/support/replication/TransportReplicationAction.java b/core/src/main/java/org/elasticsearch/action/support/replication/TransportReplicationAction.java
index db037ad..2cd1d8e 100644
--- a/core/src/main/java/org/elasticsearch/action/support/replication/TransportReplicationAction.java
+++ b/core/src/main/java/org/elasticsearch/action/support/replication/TransportReplicationAction.java
@@ -1040,7 +1040,7 @@ public abstract class TransportReplicationAction<
      * shards. It also encapsulates the logic required for failing the replica
      * if deemed necessary as well as marking it as stale when needed.
      */
-    final class ReplicasProxy implements ReplicationOperation.Replicas<ReplicaRequest> {
+    class ReplicasProxy implements ReplicationOperation.Replicas<ReplicaRequest> {
 
         @Override
         public void performOn(ShardRouting replica, ReplicaRequest request, ActionListener<ReplicationOperation.ReplicaResponse> listener) {
diff --git a/core/src/main/java/org/elasticsearch/action/support/replication/TransportWriteAction.java b/core/src/main/java/org/elasticsearch/action/support/replication/TransportWriteAction.java
index ac79a8c..8a168ca 100644
--- a/core/src/main/java/org/elasticsearch/action/support/replication/TransportWriteAction.java
+++ b/core/src/main/java/org/elasticsearch/action/support/replication/TransportWriteAction.java
@@ -74,7 +74,7 @@ public abstract class TransportWriteAction<
 
     @Override
     protected ReplicationOperation.Replicas newReplicasProxy() {
-        return new WriteActionReplicasProxy(shardStateAction);
+        return new WriteActionReplicasProxy();
     }
 
     /**
@@ -335,26 +335,7 @@ public abstract class TransportWriteAction<
      * replicas, where a failure to execute the operation should fail
      * the replica shard and/or mark the replica as stale.
      */
-    final class WriteActionReplicasProxy implements ReplicationOperation.Replicas<ReplicaRequest> {
-
-        private final ShardStateAction shardStateAction;
-
-        WriteActionReplicasProxy(ShardStateAction shardStateAction) {
-            this.shardStateAction = shardStateAction;
-        }
-
-        @Override
-        public void performOn(ShardRouting replica, ReplicaRequest request, ActionListener<ReplicationOperation.ReplicaResponse> listener) {
-            String nodeId = replica.currentNodeId();
-            final DiscoveryNode node = clusterService.state().nodes().get(nodeId);
-            if (node == null) {
-                listener.onFailure(new NoNodeAvailableException("unknown node [" + nodeId + "]"));
-                return;
-            }
-            final ConcreteShardRequest<ReplicaRequest> concreteShardRequest =
-                    new ConcreteShardRequest<>(request, replica.allocationId().getId());
-            sendReplicaRequest(concreteShardRequest, node, listener);
-        }
+    class WriteActionReplicasProxy extends ReplicasProxy {
 
         @Override
         public void failShardIfNeeded(ShardRouting replica, long primaryTerm, String message, Exception exception,

Member Author

@bleskes that's actually the scary thing: it "works" for IntelliJ's compiler (it reports no compilation errors), but if you actually run gradle compileJava, it fails:

/home/hinmanm/es/elasticsearch/core/src/main/java/org/elasticsearch/action/support/replication/TransportWriteAction.java:347: error: name clash: failShardIfNeeded(ShardRouting,long,String,Exception,Runnable,Consumer<Exception>,Consumer<Exception>) in TransportWriteAction.WriteActionReplicasProxy and failShardIfNeeded(ShardRouting,long,String,Exception,Runnable,Consumer<Exception>,Consumer<Exception>) in TransportReplicationAction.ReplicasProxy have the same erasure, yet neither overrides the other
        public void failShardIfNeeded(ShardRouting replica, long primaryTerm, String message, Exception exception,
                    ^
/home/hinmanm/es/elasticsearch/core/src/main/java/org/elasticsearch/action/support/replication/TransportWriteAction.java:346: error: method does not override or implement a method from a supertype
        @Override
        ^

It fails because of type erasure (I think due to the crazy generics that are in use). I tried a lot of different ways to get around it, but I wasn't able to find one.

private final ClusterService clusterService;
private final TransportReplicationAction replicationAction;

ReplicasProxy(ClusterService clusterService, TransportReplicationAction replicationAction) {
Contributor

this is an inner class anyway, why pass these along? If you want to make it a static inner class, I'm good with it, although I'm not sure if it's worth the extra clutter.

Member Author

I originally tried to factor this out into a non-inner class, but it turned out that the generics prevented it, so I will remove these.

Member Author

I pushed b5e91badc401a5aab1f20209ce0096881b07992f

*/
void failShard(ShardRouting replica, long primaryTerm, String message, Exception exception, Runnable onSuccess,
Consumer<Exception> onPrimaryDemoted, Consumer<Exception> onIgnoredFailure);
void markShardCopyAsStaleIfNeeded(ShardId shardId, String allocationId, long primaryTerm, Runnable onSuccess,
Contributor

why do we need two of these? can't we have just markShardCopyAsStaleIfNeeded? we can rename markUnavailableShardsAsStale to markUnavailableShardsAsStaleIfNeeded

Member Author

We need two because there are situations when a replica shard needs to be marked as stale without a write action. I originally had it that way, but the tests that restart ES showed that we still need to be able to mark a copy as stale without any action.

Consumer<Exception> onPrimaryDemoted, Consumer<Exception> onIgnoredFailure) {
// Just like with failing the shard, it should only be marked as
// stale when write operations fail, so for regular operations
// don't mark it as stale when not necessary.
}

@Override
public void markShardCopyAsStale(ShardId shardId, String allocationId, long primaryTerm, Runnable onSuccess,
Consumer<Exception> onPrimaryDemoted, Consumer<Exception> onIgnoredFailure) {
shardStateAction.remoteShardFailed(shardId, allocationId, primaryTerm, "mark copy as stale", null,
Contributor

This logic belongs in TransportWriteAction.

Member Author

As I mentioned in my previous comment, we still need a way to mark the replica as stale when the shard is unavailable.

What we don't need is the "if needed" version, because we only do this when the shard is not available, not when an operation fails.

I've removed the "if needed" version of this in 1f58744362b756bc9ef6f6bfb412c2836890d8f7

Contributor

We need two because there are situations when a replica shard needs to be marked as stale without a write action

Which situations do you mean?

-public class GlobalCheckpointSyncAction extends TransportReplicationAction<GlobalCheckpointSyncAction.PrimaryRequest,
-        GlobalCheckpointSyncAction.ReplicaRequest, ReplicationResponse> {
+public class GlobalCheckpointSyncAction extends TransportReplicationAction<GlobalCheckpointSyncAction.GlobalCheckpointPrimaryRequest,
+        GlobalCheckpointSyncAction.GlobalCheckpointReplicaRequest, ReplicationResponse> {
Contributor

Why rename this? It's already scoped under GlobalCheckpointSyncAction, and it gets so long...

Member Author

I pushed 68d0ddd456d5e42db40c06261af8e8a19c201533 for this.

dakrone (Member Author) commented Feb 1, 2017

Maybe I'm missing something obvious, but where is the logic to respond with an error if one of the shard copies fails (except in TransportWriteAction)?

That logic is now in BroadcastShardResponse (getStatus()), which is used on the REST side. The error reporting was already there; this just changes the response to a 500 instead of a 200 when there is an error.

dakrone (Member Author) commented Feb 2, 2017

Thanks @bleskes, I think I pushed commits and added comments for all of your feedback so far.

bleskes (Contributor) left a comment

Thx @dakrone . I responded.

@@ -71,6 +74,17 @@ public int getFailedShards() {
}

/**
* The REST status that should be used for the response
*/
public RestStatus getStatus() {
Contributor

Since this affects all responses that inherit from BroadcastResponse, can you check that all of them do the right thing and filter out shard-not-available failures? It might be good to add an assertion in the constructors that looks at all the exceptions this gets.

Member Author

Certainly, I added an assert in f79c0f724819ff428c7c2c1d9bc2701521d96a81

*/
public RestStatus getStatus() {
if (failedShards > 0) {
return RestStatus.INTERNAL_SERVER_ERROR;
Contributor

Shall we make a heroic effort to take the worst of all the underlying REST statuses? Or maybe just the first?

Member Author

Shall we make a heroic effort to take the worst of all the underlying REST statuses? Or maybe just the first?

I don't know, how do you define "worst"? Is a 403 worse than a 500 or is a 500 worse? We can totally go with first though, if that's desired.

Contributor

I think 5xx (server error) is worse than 4xx, but let's just go with the first.
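
A minimal sketch of the "just go with the first" rule agreed on here. The `Status` enum and `ShardFailure` class are hypothetical stand-ins (the real code works with `RestStatus` and the shard failures carried by the response), so treat this as an illustration of the rule rather than the merged implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BroadcastStatusSketch {

    /** Hypothetical stand-in for RestStatus; only the values used in this sketch. */
    enum Status { OK, FORBIDDEN, INTERNAL_SERVER_ERROR }

    /** Hypothetical stand-in for a shard failure that carries its own status. */
    static final class ShardFailure {
        final Status status;

        ShardFailure(Status status) {
            this.status = status;
        }
    }

    /** Returns OK when nothing failed, otherwise the status of the first failure. */
    static Status getStatus(List<ShardFailure> failures) {
        if (failures.isEmpty()) {
            return Status.OK;
        }
        // No attempt to rank statuses by severity: the first failure wins.
        return failures.get(0).status;
    }

    public static void main(String[] args) {
        List<ShardFailure> none = new ArrayList<>();
        List<ShardFailure> mixed = Arrays.asList(
                new ShardFailure(Status.FORBIDDEN),
                new ShardFailure(Status.INTERNAL_SERVER_ERROR));
        System.out.println(getStatus(none));
        System.out.println(getStatus(mixed));
    }
}
```

Taking the first failure avoids having to define an ordering between, say, a 403 and a 500, at the cost of the reported status depending on the order in which failures were collected.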

Request extends ReplicationRequest<Request>,
ReplicaRequest extends ReplicationRequest<ReplicaRequest>,
PrimaryResultT extends PrimaryResult<ReplicaRequest>
R extends ReplicationRequest<R>,
Contributor

nit: Why do we call this R and use a T suffix on the rest?

Member Author

I have seen two standard "formats", one is the single char format, which the JDK uses a lot like Collection<E>, so that's why the R. For the others, PrimaryResult as PR I could do, but I figured since it wasn't a single char I'd suffix it with T for type (another format I've seen in codebases).

I'm happy to change it to R, and PR if you think that would be clearer

Contributor

I would go with a T suffix - i.e., RequestT

return new ReplicationOperation<>(request, primaryShardReference, listener,
executeOnReplicas, replicasProxy, clusterService::state, logger, actionName
);
return new ReplicationOperation<Request, ReplicaRequest, PrimaryResult<ReplicaRequest, Response>>(request,
Contributor

out of curiosity - why does <> not work?

Member Author

This was added unnecessarily, I've removed it in 5eafb1866e85b994fafd178d72c873bba9838e71

@@ -124,7 +138,11 @@ public synchronized void respond(ActionListener<Response> listener) {
protected void respondIfPossible(Exception ex) {
if (finishedAsyncActions && listener != null) {
if (ex == null) {
super.respond(listener);
if (finalResponseIfSuccessful != null) {
Contributor

why this change?

Member Author

This was a leftover from me trying something out, I've removed it in 56cc63f612e5281555d5da1202dc9a44748eceea

@@ -129,7 +129,7 @@ public String toString() {
private ReplicaRequest() {
}

-public ReplicaRequest(PrimaryRequest primaryRequest, long checkpoint) {
+public ReplicaRequest(ReplicationRequest primaryRequest, long checkpoint) {
Contributor

Why this change?

Member Author

I manually reverted the changes to GlobalCheckpointSyncAction and I missed one :)

I undid it in f7b29d78395c2e13c715f6c167d6e8cbffe28e6e

@@ -98,7 +98,7 @@ public RestResponse buildResponse(FieldStatsResponse response, XContentBuilder b
builder.endObject();
}
builder.endObject();
-return new BytesRestResponse(RestStatus.OK, builder);
+return new BytesRestResponse(response.getStatus(), builder);
Contributor

these are not related changes - can we maybe move them to a separate PR?

Member Author

Sure, though I think it is more consistent for these to behave the same way. I'll open a separate PR after this one that changes them to return 500 if a shard fails, and I'll remove all of these from this PR except for the refresh action.

Member Author

I pushed d16441b19d606d87166d82345c1197f7fa6d8386 for this

dakrone (Member Author) commented Feb 3, 2017

@bleskes I think I addressed all the feedback for this round, thanks again for taking a look!

@dakrone dakrone force-pushed the replica-failures-dont-always-fail-shard branch from d16441b to d7b277d Compare February 3, 2017 17:11
bleskes (Contributor) left a comment

LGTM. Left some comments that you can adopt or not. No need for another review. Thx @dakrone

private final TransportRequestOptions transportOptions;
private final String executor;

// package private for testing
private final String transportReplicaAction;
private final String transportPrimaryAction;
-private final ReplicasProxy replicasProxy;
+private final ReplicationOperation.Replicas replicasProxy;

protected TransportReplicationAction(Settings settings, String actionName, TransportService transportService,
ClusterService clusterService, IndicesService indicesService,
Contributor

we can remove the shard state action here, no?

return indexService;
}

final IndicesService mockIndicesService(ClusterService clusterService) {
Contributor

can we share this and mockIndexShard with TransportReplicationAction? is there any point in that?

@dakrone dakrone force-pushed the replica-failures-dont-always-fail-shard branch 2 times, most recently from 24ddcb0 to 8661750 Compare February 3, 2017 21:03
dakrone (Member Author) commented Feb 3, 2017

Thank you very much for reviewing this @bleskes!

@dakrone dakrone force-pushed the replica-failures-dont-always-fail-shard branch from 8661750 to 7af97f5 Compare February 3, 2017 21:38
@dakrone dakrone force-pushed the replica-failures-dont-always-fail-shard branch from 7af97f5 to 39e7c30 Compare February 3, 2017 21:39
@dakrone dakrone merged commit 39e7c30 into elastic:master Feb 3, 2017
@dakrone dakrone deleted the replica-failures-dont-always-fail-shard branch December 13, 2017 20:37
@clintongormley clintongormley added :Distributed Indexing/Engine Anything around managing Lucene and the Translog in an open shard. and removed :Sequence IDs labels Feb 14, 2018
Labels
>breaking :Distributed Indexing/Engine Anything around managing Lucene and the Translog in an open shard. v6.0.0-alpha1