HDDS-10875. XceiverRatisServer#getRaftPeersInPipeline should be called before XceiverRatisServer#removeGroup #6696

Merged: 8 commits into apache:master on May 21, 2024

Conversation

@ivandika3 (Contributor) commented May 18, 2024

What changes were proposed in this pull request?

As noted in this comment on HDDS-10750 (https://issues.apache.org/jira/browse/HDDS-10750?focusedCommentId=17847435&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17847435), GroupMismatchExceptions are thrown during ClosePipelineCommandHandler.

This is because XceiverRatisServer#removeGroup is called before XceiverRatisServer#getRaftPeersInPipeline, so when getRaftPeersInPipeline later calls RaftServerProxy#getDivision, the group has already been removed and a GroupMismatchException is thrown.

Therefore, we need to call XceiverRatisServer#getRaftPeersInPipeline before calling XceiverRatisServer#removeGroup.

This patch also catches GroupMismatchException in case the group has already been removed by an earlier ClosePipelineCommandHandler on another datanode for the same pipeline. The datanode also tries to remove the Ratis group from the other datanodes (ignoring GroupMismatchException) before removing its own Ratis group, as sketched below.
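
For illustration, here is a minimal, self-contained sketch of the ordering issue; the class and method names (GroupRegistry, getPeers, remove) are hypothetical stand-ins, not the actual Ozone/Ratis API:

    import java.util.List;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Hypothetical stand-in for the ordering bug: once the group is removed,
    // the peer lookup fails, so peers must be fetched first.
    class GroupRegistry {
      private final Map<String, List<String>> groups = new ConcurrentHashMap<>();

      void add(String groupId, List<String> peers) {
        groups.put(groupId, peers);
      }

      List<String> getPeers(String groupId) {
        List<String> peers = groups.get(groupId);
        if (peers == null) {
          // Mirrors Ratis throwing GroupMismatchException after removeGroup.
          throw new IllegalStateException("group " + groupId + " not found");
        }
        return peers;
      }

      void remove(String groupId) {
        groups.remove(groupId);
      }

      public static void main(String[] args) {
        GroupRegistry registry = new GroupRegistry();
        registry.add("pipeline-1", List.of("dn-a", "dn-b", "dn-c"));

        // Correct order: read the peers while the group still exists...
        List<String> peers = registry.getPeers("pipeline-1");
        // ...then remove the local group and propagate to the peers.
        registry.remove("pipeline-1");
        System.out.println("Propagating removeGroup to: " + peers);

        // The reversed order (remove first, then getPeers) would throw,
        // which is the bug this PR fixes.
      }
    }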

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-10875

How was this patch tested?

Clean CI: https://github.com/ivandika3/ozone/actions/runs/9145754999

ivandika3 changed the title HDDS-10875. GroupMismatchException in ClosePipelineCommandHandler should be ignored HDDS-10875. XceiverRatisServer#getRaftPeersInPipeline should be called before XceiverRatisServer#removeGroup May 18, 2024
@szetszwo (Contributor) left a comment

@ivandika3 , thanks for working on this! How about catching GroupMismatchException in the existing try-catch?

+++ b/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/statemachine/commandhandler/ClosePipelineCommandHandler.java
@@ -105,7 +105,6 @@ public void handle(SCMCommand command, OzoneContainer ozoneContainer,
       try {
         XceiverServerSpi server = ozoneContainer.getWriteChannel();
         if (server.isExist(pipelineIdProto)) {
-          server.removeGroup(pipelineIdProto);
           if (server instanceof XceiverServerRatis) {
             // TODO: Refactor Ratis logic to XceiverServerRatis
             // Propagate the group remove to the other Raft peers in the pipeline
@@ -127,12 +126,18 @@ public void handle(SCMCommand command, OzoneContainer ozoneContainer,
                   }
                 });
           }
+          server.removeGroup(pipelineIdProto);
           LOG.info("Close Pipeline {} command on datanode {}.", pipelineID,
               dn.getUuidString());
         } else {
           LOG.debug("Ignoring close pipeline command for pipeline {} " +
               "as it does not exist", pipelineID);
         }
+      } catch (GroupMismatchException gme) {
+        // ignore silently since this means that the group has been closed by earlier close pipeline
+        // command in another datanode
+        LOG.debug("The Ratis group for the pipeline {} has been removed by earlier close pipeline command from " +
+            "other datanodes", pipelineID.getId());
       } catch (IOException e) {
         LOG.error("Can't close pipeline {}", pipelineID, e);
       } finally {

@ivandika3 (Contributor, Author):

Thank you for the review @szetszwo.

How about catching GroupMismatchException in the existing try-catch?

Sure, updated. I was using a nested try-catch since GroupMismatchException is a Ratis-specific exception.

@szetszwo (Contributor) left a comment

+1 the change looks good.

@ivandika3 (Contributor, Author) commented May 19, 2024

In XceiverServerRatis#removeGroup, it seems any Ratis exception thrown during the group remove (e.g. GroupMismatchException) is wrapped in a plain IOException; I'm not sure why:

    // XceiverServerRatis#removeGroup: the original cause (possibly a
    // GroupMismatchException) is re-wrapped in a plain IOException.
    RaftClientReply reply;
    try {
      reply = server.groupManagement(request);
    } catch (Exception e) {
      throw new IOException(e.getMessage(), e);
    }

Currently, I'm using HddsClientUtils.containsException to detect the GroupMismatchException inside the wrapped IOException. Alternatively, we could remove the IOException wrapping altogether.
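
For reference, a tiny sketch of the cause-chain walk such a check performs; the helper name and signature here are assumptions for illustration, not the real HddsClientUtils API:

    import java.io.IOException;

    // Hypothetical helper illustrating the cause-chain walk that a
    // containsException-style check performs; name and signature are
    // assumptions, not the real HddsClientUtils API.
    final class CauseChain {
      static boolean contains(Throwable t, Class<? extends Throwable> target) {
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
          if (target.isInstance(cur)) {
            return true;
          }
        }
        return false;
      }

      public static void main(String[] args) {
        // Simulate the wrapping done in XceiverServerRatis#removeGroup.
        RuntimeException cause = new RuntimeException("GroupMismatchException stand-in");
        IOException wrapped = new IOException(cause.getMessage(), cause);
        System.out.println(contains(wrapped, RuntimeException.class)); // prints: true
      }
    }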

@ivandika3 (Contributor, Author) commented May 19, 2024

Found a RaftRetryFailureException caused by org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception. Since the current Raft client policy comes from RequestTypeDependentRetryPolicyCreator, StatusRuntimeException is not in the exception map and falls through to the default MultipleLinearRandomRetry, which can block the datanode for a long time when it tries to remove the group from another datanode and the connection times out.

2024-05-19 04:08:53,822 [f3d35daa-4aaf-4c71-973e-2392b4bdb0cc-PipelineCommandHandlerThread-0] WARN  commandhandler.ClosePipelineCommandHandler (ClosePipelineCommandHandler.java:lambda$null$1(131)) - Failed to remove group group-72E5E9088A19 of pipeline PipelineID=8643905b-c6aa-4c00-96b1-72e5e9088a19 on peer de8c5e91-909e-46c9-a24e-1aa548fa4b98
org.apache.ratis.protocol.exceptions.RaftRetryFailureException: Failed GroupManagementRequest:client-363EA46DA04D->de8c5e91-909e-46c9-a24e-1aa548fa4b98@group-72E5E9088A19, cid=120, seq=null, RW, null, Remove:group-72E5E9088A19, delete-dir for 25 attempts with RequestTypeDependentRetryPolicy{WRITE->ExceptionDependentRetry(maxAttempts=2147483647; defaultPolicy=MultipleLinearRandomRetry[5x5s, 5x10s, 5x15s, 5x20s, 5x25s, 10x60s]; map={org.apache.ratis.protocol.exceptions.GroupMismatchException->NoRetry, org.apache.ratis.protocol.exceptions.NotReplicatedException->NoRetry, org.apache.ratis.protocol.exceptions.ResourceUnavailableException->org.apache.ratis.retry.ExponentialBackoffRetry@313565ca, org.apache.ratis.protocol.exceptions.StateMachineException->NoRetry, org.apache.ratis.protocol.exceptions.TimeoutIOException->org.apache.ratis.retry.ExponentialBackoffRetry@313565ca}), WATCH->ExceptionDependentRetry(maxAttempts=2147483647; defaultPolicy=MultipleLinearRandomRetry[5x5s, 5x10s, 5x15s, 5x20s, 5x25s, 10x60s]; map={org.apache.ratis.protocol.exceptions.GroupMismatchException->NoRetry, org.apache.ratis.protocol.exceptions.NotReplicatedException->NoRetry, org.apache.ratis.protocol.exceptions.ResourceUnavailableException->org.apache.ratis.retry.ExponentialBackoffRetry@313565ca, org.apache.ratis.protocol.exceptions.StateMachineException->NoRetry, org.apache.ratis.protocol.exceptions.TimeoutIOException->NoRetry})}
	at org.apache.ratis.client.impl.RaftClientImpl.noMoreRetries(RaftClientImpl.java:353)
	at org.apache.ratis.client.impl.BlockingImpl.sendRequestWithRetry(BlockingImpl.java:129)
	at org.apache.ratis.client.impl.GroupManagementImpl.remove(GroupManagementImpl.java:61)
	at org.apache.hadoop.ozone.container.common.statemachine.commandhandler.ClosePipelineCommandHandler.lambda$null$1(ClosePipelineCommandHandler.java:123)
	at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
	at java.util.HashMap$ValueSpliterator.forEachRemaining(HashMap.java:1652)
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
	at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
	at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
	at org.apache.hadoop.ozone.container.common.statemachine.commandhandler.ClosePipelineCommandHandler.lambda$handle$2(ClosePipelineCommandHandler.java:120)
	at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1640)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)
Caused by: java.io.IOException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
	at org.apache.ratis.grpc.GrpcUtil.unwrapException(GrpcUtil.java:99)
	at org.apache.ratis.grpc.client.GrpcClientProtocolClient.blockingCall(GrpcClientProtocolClient.java:223)
	at org.apache.ratis.grpc.client.GrpcClientProtocolClient.groupAdd(GrpcClientProtocolClient.java:170)
	at org.apache.ratis.grpc.client.GrpcClientRpc.sendRequest(GrpcClientRpc.java:98)
	at org.apache.ratis.client.impl.BlockingImpl.sendRequest(BlockingImpl.java:145)
	at org.apache.ratis.client.impl.BlockingImpl.sendRequestWithRetry(BlockingImpl.java:109)
	... 16 more
Caused by: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
	at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:268)
	at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:249)
	at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:167)
	at org.apache.ratis.proto.grpc.AdminProtocolServiceGrpc$AdminProtocolServiceBlockingStub.groupManagement(AdminProtocolServiceGrpc.java:468)
	at org.apache.ratis.grpc.client.GrpcClientProtocolClient.lambda$groupAdd$5(GrpcClientProtocolClient.java:172)
	at org.apache.ratis.grpc.client.GrpcClientProtocolClient.blockingCall(GrpcClientProtocolClient.java:221)
	... 20 more
Caused by: org.apache.ratis.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: Connection refused: /10.1.0.100:15049
Caused by: java.net.ConnectException: finishConnect(..) failed: Connection refused
	at org.apache.ratis.thirdparty.io.netty.channel.unix.Errors.newConnectException0(Errors.java:166)
	at org.apache.ratis.thirdparty.io.netty.channel.unix.Errors.handleConnectErrno(Errors.java:131)
	at org.apache.ratis.thirdparty.io.netty.channel.unix.Socket.finishConnect(Socket.java:359)
	at org.apache.ratis.thirdparty.io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.doFinishConnect(AbstractEpollChannel.java:710)
	at org.apache.ratis.thirdparty.io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.finishConnect(AbstractEpollChannel.java:687)
	at org.apache.ratis.thirdparty.io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.epollOutReady(AbstractEpollChannel.java:567)
	at org.apache.ratis.thirdparty.io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:499)
	at org.apache.ratis.thirdparty.io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:407)
	at org.apache.ratis.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
	at org.apache.ratis.thirdparty.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at org.apache.ratis.thirdparty.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.lang.Thread.run(Thread.java:750)

Added a new RatisHelper#newRaftClientNoRetry with RetryPolicies.noRetry() so that the group remove operation on other datanodes won't block the group remove on the datanode itself.
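
For illustration, a minimal sketch of building a no-retry Raft client with the public Ratis API; the peer id/address and group id are placeholders, and the actual RatisHelper#newRaftClientNoRetry in the patch may wire in more (RPC type, TLS, tuned properties) than shown here:

    import java.util.UUID;

    import org.apache.ratis.client.RaftClient;
    import org.apache.ratis.conf.RaftProperties;
    import org.apache.ratis.protocol.RaftGroup;
    import org.apache.ratis.protocol.RaftGroupId;
    import org.apache.ratis.protocol.RaftPeer;
    import org.apache.ratis.protocol.RaftPeerId;
    import org.apache.ratis.retry.RetryPolicies;

    public final class NoRetryClientSketch {
      public static void main(String[] args) throws Exception {
        // Placeholder peer and group; real values come from the pipeline.
        RaftPeer peer = RaftPeer.newBuilder()
            .setId(RaftPeerId.valueOf("dn-1"))
            .setAddress("10.1.0.100:9858")
            .build();
        RaftGroup group = RaftGroup.valueOf(
            RaftGroupId.valueOf(UUID.randomUUID()), peer);

        try (RaftClient client = RaftClient.newBuilder()
            .setProperties(new RaftProperties())
            .setRaftGroup(group)
            // The point of the fix: fail fast instead of retrying for minutes
            // when the remote datanode is unreachable.
            .setRetryPolicy(RetryPolicies.noRetry())
            .build()) {
          // Propagate the group remove to the peer without blocking on retries.
          client.getGroupManagementApi(peer.getId())
              .remove(group.getGroupId(), true, false);
        }
      }
    }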

ivandika3 marked this pull request as ready for review May 20, 2024 01:21
@ivandika3 (Contributor, Author):

@szetszwo I have made some changes to the patch. Could you help review it again? Thank you.

ivandika3 requested review from szetszwo and duongkame May 20, 2024 07:31
Comment on lines 146 to 150
} catch (GroupMismatchException gme) {
// ignore silently since this means that the group has been closed by earlier close pipeline
// command in another datanode
LOG.debug("The group for pipeline {} on datanode {} has been removed by earlier close " +
"pipeline command handled in another datanode", pipelineID, dn.getUuidString());
@szetszwo (Contributor):

@ivandika3 , Since HddsClientUtils.containsException below will cover this case, let's remove catch (GroupMismatchException gme) {...}?

@ivandika3 (Contributor, Author):

Thank you for the review. Updated.

@szetszwo (Contributor):

+1 to the latest change.

ivandika3 merged commit 87c3945 into apache:master May 21, 2024
39 checks passed
@ivandika3 (Contributor, Author):

Thank you for the reviews @szetszwo, merged.

ivandika3 self-assigned this May 23, 2024
jojochuang pushed a commit to jojochuang/ozone that referenced this pull request May 23, 2024
errose28 added a commit to errose28/ozone that referenced this pull request May 28, 2024
xichen01 pushed a commit to xichen01/ozone that referenced this pull request Jul 17, 2024

…d before XceiverRatisServer#removeGroup (apache#6696)

(cherry picked from commit 87c3945)