HDDS-12483. Quasi Closed Stuck should have 2 replicas of each origin #8014
base: master
Conversation
if (hasEnoughOriginsWithOpen(containerInfo, replicas)) {
  // If we have all origins with open replicas, and not unhealthy, then the container should close after the close
  // goes through, so this handler should not run.
  return false;
I need some more clarity on this one. Are we waiting for the open replicas to move to quasi_closed first? So there's a mix of open and quasi_closed replicas right now? hasEnoughOriginsWithOpen also counts quasi_closed replicas, which is not clear either.
My intended logic is that if you have 3 origins such that you have:
QC - Origin 1
QC - Origin 2
Open - Origin 3
Then this is not really QC stuck, because the open one should go to QC, and then you have enough replicas to close it, as we have all 3 origins.
The mis-matched-replicas-handler will issue the close command, but it also always returns false to let the other handlers run.
I think then, if we do nothing, the normal under-replication handling would take care of under replication in this scenario, ie only having 2 replicas.
If the OPEN replica never closes (I don't think I have ever seen that occur in practice), I am not sure what we should do!
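The decision described above can be sketched roughly as follows. This is an illustrative simplification, not the actual Ozone handler code; the class, record, and method names (shouldStuckHandlerRun, Replica) are made up for the example:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hedged sketch: a container is treated as quasi-closed stuck only if some
// origin has no OPEN or QUASI_CLOSED replica left, so the normal close path
// cannot complete on its own.
public class QuasiClosedStuckSketch {
  enum State { OPEN, QUASI_CLOSED, UNHEALTHY }

  record Replica(String originId, State state) { }

  // If every expected origin has a non-unhealthy replica, the container should
  // eventually close normally, so the stuck handler should not run (mirrors
  // the hasEnoughOriginsWithOpen check discussed above).
  static boolean shouldStuckHandlerRun(Set<String> expectedOrigins, List<Replica> replicas) {
    Set<String> healthyOrigins = new HashSet<>();
    for (Replica r : replicas) {
      if (r.state() != State.UNHEALTHY) {
        healthyOrigins.add(r.originId());
      }
    }
    return !healthyOrigins.containsAll(expectedOrigins);
  }

  public static void main(String[] args) {
    Set<String> origins = Set.of("o1", "o2", "o3");
    // The mixed case from the comment above: two QC plus one OPEN replica,
    // covering all three origins, so this is not really QC stuck.
    List<Replica> mixed = List.of(
        new Replica("o1", State.QUASI_CLOSED),
        new Replica("o2", State.QUASI_CLOSED),
        new Replica("o3", State.OPEN));
    System.out.println(shouldStuckHandlerRun(origins, mixed)); // false

    // If origin 3's replica is UNHEALTHY instead, the close path is blocked.
    List<Replica> stuck = List.of(
        new Replica("o1", State.QUASI_CLOSED),
        new Replica("o2", State.QUASI_CLOSED),
        new Replica("o3", State.UNHEALTHY));
    System.out.println(shouldStuckHandlerRun(origins, stuck)); // true
  }
}
```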
Okay, that makes sense. I'll continue reviewing and thinking about this in the context of the whole PR.
Suppose there are two unique origins, and this new logic has made it such that there are two replicas of each origin. So in total there are 4 replicas. Will QuasiClosedStuckReplicationCheck return false in this case, and then the regular replication check handler could end up queuing this container as over replicated?
That might be possible. I probably need to add a clause to the RatisReplicationCheckHandler to not run if
...rg/apache/hadoop/hdds/scm/container/replication/QuasiClosedStuckUnderReplicationHandler.java
…id under / over rep loop
You were correct - I have added some scenario tests and a fix that avoids this under / over replication loop.
mutablePendingOps.add(ContainerReplicaOp.create(ContainerReplicaOp.PendingOpType.ADD, target, 0));
totalCommandsSent++;
} catch (CommandTargetOverloadedException e) {
  LOG.warn("Cannot replicate container {} because target {} is overloaded.", containerInfo, target);
It's the sources that are overloaded here, as the push replicate commands are being sent to them. So instead of target, we need to log sourceDatanodes.
Ah, well spotted. I keep getting source and target confused! I will fix this.
ContainerID.valueOf(1), QUASI_CLOSED,
Pair.of(origin1, IN_SERVICE), Pair.of(origin1, IN_SERVICE), Pair.of(origin1, IN_MAINTENANCE),
Shouldn't this count as over replication? Min healthy for maintenance is one. So for origin1, one in-service replica and one maintenance replica should be sufficient.
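The accounting behind this comment could be sketched as below. This is an illustrative example only, not the actual replica-count implementation; it assumes a per-origin requirement of 2 copies and the min-healthy-for-maintenance value of 1 mentioned above:

```java
// Hedged sketch: with maintenance redundancy of 1, an origin whose replica
// set includes a maintenance copy only needs 1 in-service copy while the
// maintenance lasts, so two IN_SERVICE + one IN_MAINTENANCE is one extra.
public class MaintenanceCountSketch {
  static boolean isOriginOverReplicated(int inService, int inMaintenance) {
    int requiredPerOrigin = 2;        // assumption: 2 copies per origin
    int minHealthyForMaintenance = 1; // from the review comment above
    int requiredInService = inMaintenance > 0
        ? minHealthyForMaintenance
        : requiredPerOrigin;
    return inService > requiredInService;
  }

  public static void main(String[] args) {
    // The test case in question: two IN_SERVICE + one IN_MAINTENANCE for origin1.
    System.out.println(isOriginOverReplicated(2, 1)); // true
    // One in-service plus one maintenance replica should be sufficient.
    System.out.println(isOriginOverReplicated(1, 1)); // false
  }
}
```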
.../java/org/apache/hadoop/hdds/scm/container/replication/TestQuasiClosedStuckReplicaCount.java
@sodonnel Looks good other than the comments I left. We can decide whether we want to handle mis-replication when only one origin is present.
It'd also be good to have some test cases for multiple origins + under replication in TestQuasiClosedStuckReplicationCheck.
I have added only basic tests to the replication check test for the edge cases, then I have tried to cover all scenarios in the scenario-based tests. If we do them in both places it duplicates the tests, and the scenario ones are a lot more readable, I think.
…ep / over rep flow
What changes were proposed in this pull request?
After the change to ensure Quasi Closed replicas only move to CLOSED when all 3 origins are present, some new edge cases appeared where some replicas do not get full replication.
After some discussion we decided on the following for Quasi Closed Stuck containers:
If there is only 1 origin available, create 3 copies of it.
If there are 2 or more origins, maintain 2 copies of each origin.
In the worst case, which is:
Origin: 1, bcsid:10, State: Quasi_Closed
Origin: 2, bcsid:10, State: Quasi_Closed
Origin: 3, bcsid:11, State: Unhealthy
This cannot move to closed as the highest BCSID is unhealthy, so 6 replicas will be maintained in the system - 2 from each origin.
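The replication target these rules imply can be sketched as follows. This is an illustrative sketch of the rules stated above, not the actual QuasiClosedStuck code; the class and method names are made up:

```java
// Hedged sketch of the QC Stuck replication target described above:
// a single origin gets 3 copies; two or more origins get 2 copies each.
public class QcStuckReplicaTarget {
  static int requiredReplicas(int uniqueOrigins) {
    if (uniqueOrigins < 1) {
      throw new IllegalArgumentException("at least one origin is required");
    }
    return uniqueOrigins == 1 ? 3 : 2 * uniqueOrigins;
  }

  public static void main(String[] args) {
    System.out.println(requiredReplicas(1)); // 3
    System.out.println(requiredReplicas(2)); // 4
    System.out.println(requiredReplicas(3)); // 6, matching the worst case above
  }
}
```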
This PR solves this issue by introducing a new replication check handler for QC Stuck containers which runs before the existing Ratis Under Replicated Handler.
When under replication is identified a new QuasiClosedStuckUnderReplicationHandler is introduced to process it separately from the existing under replication flow.
Implementing it this way isolates the special handling into its own code units and avoids making changes to existing handlers, which could result in unexpected side effects in complex areas.
Note a followup PR will be required for Over Replication handling.
What is the link to the Apache JIRA?
https://issues.apache.org/jira/browse/HDDS-12483
How was this patch tested?
New unit and RM scenario tests added / modified.