HDDS-10370. Recon - Handle the pre-existing missing empty containers in clusters. #6255

devmadhuu · 2024-02-22T13:56:02Z

What changes were proposed in this pull request?

This PR addresses a corner case: Recon had earlier identified some containers as MISSING (UNHEALTHY), but they were all EMPTY (no keys mapped). Recon should filter out such pre-existing MISSING (UNHEALTHY) containers.

In a running cluster, Recon may have identified some MISSING (UNHEALTHY) containers before HDDS-9695 was applied. After that fix was applied, existing MISSING (UNHEALTHY) containers that were actually EMPTY were still shown as MISSING. This PR filters out those pre-existing records.
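
As a rough illustration of the filtering described above, here is a minimal sketch. It does not reproduce the PR's code: `UnhealthyRecord`, its fields, and `dropEmptyMissing` are hypothetical stand-ins for Recon's persisted unhealthy-container records, and the actual change may identify empty containers differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class EmptyMissingFilter {

  /** Hypothetical stand-in for a persisted unhealthy-container record. */
  static final class UnhealthyRecord {
    final long containerId;
    final String state;   // e.g. "MISSING" or "UNDER_REPLICATED"
    final long keyCount;  // number of keys mapped to the container

    UnhealthyRecord(long containerId, String state, long keyCount) {
      this.containerId = containerId;
      this.state = state;
      this.keyCount = keyCount;
    }
  }

  /**
   * Returns the records that should still be reported: everything except
   * MISSING records whose container has no keys mapped to it.
   */
  static List<UnhealthyRecord> dropEmptyMissing(List<UnhealthyRecord> existing) {
    List<UnhealthyRecord> kept = new ArrayList<>();
    for (UnhealthyRecord rec : existing) {
      boolean emptyMissing = "MISSING".equals(rec.state) && rec.keyCount == 0;
      if (!emptyMissing) {
        kept.add(rec);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    List<UnhealthyRecord> existing = Arrays.asList(
        new UnhealthyRecord(1L, "MISSING", 0L),            // empty: filtered out
        new UnhealthyRecord(2L, "MISSING", 5L),            // has keys: kept
        new UnhealthyRecord(3L, "UNDER_REPLICATED", 0L));  // not MISSING: kept
    for (UnhealthyRecord rec : dropEmptyMissing(existing)) {
      System.out.println(rec.containerId + " " + rec.state);
    }
  }
}
```

The sketch only drops records that are both MISSING and empty, so a MISSING container that still has keys mapped continues to be reported.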

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-10370

How was this patch tested?

The existing JUnit integration test org.apache.hadoop.ozone.recon.TestReconTasks#testEmptyMissingContainerDownNode was updated and run.

devmadhuu · 2024-02-22T14:52:43Z

@sumitagrawl @dombizita @fapifta , pls review.

devmadhuu · 2024-02-22T15:20:29Z

Working on fixing TestContainerHealthTask test case failure.

devmadhuu · 2024-02-26T08:58:40Z

Working on fixing TestContainerHealthTask test case failure.

This is fixed.

sodonnel · 2024-02-26T15:14:06Z

...on/src/main/java/org/apache/hadoop/ozone/recon/persistence/ContainerHealthSchemaManager.java

@@ -113,6 +118,10 @@ public Cursor<UnhealthyContainersRecord> getAllUnhealthyRecordsCursor() {
  }

  public void insertUnhealthyContainerRecords(List<UnhealthyContainers> recs) {
+    recs.forEach(rec -> {


Should we wrap this in an "if debug enabled" check? Normally I am against doing that for a single debug statement, since the logger handles it internally. In this case, though, we have to iterate over all the records, and it is possible there are a lot of them, so it might be worth wrapping the entire forEach to avoid that expense when it is not needed.

Ok, good suggestion @sodonnel . I have handled it.
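
The guard suggested above could look roughly like the following sketch. To stay self-contained it uses `java.util.logging`, where `isLoggable(Level.FINE)` plays the role of a debug-enabled check; the class name, the `logRecords` helper, and the use of `String` records are assumptions for illustration, not the code in `ContainerHealthSchemaManager`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

public class DebugGuardSketch {

  private static final Logger LOG =
      Logger.getLogger(DebugGuardSketch.class.getName());

  /**
   * Logs each record at debug (FINE) level and returns how many records
   * were visited for logging. When debug logging is off, the loop is
   * skipped entirely and the method returns 0.
   */
  static int logRecords(List<String> recs) {
    int visited = 0;
    // Guard the whole loop, not just the log call, so that no iteration
    // happens when the debug output would be discarded anyway.
    if (LOG.isLoggable(Level.FINE)) {
      for (String rec : recs) {
        LOG.fine("Inserting unhealthy container record: " + rec);
        visited++;
      }
    }
    return visited;
  }

  public static void main(String[] args) {
    List<String> recs = Arrays.asList("c1", "c2", "c3");
    System.out.println("visited with default level: " + logRecords(recs));
    LOG.setLevel(Level.FINE);
    System.out.println("visited with FINE enabled: " + logRecords(recs));
  }
}
```

The point of the guard is that the cost saved is the iteration itself; a single log call would not need it because the logger already checks the level internally.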

sodonnel · 2024-02-26T15:16:54Z

...op-ozone/recon/src/test/java/org/apache/hadoop/ozone/recon/fsck/TestContainerHealthTask.java

@@ -110,38 +112,86 @@ public void testRun() throws Exception {
      when(scmClientMock.getContainerWithPipeline(c.getContainerID()))
          .thenReturn(new ContainerWithPipeline(c, null));
    }
+
+    ReplicatedReplicationConfig replicationConfig = new ReplicatedReplicationConfig() {


Do we need all this? Can we not just do:

ReplicatedReplicationConfig replicationConfig = RatisReplicationConfig.getInstance(THREE);

Yes, it's not needed. I added it earlier because of an error. Now simplified. Thanks.

sumitagrawl

LGTM

sodonnel

LGTM

…in clusters. (apache#6255) (cherry picked from commit e0bf7b4) Change-Id: I72150151e39f2bfc29d9f39673782c1afd5353ad

deveshsingh added 2 commits February 22, 2024 19:17

HDDS-10370. Recon - Handle the pre-existing missing empty containers …

e4a975c

…in clusters.

HDDS-10370. Recon - Handle the pre-existing missing empty containers …

8492d56

…in clusters.

adoroszlai added the recon label Feb 22, 2024

devmadhuu marked this pull request as ready for review February 22, 2024 14:52

devmadhuu marked this pull request as draft February 22, 2024 15:20

HDDS-10370. Fixed TestContainerHealthTask test case failure.

e1f99c1

devmadhuu marked this pull request as ready for review February 23, 2024 14:23

devmadhuu requested a review from sodonnel February 26, 2024 08:58

sodonnel reviewed Feb 26, 2024

HDDS-10370. Handled review comments.

8998510

sumitagrawl approved these changes Feb 27, 2024

sodonnel approved these changes Feb 27, 2024

sumitagrawl merged commit e0bf7b4 into apache:master Feb 28, 2024
35 checks passed
