HDFS-17599. Fix the mismatch between locations and indices for mover #6979
Description of PR
JIRA: HDFS-16557.
We set the EC policy to (6+3) and also have nodes in the ENTERING_MAINTENANCE state.
When we moved the data of some directories from SSD to HDD, some blocks failed to move because the disk was full, as shown in the figure below (blk_-9223372033441574269).
We tried the move again and got the error "Replica does not exist". The fsck output shows that the mover used the wrong block ID (blk_-9223372033441574270) when moving the block.
Mover Logs:

FSCK Info:

Root Cause:
Similar to HDFS-16333, when the mover is initialized, only LIVE nodes are processed. As a result, a datanode in the ENTERING_MAINTENANCE state is filtered out of the locations when initializing DBlockStriped, but the indices are not adapted, so the lengths of the locations and the indices no longer match. Finally, the EC block calculates the wrong block ID when getting an internal block (see DBlockStriped#getInternalBlock).
Solution:
When initializing DBlockStriped, if any location is filtered out, we need to remove the corresponding element from the indices so that the two stay aligned.
How was this patch tested?
It passes the unit test.
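For illustration, the mismatch described under Root Cause and the adaptation described under Solution can be sketched in a few lines of standalone Java. This is a simplified model, not the code from the patch: the class StripedIndexSketch, the methods filterLive and internalBlockId, the datanode names, and the block group ID -1024 are all hypothetical, and the sketch assumes that an internal block ID is the block group ID plus the index stored at the position of the matching location.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of the locations/indices pairing.
// None of these names come from the Hadoop code base.
final class StripedIndexSketch {

  /** Filtered locations together with the indices that belong to them. */
  static final class Filtered {
    final List<String> datanodes;
    final byte[] indices;

    Filtered(List<String> datanodes, byte[] indices) {
      this.datanodes = datanodes;
      this.indices = indices;
    }
  }

  /** Drops non-live locations and the index entry at the same position. */
  static Filtered filterLive(List<String> datanodes, boolean[] live, byte[] indices) {
    if (datanodes.size() != indices.length || live.length != indices.length) {
      throw new IllegalArgumentException("locations and indices must have equal length");
    }
    List<String> kept = new ArrayList<>();
    byte[] keptIndices = new byte[indices.length];
    int n = 0;
    for (int i = 0; i < indices.length; i++) {
      if (live[i]) {
        kept.add(datanodes.get(i));
        keptIndices[n++] = indices[i];
      }
    }
    return new Filtered(kept, Arrays.copyOf(keptIndices, n));
  }

  /** Assumed rule: internal block ID = block group ID + index at the location's position. */
  static long internalBlockId(long blockGroupId, List<String> datanodes,
                              byte[] indices, String datanode) {
    int pos = datanodes.indexOf(datanode);
    if (pos < 0) {
      throw new IllegalArgumentException("unknown datanode: " + datanode);
    }
    return blockGroupId + indices[pos];
  }

  public static void main(String[] args) {
    List<String> datanodes = Arrays.asList("dn0", "dn1", "dn2", "dn3");
    boolean[] live = {true, false, true, true}; // dn1 stands in for ENTERING_MAINTENANCE
    byte[] indices = {0, 1, 2, 3};
    long blockGroupId = -1024L;                 // hypothetical block group ID

    Filtered f = filterLive(datanodes, live, indices);

    // Filtered locations with the original indices: dn3 resolves to index 2.
    System.out.println(internalBlockId(blockGroupId, f.datanodes, indices, "dn3"));   // -1022
    // Filtered locations with adapted indices: dn3 resolves to index 3.
    System.out.println(internalBlockId(blockGroupId, f.datanodes, f.indices, "dn3")); // -1021
  }
}
```

In this model, filtering one location without adapting the indices shifts every later lookup by one position, which matches the off-by-one block ID seen in the description (blk_-9223372033441574270 instead of blk_-9223372033441574269).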