feat: Improve slive stress tests with path scanning and better file h… #1
Improve slive stress tests with path scanning and better file handling
Problem Statement
The slive stress test framework currently has several issues that impact test reliability and usability:
- High Failure Rate: All operations (create, read, rename, delete) use random path generation, leading to numerous `FAILURES` and `NOT_FOUND` errors when targeting non-existent files.
- Random Algorithm Issues: Every operation in the Hadoop 3.5.0 slive test source code uses a random algorithm to generate paths, causing high `FAILURES` and `NOT_FOUND` counts.
- Runtime Conflict: Directly replacing the JAR file to apply patches conflicts with other Hadoop test tools such as `TestDFSIO` and `NNBench`, making it impractical for production use.
- Commented Features: In the Hadoop 3.5.0 slive test source code, critical operation phase configurations (`beg`, `mid`, `end`) were commented out, preventing proper test execution control and phase-based operation scheduling.
- Unclear Logging: Some log messages lack clarity, making it difficult to filter and diagnose issues during test execution.
- Concurrent Operation Issues: Random path generation causes significant failures when multiple mappers operate on the same directory concurrently, reducing test accuracy.
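To make the failure mode concrete, here is a minimal, self-contained sketch that contrasts random path generation with picking from a list of paths known to exist. It is not the slive code: the class name, method names, path layout (`/slive/dir_N/file_M`) and the in-memory `Set` standing in for the namespace are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

// Illustrative only: an in-memory set stands in for the file system namespace.
class PathSelectionSketch {

    /** Random-style choice: build a path from random indices; it may not exist. */
    static String randomPath(Random rnd, int dirCount, int filesPerDir) {
        return "/slive/dir_" + rnd.nextInt(dirCount) + "/file_" + rnd.nextInt(filesPerDir);
    }

    /** Scan-based choice: pick from paths known to exist, or null when none are known. */
    static String existingPath(Random rnd, List<String> existingFilesList) {
        if (existingFilesList.isEmpty()) {
            return null;
        }
        return existingFilesList.get(rnd.nextInt(existingFilesList.size()));
    }

    /** Simulated namespace holding only ten files, all in dir_0. */
    static Set<String> sampleNamespace() {
        Set<String> namespace = new HashSet<>();
        for (int i = 0; i < 10; i++) {
            namespace.add("/slive/dir_0/file_" + i);
        }
        return namespace;
    }

    /** Counts simulated reads whose target path is absent from the namespace. */
    static int countMisses(Set<String> namespace, boolean scanFirst, long seed, int attempts) {
        Random rnd = new Random(seed);
        // Copying the set stands in for a directory scan (hypothetical step).
        List<String> scanned = new ArrayList<>(namespace);
        int misses = 0;
        for (int i = 0; i < attempts; i++) {
            String target = scanFirst
                    ? existingPath(rnd, scanned)
                    : randomPath(rnd, 100, 100);
            if (target == null || !namespace.contains(target)) {
                misses++;
            }
        }
        return misses;
    }

    public static void main(String[] args) {
        Set<String> namespace = sampleNamespace();
        System.out.println("random misses: " + countMisses(namespace, false, 42L, 1000));
        System.out.println("scan misses:   " + countMisses(namespace, true, 42L, 1000));
    }
}
```

In this single-threaded simulation the scan-based choice cannot miss, because every candidate comes from the scanned list. With concurrent mappers a scanned path can still disappear before it is used, so a real implementation would still need to handle a missing file.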
Solution Overview
This PR introduces a backward-compatible improvement to the slive test framework with the following changes:
1. Configurable Algorithm Selection
- `USE_NEW_ALGORITHM` configuration option to enable/disable the new path selection algorithm
- sort -h apache/hadoop#3: No need to replace the JAR file; can be used alongside other test tools
2. Smart Path Selection for Operations
- `FileAlreadyExistsException` in rare cases, but this exception is now properly handled
- `NOT_FOUND` errors
3. Improved Exception Handling
- `FileAlreadyExistsException` and general `IOException` in CREATE operations
- `false` return as `NOT_FOUND` instead of `FAILURES`
4. Enhanced Logging
5. Code Quality Improvements
- `existingFilesList`, `existingDirsList`
- `List<Path>` for cleaner code
6. Restore Commented Features
- `beg`, `mid`, `end`
Technical Details
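As a rough illustration of the handling listed under Improved Exception Handling, the sketch below maps the result of a create and a delete onto separate outcome labels. It deliberately uses `java.nio.file` on the local file system as a stand-in for the Hadoop `FileSystem` API, so `java.nio.file.FileAlreadyExistsException` here is not the Hadoop class of the same name. The class name, the `Outcome` enum, its `SUCCESS` and `ALREADY_EXISTS` labels, and the choice of delete as the operation that returns `false` are assumptions; the description only names `NOT_FOUND` and `FAILURES`.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrative only: local java.nio.file calls stand in for Hadoop FileSystem calls.
class OperationOutcomeSketch {

    /** Hypothetical labels; only NOT_FOUND and FAILURES appear in the description. */
    enum Outcome { SUCCESS, ALREADY_EXISTS, NOT_FOUND, FAILURES }

    /** CREATE: an existing target is reported separately from other I/O errors. */
    static Outcome create(Path file) {
        try {
            Files.createFile(file);
            return Outcome.SUCCESS;
        } catch (FileAlreadyExistsException e) {
            // Catch the specific exception first; it is a subclass of IOException.
            return Outcome.ALREADY_EXISTS;
        } catch (IOException e) {
            return Outcome.FAILURES;
        }
    }

    /** DELETE (assumed operation): a false return counts as NOT_FOUND, not FAILURES. */
    static Outcome delete(Path file) {
        try {
            return Files.deleteIfExists(file) ? Outcome.SUCCESS : Outcome.NOT_FOUND;
        } catch (IOException e) {
            return Outcome.FAILURES;
        }
    }

    public static void main(String[] args) {
        Path file = Paths.get(System.getProperty("java.io.tmpdir"),
                "slive-sketch-" + System.nanoTime() + ".tmp");
        System.out.println("first create:  " + create(file));
        System.out.println("second create: " + create(file));
        System.out.println("first delete:  " + delete(file));
        System.out.println("second delete: " + delete(file));
    }
}
```

Keeping "target missing" and "target already present" apart from genuine I/O errors is what lets the counters distinguish expected races from real failures.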
Modified Files (14 files)
- `FileAlreadyExistsException` import, improved error handling
- `USE_NEW_ALGORITHM` support
Testing
Benefits
- `NOT_FOUND` and `FAILURES` by significant margins
Usage
Enable the new algorithm:
Run standard tests (existing behavior):
# Default uses original algorithm
Related Issues