Closed
Description
Code of Conduct
- I agree to follow this project's Code of Conduct
Search before asking
- I have searched in the issues and found no similar issues.
Describe the bug
I use the following configuration:
# HDFS fallback strategy
rss.server.hybrid.storage.fallback.strategy.class org.apache.uniffle.server.storage.LocalStorageManagerFallbackStrategy
rss.server.hybrid.storage.manager.selector.class org.apache.uniffle.server.storage.hybrid.HugePartitionSensitiveStorageManagerSelector
On a single shuffle server, fallback to Hadoop storage does not take effect when all local storage disks are above the high watermark.
Once all disks are above the high watermark, selectStorage returns null. Because the default retry times value in LocalStorageManagerFallbackStrategy is 0, no fallback is triggered, and the event is then discarded in DefaultFlushEventHandler.
The logs are as follows:
I will fix this in the following two steps:
- Allow a negative value for rss.server.hybrid.storage.fallback.max.fail.times.
- Optimize DefaultFlushEventHandler for the case where a flush event encounters null storage.
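To make the intended behavior concrete, here is a minimal sketch of one possible decision rule for a flush event whose local storage selection came back null. It is a toy model, not Uniffle code: the class `NullStorageFallbackSketch`, the `Outcome` enum, and the `decide` method are hypothetical names. It also assumes that a negative `rss.server.hybrid.storage.fallback.max.fail.times` means "fall back on the first failure" and that a non-negative value N means "fall back once the event has been retried more than N times". The actual semantics are up to the PR.

```java
public class NullStorageFallbackSketch {

  /** Possible outcomes for one flush event in this toy model. */
  public enum Outcome { WRITE_LOCAL, FALL_BACK_TO_HADOOP, RETRY_LATER }

  /**
   * Decides what to do with one flush event.
   *
   * @param localStorageSelected false models selectStorage returning null
   *                             (all disks above the high watermark)
   * @param retryTimes           how often this event has already been retried
   * @param maxFailTimes         assumed reading of
   *                             rss.server.hybrid.storage.fallback.max.fail.times
   */
  public static Outcome decide(boolean localStorageSelected, int retryTimes, int maxFailTimes) {
    if (localStorageSelected) {
      return Outcome.WRITE_LOCAL;
    }
    // Assumption: a negative value means "fall back immediately".
    if (maxFailTimes < 0) {
      return Outcome.FALL_BACK_TO_HADOOP;
    }
    // Assumption: otherwise fall back once retries exceed the limit.
    if (retryTimes > maxFailTimes) {
      return Outcome.FALL_BACK_TO_HADOOP;
    }
    // The event is kept for a later retry instead of being discarded.
    return Outcome.RETRY_LATER;
  }

  public static void main(String[] args) {
    // All disks full, negative limit: go straight to Hadoop storage.
    System.out.println(decide(false, 0, -1));
    // All disks full, limit 0, first attempt: retry rather than drop the event.
    System.out.println(decide(false, 0, 0));
    // Same event after one retry: now fall back.
    System.out.println(decide(false, 1, 0));
  }
}
```

The point of the sketch is that a null storage never leads to a silently dropped event: every branch either writes, falls back, or schedules a retry.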
Affects Version(s)
master
Uniffle Server Log Output
No response
Uniffle Engine Log Output
No response
Uniffle Server Configurations
No response
Uniffle Engine Configurations
No response
Additional context
No response
Are you willing to submit PR?
- Yes I am willing to submit a PR!
