[#845] feat(storage): Introduce available space based storage choosing policy #847
Codecov Report
@@ Coverage Diff @@
## master #847 +/- ##
============================================
+ Coverage 56.99% 59.38% +2.38%
- Complexity 2137 2169 +32
============================================
Files 321 305 -16
Lines 15642 13358 -2284
Branches 1243 1248 +5
============================================
- Hits 8915 7932 -983
+ Misses 6220 4993 -1227
+ Partials 507 433 -74
... and 26 files with indirect coverage changes
PTAL @jerqi. Round-robin could be supported in the future.
If you have time, could you help take a look?
Have you considered how data will be read if we use another storage choosing policy?
The partition->disk mapping has been supported since #424.
 * limitations under the License.
 */

package org.apache.uniffle.server.storage.local;
Could we have a better package name? The `local` package has a different style from the other packages. Maybe we could adopt a better package organization for the shuffle server, like the coordinator's. @smallzhongfeng Do you have any suggestions?
Yes, maybe we should restructure the package names of the shuffle server.
import org.apache.uniffle.server.ShuffleDataFlushEvent;
import org.apache.uniffle.storage.common.Storage;

public interface StorageChoosingPolicy<T extends Storage> {
What's the difference between policy and strategy? Could we have a unified style?
public interface StorageChoosingPolicy<T extends Storage> {

  T choose(ShuffleDataFlushEvent event, T... candidates);
Could we have a better name? What's the difference between choose and select? Should we have a unified style?
Why do we use an array as the parameter? A Collection seems better than an array, according to the book <>.
public class ClassUtils {

  @SuppressWarnings("unchecked")
  public static <T> T instantiate(Class<T> clazz, Pair<Class<T>, Object>... typeAndVals)
We seem to have many similar places, like some of the strategy classes. Could we unify them?
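For reference, a minimal sketch of what a single shared reflective helper could look like. This is not the `ClassUtils.instantiate` from the diff: the name `InstantiateSketch` and the parallel `argTypes`/`args` arrays are assumptions made to keep the example free of third-party types such as `Pair`.

```java
import java.lang.reflect.Constructor;

// Hypothetical helper, not the ClassUtils from this PR.
final class InstantiateSketch {

    private InstantiateSketch() {
    }

    // Creates an instance of clazz using the constructor whose declared
    // parameter types match argTypes exactly, passing args in the same order.
    static <T> T instantiate(Class<T> clazz, Class<?>[] argTypes, Object[] args) {
        if (argTypes.length != args.length) {
            throw new IllegalArgumentException("argTypes and args must have the same length");
        }
        try {
            Constructor<T> ctor = clazz.getDeclaredConstructor(argTypes);
            return ctor.newInstance(args);
        } catch (ReflectiveOperationException e) {
            // Wrap all reflection failures in one unchecked exception type.
            throw new IllegalStateException("Cannot instantiate " + clazz.getName(), e);
        }
    }
}
```

With one helper like this, each strategy or policy loader would only need to pass the configured class and its constructor arguments, instead of repeating the reflection and error handling.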
public interface StorageChoosingPolicy<T extends Storage> {

  T choose(ShuffleDataFlushEvent event, T... candidates);
For the strategy, we should add Javadoc to explain the function of this interface and tell other developers how to extend it.
|rss.server.multistorage.manager.selector.class | org.apache.uniffle.server.storage.multi.DefaultStorageManagerSelector | The manager selector strategy for `MEMORY_LOCALFILE_HDFS`. Default value is `DefaultStorageManagerSelector`, and another `HugePartitionSensitiveStorageManagerSelector` will flush only huge partition's data to cold storage. |
|rss.server.localstorage.storage.choosing.policy.class|org.apache.uniffle.server.storage.local.HashStorageChoosingPolicy|For localstorage, the storage choosing policy is for per-partition. Default value is the hash-based disk selector.|
These two configurations seem related. Do you think it's possible to merge them in the next release, so users aren't confused by too many configurations?
final List<LocalStorage> candidates = Arrays.stream(storages)
    .filter(x -> x.canWrite() && !x.isCorrupted())
    .collect(Collectors.toList());

if (candidates.size() == 0) {
  return null;
}
It seems this filtering logic should be moved into a default method on the interface, such as getDefaultCandidates?
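A rough sketch of that suggestion, under stated assumptions: `WritableStorage`, `ChoosingPolicySketch`, `FakeStorage`, and `FirstCandidatePolicy` are hypothetical stand-ins for the types in the diff, and the `ShuffleDataFlushEvent` parameter is dropped to keep the example self-contained. Only `canWrite()`, `isCorrupted()`, and the name `getDefaultCandidates` come from this thread.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical minimal stand-in for the Storage type used in the diff.
interface WritableStorage {
    boolean canWrite();

    boolean isCorrupted();
}

// Hypothetical policy interface; the real one also receives the flush event.
interface ChoosingPolicySketch<T extends WritableStorage> {

    // Returns the chosen storage, or null when no candidate is usable.
    T choose(List<T> storages);

    // Shared filter so every policy applies the same writable/non-corrupted check.
    default List<T> getDefaultCandidates(List<T> storages) {
        return storages.stream()
            .filter(s -> s.canWrite() && !s.isCorrupted())
            .collect(Collectors.toList());
    }
}

// Simple test double for a disk.
final class FakeStorage implements WritableStorage {
    final String name;
    private final boolean writable;
    private final boolean corrupted;

    FakeStorage(String name, boolean writable, boolean corrupted) {
        this.name = name;
        this.writable = writable;
        this.corrupted = corrupted;
    }

    @Override
    public boolean canWrite() {
        return writable;
    }

    @Override
    public boolean isCorrupted() {
        return corrupted;
    }
}

// Example policy that reuses the default filter and takes the first usable disk.
final class FirstCandidatePolicy implements ChoosingPolicySketch<FakeStorage> {
    @Override
    public FakeStorage choose(List<FakeStorage> storages) {
        List<FakeStorage> candidates = getDefaultCandidates(storages);
        return candidates.isEmpty() ? null : candidates.get(0);
    }
}
```

Each concrete policy then only contains its selection rule, and the health filter lives in one place.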
  return s1UsedRatio.compareTo(s2UsedRatio);
});

return candidates.get(0);
Use max rather than sorting and then taking the first element?
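A minimal sketch of the single-pass alternative. It assumes the snippet above sorts candidates in ascending order of used ratio, so the first element is the one with the lowest used ratio (equivalently, the most free space) and the matching stream call is `min`. `DiskSketch` and `LeastUsedPicker` are hypothetical names, not classes from this PR.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for LocalStorage with only the fields needed here.
final class DiskSketch {
    final String path;
    final long capacity;
    final long used;

    DiskSketch(String path, long capacity, long used) {
        this.path = path;
        this.capacity = capacity;
        this.used = used;
    }

    double usedRatio() {
        return (double) used / capacity;
    }
}

final class LeastUsedPicker {

    private LeastUsedPicker() {
    }

    // One O(n) pass instead of an O(n log n) sort; empty when there are no candidates.
    static Optional<DiskSketch> pick(List<DiskSketch> candidates) {
        return candidates.stream()
            .min(Comparator.comparingDouble(DiskSketch::usedRatio));
    }
}
```

Returning an `Optional` also covers the empty-candidates case that the diff handles with an explicit `return null`.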
What changes were proposed in this pull request?
Introduce pluggable storage choosing strategies for local storage,
including hash-based (default) and capacity-based.
Why are the changes needed?
Currently, the local storage for each partition is selected by a hash strategy.
This is simple but can cause problems: for example, many huge partitions
will be stored on the same local disk if their hash values map to the same disk.
I know this is rare, but it has happened in our internal environment.
So it's better to support an available-space-based storage selector
per partition. I will introduce the pluggable local storage selector,
but I'm not sure whether it could support multiple disk storages for one partition.
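To illustrate the difference between the two approaches, here is a small sketch. It is not the code in this PR: `DiskState`, `SelectorSketch`, and the exact hash inputs (`appId`, `shuffleId`, `partitionId`) are assumptions for illustration only.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Objects;

// Hypothetical view of a local disk: its path and currently free bytes.
final class DiskState {
    final String path;
    final long freeBytes;

    DiskState(String path, long freeBytes) {
        this.path = path;
        this.freeBytes = freeBytes;
    }
}

final class SelectorSketch {

    private SelectorSketch() {
    }

    // Hash-based: the disk depends only on the partition identity, so
    // partitions whose hashes collide share a disk regardless of their size.
    static DiskState byHash(String appId, int shuffleId, int partitionId, List<DiskState> disks) {
        int hash = Objects.hash(appId, shuffleId, partitionId);
        return disks.get(Math.floorMod(hash, disks.size()));
    }

    // Available-space-based: pick the disk with the most free bytes right now.
    // Throws NoSuchElementException if disks is empty.
    static DiskState byFreeSpace(List<DiskState> disks) {
        return Collections.max(disks, Comparator.comparingLong((DiskState d) -> d.freeBytes));
    }
}
```

The hash-based choice is stable, which keeps the partition-to-disk mapping predictable for reads, while the space-based choice depends on disk state at flush time and therefore relies on the mapping being recorded.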
Does this PR introduce any user-facing change?
Yes. Doc will be added later.
How was this patch tested?