Skip to content

Conversation

@zuston
Copy link
Member

@zuston zuston commented Apr 27, 2023

What changes were proposed in this pull request?

Introduce pluggable storage chooser strategies for local storage,
including hash-based(default) and capacity-based

Why are the changes needed?

Currently, the local storage of per-partition is selected by hash strategy.
This is simple but will cause many problems, like many huge partitions
will store in the same local disk, if the hash number is same.
I know this is rare but it has happened in our internal env.

So it's better to support available space based storage selector
for per-partition. And I will introduce the pluggable local storage selector,
but I'm not sure whether it could support multiple disk storages for one-partition.

Does this PR introduce any user-facing change?

Yes. Doc will be added later.

How was this patch tested?

  1. UTs

@codecov-commenter
Copy link

codecov-commenter commented Apr 27, 2023

Codecov Report

Merging #847 (8ff6e6c) into master (4ce4a8d) will increase coverage by 2.38%.
The diff coverage is 70.00%.

❗ Current head 8ff6e6c differs from pull request most recent head 270e5d6. Consider uploading reports for the commit 270e5d6 to get more accurate results

@@             Coverage Diff              @@
##             master     #847      +/-   ##
============================================
+ Coverage     56.99%   59.38%   +2.38%     
- Complexity     2137     2169      +32     
============================================
  Files           321      305      -16     
  Lines         15642    13358    -2284     
  Branches       1243     1248       +5     
============================================
- Hits           8915     7932     -983     
+ Misses         6220     4993    -1227     
+ Partials        507      433      -74     
Impacted Files Coverage Δ
...ava/org/apache/uniffle/common/util/ClassUtils.java 0.00% <0.00%> (ø)
...rg/apache/uniffle/storage/common/LocalStorage.java 61.36% <0.00%> (+0.59%) ⬆️
...he/uniffle/server/storage/LocalStorageManager.java 84.97% <70.00%> (-1.00%) ⬇️
.../server/storage/local/HashBasedStorageChooser.java 92.30% <92.30%> (ø)
...a/org/apache/uniffle/server/ShuffleServerConf.java 99.43% <100.00%> (+<0.01%) ⬆️
...ver/storage/local/CapacityBasedStorageChooser.java 100.00% <100.00%> (ø)

... and 26 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@zuston zuston marked this pull request as draft May 4, 2023 03:28
@zuston zuston changed the title [#845] feat(storage): Introduce capacity based storage chooser strategy [#845] feat(storage): Introduce available space based storage choosing policy May 4, 2023
@zuston zuston marked this pull request as ready for review May 4, 2023 07:28
@zuston
Copy link
Member Author

zuston commented May 4, 2023

PTAL @jerqi Round-robin could be supported in the future.

@zuston
Copy link
Member Author

zuston commented May 6, 2023

If you have time, could you help take a look ?

@jerqi
Copy link
Contributor

jerqi commented May 6, 2023

Do you consider reading data if we use another storage choose policy?

@zuston
Copy link
Member Author

zuston commented May 7, 2023

Do you consider reading data if we use another storage choose policy?

The partition->disk mapping has been supported in #424.

* limitations under the License.
*/

package org.apache.uniffle.server.storage.local;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we have a better package name? The local have a different style from other packages. Maybe we could a better shuffle server package organization like coordinator. @smallzhongfeng Do you have some suggestion?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, maybe we should reconstruct the package name of the shuffle server.

import org.apache.uniffle.server.ShuffleDataFlushEvent;
import org.apache.uniffle.storage.common.Storage;

public interface StorageChoosingPolicy<T extends Storage> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the difference between policy and strategy? Could we have a unify style?


public interface StorageChoosingPolicy<T extends Storage> {

T choose(ShuffleDataFlushEvent event, T... candidates);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we have better name? What's the difference between choose and select? Should we have a unify style?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we use array as parameter? Collection seems better than array according to the book <>.

public class ClassUtils {

@SuppressWarnings("unchecked")
public static <T> T instantiate(Class<T> clazz, Pair<Class<T>, Object>... typeAndVals)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We seems have many similar places like the some strategies class. Could we unify them?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


public interface StorageChoosingPolicy<T extends Storage> {

T choose(ShuffleDataFlushEvent event, T... candidates);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For strategy, we should add some Java docs to explain the function of this interface and tell other developer how to extend this interface.

Comment on lines 93 to +94
|rss.server.multistorage.manager.selector.class | org.apache.uniffle.server.storage.multi.DefaultStorageManagerSelector | The manager selector strategy for `MEMORY_LOCALFILE_HDFS`. Default value is `DefaultStorageManagerSelector`, and another `HugePartitionSensitiveStorageManagerSelector` will flush only huge partition's data to cold storage. |
|rss.server.localstorage.storage.choosing.policy.class|org.apache.uniffle.server.storage.local.HashStorageChoosingPolicy|For localstorage, the storage choosing policy is for per-partition. Default value is the hash-based disk selector.|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these two configurations seems related. Do you think it's possible to merge these configurations in next release, so users don't have to be confused by too many configurations.

Comment on lines +32 to +38
final List<LocalStorage> candidates = Arrays.stream(storages)
.filter(x -> x.canWrite() && !x.isCorrupted())
.collect(Collectors.toList());

if (candidates.size() == 0) {
return null;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this method should be moved to interface's default method, such as getDefaultCandidates?

return s1UsedRatio.compareTo(s2UsedRatio);
});

return candidates.get(0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

max rather than sort then get first?

@zuston zuston closed this Dec 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants