Resource optimized placement strategy #8815
… make it a global acting strategy
…ehluli/orleans into resource-based-strategy
…load shedding mechanism
@ReubenBond Here are some benchmarks for moded filter (passing
If we end up only using
Wouldn't dynamic PGO take care of de-virtualizing them once it "warms up"?
I don't know, but we can check. Do you have that benchmark code somewhere?
Yeah here:
Generic is 30% slower than non-generic on my machine. The alloc is the filter instance itself. Unsure about the code size, but my guess would be that non-generic inlines more.
@ReubenBond non-generic is a bit faster
I think we should go with non-generic for now
Posted at the same time 😂
Agree, even for 256 GB of RAM, that's 2.56e+11 bytes, which is still well within float's range of 3.4e+38
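The range argument above can be checked with a quick sketch. This is Python rather than the PR's C#, and the helper name `to_float32` is made up for illustration; it only round-trips a value through the IEEE 754 single-precision format, which is the same format C#'s `float` uses.

```python
import struct

# Largest finite IEEE 754 single-precision value (bit pattern 0x7F7FFFFF).
FLOAT32_MAX = struct.unpack("<f", bytes.fromhex("ffff7f7f"))[0]

def to_float32(x: float) -> float:
    """Round a Python float (double precision) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

ram_bytes = 256e9                      # 256 GB expressed in bytes
stored = to_float32(ram_bytes)         # what a 32-bit float would hold

print(stored == ram_bytes)             # the value survives the round trip
print(ram_bytes < FLOAT32_MAX)         # and is far below the float32 maximum
print(to_float32(ram_bytes + 1) == stored)  # a 1-byte change is below float32 resolution
```

The last line is a reminder that range and precision are separate questions: at this magnitude adjacent float32 values are 16384 apart, which is irrelevant for a placement score but worth knowing.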
…ke it easier for the users to understand + add comments to explain that weights are relative to each other + modified the director to take into account potential totalWeight = 0 + removed config exception throwing if sum = 0; as the score will be 0 but due to the jitter it will act as it were RandomPlacement
@ReubenBond I've made some small fixes, added some comments, and switched options to take The weights don't strictly need a hard upper limit (currently 100) because of normalization, but I believe it's better to set a boundary for the sake of sanity. This is debatable, of course! Other than the above "issue", I don't see anything further we need to do. Please let me know if you have something else in mind; otherwise this LGTM and is ready for merging.
Update:
One would expect 'float?' to have a size of 5 = 4 (float) + 1 (hasValue), but the type is aligned to the size of its largest field (4 bytes), so it is padded to 8 bytes.
This will help increase the number of
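The padding effect described in the update can be illustrated with a C-layout analogue. This is a sketch in Python's `ctypes`, not .NET code: `NullableFloat` is a hypothetical struct that assumes a one-byte flag followed by a 4-byte float, and it follows the platform's C ABI rather than the CLR's layout rules.

```python
import ctypes

class NullableFloat(ctypes.Structure):
    """Hypothetical C-layout stand-in for a nullable float: a flag plus a value."""
    _fields_ = [
        ("has_value", ctypes.c_bool),   # 1 byte
        ("value", ctypes.c_float),      # 4 bytes, must start on a 4-byte boundary
    ]

size = ctypes.sizeof(NullableFloat)
align = ctypes.alignment(NullableFloat)
value_offset = NullableFloat.value.offset

# The fields add up to 5 bytes, but 3 padding bytes follow the flag
# so that the float is aligned, giving a total size of 8.
print(size, align, value_offset)
```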
This PR adds support for a resource-optimized placement strategy.
ResourceOptimizedPlacement is a placement strategy which attempts to optimize resource distribution across the cluster. It assigns weights to runtime statistics to prioritize different resources and calculates a normalized score for each silo.
The silo with the lowest score is chosen for placing the activation. Normalization ensures that each property contributes proportionally to the overall score. Users can adjust the weights based on their specific requirements and priorities for load balancing.
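A minimal sketch of a weighted, normalized score follows. It is written in Python for brevity and is not the PR's implementation: the `SiloStats` and `Weights` types, their field names, and the exact choice of statistics are assumptions made for illustration. It does follow two points stated in this conversation: weights are relative to each other, and a total weight of 0 yields a score of 0.

```python
from dataclasses import dataclass

@dataclass
class SiloStats:            # hypothetical snapshot of one silo's statistics
    cpu_percent: float      # 0..100
    memory_used: float      # bytes
    memory_total: float     # bytes

@dataclass
class Weights:              # hypothetical; only the ratios between weights matter
    cpu: float
    memory: float

def score(stats: SiloStats, w: Weights) -> float:
    """Lower is better. Each statistic is scaled to 0..1 before weighting."""
    total = w.cpu + w.memory
    if total == 0:
        return 0.0          # all weights zero: every silo scores the same
    cpu_term = stats.cpu_percent / 100.0
    mem_term = stats.memory_used / stats.memory_total
    return (w.cpu * cpu_term + w.memory * mem_term) / total

idle = SiloStats(cpu_percent=10, memory_used=2e9, memory_total=16e9)
busy = SiloStats(cpu_percent=80, memory_used=12e9, memory_total=16e9)
weights = Weights(cpu=40, memory=60)

print(score(idle, weights), score(busy, weights))
```

Because the weighted sum is divided by the total weight, `Weights(cpu=40, memory=60)` and `Weights(cpu=4, memory=6)` produce the same scores, which is why an upper bound on the weights is a convenience rather than a necessity.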
In addition to normalization, an online adaptive algorithm smooths the statistics (filtering out high-frequency components) and avoids rapid signal drops by turning them into a polynomial-like decay process. This helps avoid resource saturation on the silos, especially newly joined ones.
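To make the intent concrete, here is a sketch of a smoother that follows rises quickly but lets drops decay slowly. It deliberately swaps in a simple asymmetric exponential smoother, which is not the adaptive algorithm used in this PR and decays geometrically rather than polynomially; the class name and the `rise`/`fall` parameters are hypothetical.

```python
class AsymmetricSmoother:
    """Tracks increases quickly and decreases slowly (illustrative stand-in only)."""

    def __init__(self, rise: float = 0.8, fall: float = 0.2):
        self.rise = rise      # smoothing factor applied when the signal goes up
        self.fall = fall      # smaller factor applied when the signal goes down
        self.value = None     # no estimate until the first sample arrives

    def update(self, sample: float) -> float:
        if self.value is None:
            self.value = sample
        else:
            alpha = self.rise if sample > self.value else self.fall
            self.value += alpha * (sample - self.value)
        return self.value

smoother = AsymmetricSmoother()
readings = [90.0, 10.0, 10.0, 10.0]   # CPU usage suddenly drops from 90% to 10%
smoothed = [smoother.update(r) for r in readings]
print(smoothed)
```

The smoothed value stays well above the raw 10% for several samples after the drop, so a silo that has just become idle (or has just joined) is not immediately treated as the cheapest target by every other silo at once.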
Silos which are overloaded by definition of the load shedding mechanism are not considered as candidates for new placements.
When the local silo's score is within the preference margin of a remote silo's score, the local silo is picked as the target. Since more than one silo can have exactly the same score, one of them is picked at random, so that we don't keep picking the first of the short-listed ones.
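The selection rules described above (skip overloaded silos, prefer the local silo within a margin, break exact ties at random) can be sketched as follows. This is an illustrative Python sketch under assumed names, not the director's actual code: `pick_silo`, its parameters, and the use of plain string identifiers for silos are all hypothetical.

```python
import random

def pick_silo(scores: dict[str, float], local: str, overloaded: set[str],
              margin: float, rng: random.Random) -> str:
    """Return the placement target; lower scores are better."""
    # Overloaded silos are never candidates for new placements.
    candidates = {s: v for s, v in scores.items() if s not in overloaded}
    if not candidates:
        raise ValueError("no eligible silos")
    best = min(candidates.values())
    # Prefer the local silo when it is within the margin of the best score.
    if local in candidates and candidates[local] - best <= margin:
        return local
    # Several silos may share the best score: pick one at random.
    tied = sorted(s for s, v in candidates.items() if v == best)
    return rng.choice(tied)

rng = random.Random(0)
scores = {"local": 0.50, "remote-1": 0.45, "remote-2": 0.45}

print(pick_silo(scores, "local", set(), margin=0.10, rng=rng))        # local wins
print(pick_silo(scores, "local", set(), margin=0.0, rng=rng))         # one of the remotes
print(pick_silo(scores, "local", {"remote-1", "remote-2"}, 0.0, rng)) # only local is eligible
```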
This strategy is 'static' because it makes the best possible placement given the current view of the whole cluster. As we know, this view may change dramatically even if no new placement is requested, because of the business logic in users' code.
A 'dynamic' resource optimization may be attempted to rebalance the silos, but this is out of the scope of this PR as it is: