ParallelUnbalancedWork for efficient unbalanced parallel loops #7787

Merged 14 commits on Dec 2, 2024

benaadams
Member

@benaadams benaadams commented Nov 21, 2024

Changes

  • Added the ParallelUnbalancedWork class to efficiently execute parallel loops, handling unbalanced workloads.
  • Implemented static For methods to support parallel execution with and without thread-local data, including initialization and finalization functions.
  • Utilized thread pooling and a shared counter SharedCounter to distribute iterations among threads dynamically.
  • Aimed to optimize performance in scenarios where the workload per iteration is uneven, ensuring better resource utilization and reduced execution time, rather than leaving the main thread waiting for background threads to complete.
  • Fewer allocations than Parallel.For.

| Method | Mean | Error | StdDev | Ratio | Allocated | Alloc Ratio |
|---|---|---|---|---|---|---|
| ParallelFor | 543.9 ms | 12.13 ms | 18.15 ms | 1.00 | 19.21 KB | 1.00 |
| ParallelForEach | 488.5 ms | 12.73 ms | 17.43 ms | 0.90 | 23.12 KB | 1.20 |
| UnbalancedParallel | 413.4 ms | 1.15 ms | 1.73 ms | 0.76 | 5.00 KB | 0.26 |
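
To illustrate the idea of handing out iterations through a shared counter, here is a minimal sketch. It is not the PR's `ParallelUnbalancedWork` code: the class name `SharedCounterSketch`, the worker count, and the choice to run one worker on the calling thread are all assumptions made for illustration. Each worker atomically claims the next index, so a thread that draws cheap iterations simply comes back for more instead of idling behind a pre-assigned range.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical illustration only; not the implementation from this PR.
public static class SharedCounterSketch
{
    public static void For(int fromInclusive, int toExclusive, Action<int> body)
    {
        // Next index to hand out; shared by every worker.
        int next = fromInclusive;

        // Assumed policy: one worker per core, but never more workers than iterations.
        int workers = Math.Min(Environment.ProcessorCount, Math.Max(1, toExclusive - fromInclusive));

        void Worker()
        {
            while (true)
            {
                // Interlocked.Increment returns the incremented value,
                // so the index claimed by this thread is one less.
                int i = Interlocked.Increment(ref next) - 1;
                if (i >= toExclusive) return;
                body(i);
            }
        }

        // Queue (workers - 1) thread pool tasks and run the last worker on the
        // caller, so the calling thread does work instead of only waiting.
        Task[] tasks = new Task[workers - 1];
        for (int t = 0; t < tasks.Length; t++)
        {
            tasks[t] = Task.Run(() => Worker());
        }

        Worker();
        Task.WaitAll(tasks);
    }
}
```

The sketch leaves out exception propagation, cancellation, and overflow protection for ranges ending near `int.MaxValue`; a production version would need all three.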

Types of changes

What types of changes does your code introduce?

  • Optimization

Testing

Requires testing

  • No

Member

@LukaszRozmej LukaszRozmej left a comment

Can you provide some kind of benchmark?

- Introduced the `ParallelUnbalancedWork` class to efficiently execute parallel loops over a range of integers, handling unbalanced workloads.
- Added static `For` methods to support parallel execution with and without thread-local data, including initialization and finalization functions.
- Utilized thread pooling and a shared counter (`SharedCounter`) to distribute iterations among threads dynamically.
- Implemented internal classes (`BaseData`, `Data`, and `InitProcessor<TLocal>`) to manage shared state and thread synchronization.
- Aimed to optimize performance in scenarios where the workload per iteration is uneven, ensuring better resource utilization and reduced execution time.
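
The init / body / finalization shape mentioned above is the same one the built-in `Parallel.For<TLocal>` overload exposes, so it can be sketched with that API. The example below uses only `System.Threading.Tasks.Parallel`; the exact signatures of the new `For` methods are not shown in this thread, so any correspondence is an assumption. The helper name `ThreadLocalSumSketch` is hypothetical.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper showing the thread-local pattern with the built-in Parallel.For<TLocal>.
public static class ThreadLocalSumSketch
{
    public static long SumOfSquares(int count)
    {
        long total = 0;

        Parallel.For(
            0, count,
            // init: runs once per participating task and creates its private subtotal.
            () => 0L,
            // body: updates the private subtotal without any synchronization.
            (i, loopState, subtotal) => subtotal + (long)i * i,
            // finalization: runs once per task and merges its subtotal into the shared total.
            subtotal => Interlocked.Add(ref total, subtotal));

        return total;
    }
}
```

Keeping the subtotal thread-local means the only contended operation is one `Interlocked.Add` per task rather than one per iteration.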
@benaadams benaadams force-pushed the parallel-unbalanced-work branch from 0072faa to 061956a Compare November 28, 2024 16:52
@benaadams benaadams mentioned this pull request Dec 2, 2024
3 tasks
@benaadams
Member Author

| Method | Mean | Error | StdDev | Ratio | Allocated | Alloc Ratio |
|---|---|---|---|---|---|---|
| ParallelFor | 543.9 ms | 12.13 ms | 18.15 ms | 1.00 | 19.21 KB | 1.00 |
| ParallelForEach | 488.5 ms | 12.73 ms | 17.43 ms | 0.90 | 23.12 KB | 1.20 |
| UnbalancedParallel | 413.4 ms | 1.15 ms | 1.73 ms | 0.76 | 5.00 KB | 0.26 |

Member

@LukaszRozmej LukaszRozmej left a comment

I have a feeling ParallelUnbalancedWork deserves either to be a PR to dotnet or at least its own nuget package.

{
IReadOnlyTxProcessorSource env = _envPool.Get();
int i = 0;
IReadOnlyTxProcessorSource env = state.preWarmer._envPool.Get();
Member

super minor: maybe better to pass the pool directly rather than the whole prewarmer, as the rest of it is not needed?
And maybe build a custom struct for it, as those are used in 2 places?

Comment on lines +193 to +199
public int ActiveThreads => Volatile.Read(ref _activeThreads);

/// <summary>
/// Marks a thread as completed.
/// </summary>
/// <returns>The number of remaining active threads.</returns>
public int MarkThreadCompleted()
Member

minor: in theory this could be misused by calling `MarkThreadCompleted` first, but as this is a private class we can ignore it.
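
For context on the concern, a minimal sketch of such a counter follows. Only the member names `ActiveThreads`, `MarkThreadCompleted`, and `_activeThreads` come from the excerpt; the class name `ActiveThreadCounter`, the constructor, and the use of `Interlocked.Decrement` are assumptions, since the method body is not shown in this thread.

```csharp
using System.Threading;

// Hypothetical sketch; the member bodies are assumed, not copied from the PR.
public sealed class ActiveThreadCounter
{
    private int _activeThreads;

    // Assumption: the counter starts at the number of workers that will run.
    public ActiveThreadCounter(int threads) => _activeThreads = threads;

    // Volatile read so callers observe the latest published value.
    public int ActiveThreads => Volatile.Read(ref _activeThreads);

    // Atomically decrements the count and returns how many threads remain active.
    public int MarkThreadCompleted() => Interlocked.Decrement(ref _activeThreads);
}
```

In this sketch the thread that receives 0 is the last one to finish and could be the one to signal completion. Calling `MarkThreadCompleted` more often than there are workers would push the count to zero early or below zero, which is the kind of misuse the comment refers to.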

@benaadams benaadams merged commit 4544a6c into master Dec 2, 2024
79 checks passed
@benaadams benaadams deleted the parallel-unbalanced-work branch December 2, 2024 19:34
@TeddyAlbina

> I have a feeling ParallelUnbalancedWork deserves either to be a PR to dotnet or at least its own nuget package.

I agree this should be in dotnet itself
