TBB DOC : Dev Guide: Task Scheduler Bypass and How Does Task Scheduler Works

- Merged "How Scheduler Works" and "Scheduler Summary"
anton-potapov committed Mar 10, 2022
1 parent cd53d5a commit 38957a6
Showing 3 changed files with 24 additions and 17 deletions.
34 changes: 20 additions & 14 deletions doc/main/tbb_userguide/How_Does_Task_Scheduler_Works.rst
@@ -3,19 +3,27 @@
How Does Task Scheduler Works
=============================

The scheduler runs tasks in a way that tends to minimize both memory
demands and cross-thread communication. The intuition is that a balance
must be reached between depth-first and breadth-first executions.
Assuming that the tree is finite, depth-first is better for sequential
execution for the following reasons:

While the task scheduler is not bound to any particular type of parallelism,
it was designed to work efficiently for fork-join parallelism with lots of forks
(this type of parallelism is typical of parallel algorithms such as ``parallel_for``).

Let's consider the mapping of fork-join parallelism onto the task scheduler in more detail.

The scheduler runs tasks in a way that tries to achieve several targets simultaneously:

- utilize as many threads as possible, to achieve actual parallelism
- preserve data locality, to make single-thread execution more efficient
- minimize both memory demands and cross-thread communication, to reduce overhead

To achieve this, a balance between depth-first and breadth-first execution strategies
must be reached. Assuming that the task graph is finite, depth-first is better for
sequential execution for the following reasons:

- **Strike when the cache is hot**. The deepest tasks are the most recently created tasks and therefore are the hottest in the cache.
The deepest tasks are the most recently created tasks and therefore are the hottest in the cache.
Also, if they can complete, then tasks depending on them can continue executing, and though not the hottest in cache,
they are still warmer than the older tasks above.
they are still warmer than the older tasks above.

- **Minimize space**. Executing the shallowest task leads to breadth-first unfolding of the tree. It creates an exponential
Executing the shallowest task leads to breadth-first unfolding of the tree. It creates an exponential
- **Minimize space**. Executing the shallowest task leads to breadth-first unfolding of the graph. It creates an exponential
number of nodes that co-exist simultaneously. In contrast, depth-first execution creates the same number
of nodes, but only a linear number can exist at the same time, because it creates a stack of other ready
tasks.
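
The space argument above can be checked with a small stand-alone model. The sketch below is not oneTBB code: it assumes a hypothetical binary task tree in which every task above ``max_depth`` creates two children, and it counts only tasks that are ready to run. The names ``RunStats`` and ``unfold_binary_tree`` are made up for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical counters for this illustration only.
struct RunStats {
    std::size_t executed = 0;    // total number of tasks that ran
    std::size_t peak_ready = 0;  // largest number of ready tasks held at once
};

// Unfolds a binary task tree. Each entry in `ready` is the depth of a ready task.
// depth_first == true  : always run the most recently created task (LIFO).
// depth_first == false : always run the oldest task (FIFO, breadth-first).
RunStats unfold_binary_tree(int max_depth, bool depth_first) {
    assert(max_depth >= 0);
    std::deque<int> ready;
    ready.push_back(0);  // the root task
    RunStats stats;
    stats.peak_ready = 1;
    while (!ready.empty()) {
        int depth;
        if (depth_first) {
            depth = ready.back();
            ready.pop_back();
        } else {
            depth = ready.front();
            ready.pop_front();
        }
        ++stats.executed;
        if (depth < max_depth) {  // an inner task creates two children
            ready.push_back(depth + 1);
            ready.push_back(depth + 1);
        }
        stats.peak_ready = std::max(stats.peak_ready, ready.size());
    }
    return stats;
}
```

In this model both orders execute the same number of tasks, but for ``max_depth = 10`` the depth-first order never holds more than 11 ready tasks, while the breadth-first order holds up to 1024.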
@@ -24,13 +32,11 @@ Each thread has its own deque[8] of tasks that are ready to run. When a
thread spawns a task, it pushes it onto the bottom of its deque.

When a thread participates in the evaluation of tasks, it constantly executes
a task obtained by the first rule below that applies:
a task obtained by the first rule that applies from the roughly equivalent rule set below:

- Get the task returned by the previous one. This rule does not apply
if the task does not return anything.
- Get the task returned by the previous one, if any.

- Take a task from the bottom of its own deque. This rule does not apply
if the deque is empty.
- Take a task from the bottom of its own deque, if any.

- Steal a task from the top of another randomly chosen deque. If the
selected deque is empty, the thread tries again to execute this rule until it succeeds.
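
The three rules above can be sketched as a single selection function. This is a simplified model rather than the oneTBB implementation: tasks are plain integer ids, ``Source``, ``Picked`` and ``pick_next`` are hypothetical names, and stealing gives up after ``max_steal_attempts`` tries (an assumption made so that the sketch always terminates, whereas the text describes retrying until a steal succeeds).

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <optional>
#include <random>
#include <vector>

// Where the next task came from (names are illustrative only).
enum class Source { Bypass, OwnDeque, Stolen, None };

struct Picked {
    Source source;
    int task_id;  // -1 when no task was found
};

// In every deque, back() models the "bottom" (the owner's end) and
// front() models the "top" (the end thieves steal from).
Picked pick_next(std::optional<int> returned_by_previous,
                 std::size_t self,
                 std::vector<std::deque<int>>& deques,
                 std::mt19937& rng,
                 int max_steal_attempts = 64) {
    assert(self < deques.size());

    // Rule 1: run the task returned by the previous task, if any.
    if (returned_by_previous) {
        return {Source::Bypass, *returned_by_previous};
    }

    // Rule 2: take a task from the bottom of the thread's own deque, if any.
    std::deque<int>& own = deques[self];
    if (!own.empty()) {
        int id = own.back();
        own.pop_back();
        return {Source::OwnDeque, id};
    }

    // Rule 3: steal from the top of another, randomly chosen deque.
    if (deques.size() > 1) {
        std::uniform_int_distribution<std::size_t> pick(0, deques.size() - 2);
        for (int attempt = 0; attempt < max_steal_attempts; ++attempt) {
            std::size_t victim = pick(rng);
            if (victim >= self) ++victim;  // never pick the thread's own deque
            std::deque<int>& other = deques[victim];
            if (!other.empty()) {
                int id = other.front();
                other.pop_front();
                return {Source::Stolen, id};
            }
        }
    }
    return {Source::None, -1};
}
```

Taking from the bottom of the thread's own deque gives the most recently spawned (deepest) task, while stealing from the top of a victim's deque takes its oldest (shallowest) task, which matches the depth-first versus breadth-first balance described earlier.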
6 changes: 4 additions & 2 deletions doc/main/tbb_userguide/Task_Scheduler_Bypass.rst
@@ -5,7 +5,7 @@ Task Scheduler Bypass

Scheduler bypass is an optimization where you directly specify the next task to run.
According to the rules of execution in :doc:`How Task Scheduling Works <How_Does_Task_Scheduler_Works>`,
spawning of the new task causes the executing thread to do the following:
spawning a new task to be executed by the current thread involves these steps:

- Push a new task onto the thread's deque.
- Continue to execute the current task until it is completed.
@@ -15,4 +15,6 @@ Steps 1 and 3 introduce unnecessary deque operations or, even worse, allow steal
locality without adding significant parallelism. These problems can be avoided by returning the next task to execute
instead of spawning it. When using the method shown in :doc:`How Task Scheduling Works <How_Does_Task_Scheduler_Works>`,
the returned task becomes the next task executed by the thread. Furthermore, this approach almost guarantees that
the task will be executed by the current thread, and not by any other thread.
the task will be executed by the current thread, and not by any other thread.

Please note that at the moment the only way to use this optimization is to use the preview feature of ``oneapi::tbb::task_group``.
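
The cost that bypass avoids can be illustrated with a single-threaded model that does not use oneTBB at all. The sketch below assumes a hypothetical chain of tasks in which each task creates exactly one successor, and it assumes that the third step of spawning is the thread taking the task back from its own deque. ``ChainStats`` and ``run_chain`` are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <optional>

// Hypothetical counters for this illustration only.
struct ChainStats {
    std::size_t executed = 0;   // tasks that ran
    std::size_t deque_ops = 0;  // pushes and pops on the thread's own deque
};

// Runs a chain of `length` tasks; each task except the last has one successor.
// use_bypass == false : the successor is spawned (pushed, then popped later).
// use_bypass == true  : the successor is returned and runs next directly.
ChainStats run_chain(int length, bool use_bypass) {
    assert(length > 0);
    std::deque<int> own;  // the thread's deque; entries are remaining chain lengths
    ChainStats stats;
    std::optional<int> next = length;  // the first task is handed over directly
    while (next) {
        int remaining = *next;
        next.reset();
        ++stats.executed;  // "execute" the current task
        if (remaining > 1) {
            if (use_bypass) {
                next = remaining - 1;          // return the successor: no deque traffic
            } else {
                own.push_back(remaining - 1);  // push the spawned task
                ++stats.deque_ops;
            }
        }
        // The current task is complete; without a returned task, go back to the deque.
        if (!next && !own.empty()) {
            next = own.back();
            own.pop_back();
            ++stats.deque_ops;
        }
    }
    return stats;
}
```

For a chain of 100 tasks, the spawning variant performs 198 deque operations in this model and the bypass variant performs none. The model has only one thread, so it does not show the second problem named above: between the push and the pop, a real scheduler may let another thread steal the spawned task.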
1 change: 0 additions & 1 deletion doc/main/tbb_userguide/The_Task_Scheduler.rst
@@ -18,5 +18,4 @@ onto one of the high-level templates, use the task scheduler.
../tbb_userguide/When_Task-Based_Programming_Is_Inappropriate
../tbb_userguide/How_Does_Task_Scheduler_Works
../tbb_userguide/Task_Scheduler_Bypass
../tbb_userguide/Task_Scheduler_Summary
../tbb_userguide/Guiding_Task_Scheduler_Execution
