TBB DOC : Dev Guide: Task Scheduler Bypass and How Does Task Scheduler Works - Merged "How Scheduler Works" and "Scheduler Summary"
uxlfoundation · Mar 10, 2022 · 38957a6
1 parent cd53d5a
commit 38957a6
Showing 3 changed files with 24 additions and 17 deletions.
diff --git a/doc/main/tbb_userguide/How_Does_Task_Scheduler_Works.rst b/doc/main/tbb_userguide/How_Does_Task_Scheduler_Works.rst
@@ -3,19 +3,27 @@
 How Does Task Scheduler Works
 =============================
 
-The scheduler runs tasks in a way that tends to minimize both memory 
-demands and cross-thread communication. The intuition is that a balance 
-must be reached between depth-first and breadth-first executions. 
-Assuming that the tree is finite, depth-first is better for sequential 
-execution for the following reasons:
+
+While the task scheduler is not bound to any particular type of parallelism,
+it was designed to work efficiently for fork-join parallelism with lots of forks
+(this type of parallelism is typical of parallel algorithms such as ``parallel_for``).
+
+Let's consider the mapping of fork-join parallelism onto the task scheduler in more detail.
+
+The scheduler runs tasks in a way that tries to achieve several goals simultaneously:
+ - utilize as many threads as possible, to achieve actual parallelism
+ - preserve data locality, to make single-thread execution more efficient
+ - minimize both memory demands and cross-thread communication, to reduce overhead
+
+To achieve this, a balance between depth-first and breadth-first execution strategies
+must be reached. Assuming that the task graph is finite, depth-first is better for 
+sequential execution for the following reasons:
 
 - **Strike when the cache is hot**. The deepest tasks are the most recently created tasks and therefore are the hottest in the cache.
-  The deepest tasks are the most recently created tasks and therefore are the hottest in the cache. 
   Also, if they can complete, then tasks depending on it can continue executing, and though not the hottest in cache, 
-they are still warmer than the older tasks above.
+  they are still warmer than the older tasks above.
 
-- **Minimize space**. Executing the shallowest task leads to breadth-first unfolding of the tree. It creates an exponential
-  Executing the shallowest task leads to breadth-first unfolding of the tree. It creates an exponential
+- **Minimize space**. Executing the shallowest task leads to breadth-first unfolding of the graph. It creates an exponential
   number of nodes that co-exist simultaneously. In contrast, depth-first execution creates the same number 
   of nodes, but only a linear number can exist at the same time, because it creates a stack of other ready 
   tasks.
@@ -24,13 +32,11 @@ Each thread has its own deque[8] of tasks that are ready to run. When a
 thread spawns a task, it pushes it onto the bottom of its deque.
 
 When a thread participates in the evaluation of tasks, it constantly executes 
-a task obtained by the first rule below that applies:
+a task obtained by the first rule that applies from the roughly equivalent rule set below:
 
-- Get the task returned by the previous one. This rule does not apply 
-  if the task does not return anything.
+- Get the task returned by the previous one, if any.
 
-- Take a task from the bottom of its own deque. This rule does not apply 
-  if the deque is empty.
+- Take a task from the bottom of its own deque, if any.
 
 - Steal a task from the top of another randomly chosen deque. If the 
   selected deque is empty, the thread tries again to execute this rule until it succeeds.
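
Reviewer note: the rule set in this hunk can be sketched as a toy model. The block below is a minimal C++17 sketch under stated assumptions, not the oneTBB implementation. `Worker`, `Source`, `Pick` and `next_task` are hypothetical names introduced only for illustration, and stealing scans the other deques in a fixed order instead of picking a random victim and retrying, so that the result is reproducible.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Toy model only: every name here is hypothetical and none of it is oneTBB API.
struct Worker {
    // front() is the "top" (oldest task), back() is the "bottom" (newest task).
    std::deque<std::string> ready;

    // Spawning pushes the task onto the bottom of the worker's own deque.
    void spawn(std::string task) { ready.push_back(std::move(task)); }
};

enum class Source { Returned, OwnDeque, Stolen, Nothing };

struct Pick {
    Source source;
    std::string task;
};

// Selects the next task for workers[self] by the first rule that applies.
// `returned` models the task handed back by the previously executed task.
Pick next_task(std::vector<Worker>& workers, std::size_t self,
               std::optional<std::string> returned = std::nullopt) {
    assert(self < workers.size());

    // Rule 1: run the task returned by the previous one, if any.
    if (returned) return {Source::Returned, std::move(*returned)};

    // Rule 2: take the newest task from the bottom of the own deque, if any.
    Worker& me = workers[self];
    if (!me.ready.empty()) {
        std::string t = std::move(me.ready.back());
        me.ready.pop_back();
        return {Source::OwnDeque, std::move(t)};
    }

    // Rule 3: steal the oldest task from the top of another deque.
    // Assumption: one deterministic pass over the other workers replaces
    // the random victim selection and retry loop described in the text.
    for (std::size_t i = 1; i < workers.size(); ++i) {
        Worker& victim = workers[(self + i) % workers.size()];
        if (!victim.ready.empty()) {
            std::string t = std::move(victim.ready.front());
            victim.ready.pop_front();
            return {Source::Stolen, std::move(t)};
        }
    }
    return {Source::Nothing, {}};
}
```

In this model, after worker 0 spawns `a`, `b` and `c`, its own next pick is `c` (bottom, newest), while an idle worker 1 steals `a` (top, oldest). That asymmetry is what keeps the owner depth-first and the thief breadth-first.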

diff --git a/doc/main/tbb_userguide/Task_Scheduler_Bypass.rst b/doc/main/tbb_userguide/Task_Scheduler_Bypass.rst
@@ -5,7 +5,7 @@ Task Scheduler Bypass
 
 Scheduler bypass is an optimization where you directly specify the next task to run. 
 According to the rules of execution in  :doc:`How Task Scheduling Works <How_Does_Task_Scheduler_Works>`, 
-spawning of the new task causes the executing thread to do the following:
+spawning a new task to be executed by the current thread involves these steps:
 
  -  Push a new task onto the thread's deque.
  -  Continue to execute the current task until it is completed.
@@ -15,4 +15,6 @@ Steps 1 and 3 introduce unnecessary deque operations or, even worse, allow steal
 locality without adding significant parallelism. These problems can be avoided by returning next task to execute 
 instead of spawning it. When using the method shown in :doc:`How Task Scheduling Works <How_Does_Task_Scheduler_Works>`,
 the returned task becomes the next task executed by the thread. Furthermore, this approach almost guarantees that 
-the task will be executed by the current thread, and not by any other thread.
+the task will be executed by the current thread, and not by any other thread.
+
+Please note that at the moment the only way to use this optimization is to use the preview feature of ``oneapi::tbb::task_group``.
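
Reviewer note: the cost that bypass avoids can be illustrated by counting deque operations. The block below is a minimal C++17 sketch under stated assumptions, not oneTBB code and not the `task_group` preview API. `Stats` and `run_chain` are hypothetical names, tasks are plain integers, and a single thread runs a chain in which each task creates exactly one successor.

```cpp
#include <cassert>
#include <deque>
#include <optional>

// Toy model only: hypothetical names, no oneTBB API involved.
struct Stats {
    int pushes = 0;    // tasks pushed onto the thread's own deque
    int pops = 0;      // tasks taken back from the bottom of that deque
    int executed = 0;  // tasks actually run
};

// Runs a chain of `length` tasks on one thread. Task i creates task i + 1,
// either by spawning it (use_bypass == false) or by returning it as the
// next task to run (use_bypass == true).
Stats run_chain(int length, bool use_bypass) {
    assert(length > 0);
    Stats stats;
    std::deque<int> ready;        // the thread's own deque, back() is the bottom
    std::optional<int> next = 0;  // assumption: task 0 is handed over directly

    while (next || !ready.empty()) {
        int current;
        if (next) {
            // The returned task runs immediately, with no deque traffic.
            current = *next;
            next.reset();
        } else {
            // Otherwise the thread takes a task from the bottom of its deque.
            current = ready.back();
            ready.pop_back();
            ++stats.pops;
        }
        ++stats.executed;

        const int successor = current + 1;
        if (successor < length) {
            if (use_bypass) {
                next = successor;            // bypass: return the next task
            } else {
                ready.push_back(successor);  // spawn: push, then pop it later
                ++stats.pushes;
            }
        }
    }
    return stats;
}
```

Under these assumptions, `run_chain(4, false)` executes four tasks with three pushes and three pops, while `run_chain(4, true)` executes the same four tasks with no deque operations. In the spawning variant, each pushed task also sits in the deque where another thread could steal it.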
diff --git a/doc/main/tbb_userguide/The_Task_Scheduler.rst b/doc/main/tbb_userguide/The_Task_Scheduler.rst
@@ -18,5 +18,4 @@ onto one of the high-level templates, use the task scheduler.
    ../tbb_userguide/When_Task-Based_Programming_Is_Inappropriate
    ../tbb_userguide/How_Does_Task_Scheduler_Works
    ../tbb_userguide/Task_Scheduler_Bypass
-   ../tbb_userguide/Task_Scheduler_Summary
    ../tbb_userguide/Guiding_Task_Scheduler_Execution