TBB DOC : Dev Guide: Task Scheduler Bypass and How Does Task Scheduler Works #521
Merged: anton-potapov merged 11 commits into uxlfoundation:master from anton-potapov:task_group_extension_doc_scheduller_bypass on Mar 21, 2022
Changes from 10 commits
6f96880 TBB DOC : Dev Guide: Task Scheduler Bypass and How Does Task Scheduler (anton-potapov)
9365674 Fix grammar (anton-potapov)
cd53d5a Fix grammar (anton-potapov)
0555ace TBB DOC : Dev Guide: Task Scheduler Bypass and How Does Task Scheduler (anton-potapov)
3454546 Wording fixes (anton-potapov)
4a68f02 Wording fixes (anton-potapov)
e32db60 Wording fixes (anton-potapov)
2ac7302 Wording fixes (anton-potapov)
38c3af5 Wording fixes (anton-potapov)
8f9bbe6 Wordings and style fixes (anton-potapov)
a774291 Minor wording and ref fixes (anton-potapov)
@@ -0,0 +1,50 @@
.. _How_Task_Scheduler_Works.rst:

How Task Scheduler Works
========================
While the task scheduler is not bound to any particular type of parallelism,
it was designed to work efficiently for fork-join parallelism with lots of forks.
This type of parallelism is typical for parallel algorithms such as `oneapi::tbb::parallel_for
<https://spec.oneapi.io/versions/latest/elements/oneTBB/source/algorithms/functions/parallel_for_func.html>`_.

Let's consider the mapping of fork-join parallelism onto the task scheduler in more detail.

The scheduler runs tasks in a way that tries to achieve several targets simultaneously:

- Utilize as many threads as possible to achieve actual parallelism
- Preserve data locality to make single-thread execution more efficient
- Minimize both memory demands and cross-thread communication to reduce overhead

To achieve this, a balance between depth-first and breadth-first execution strategies
must be reached. Assuming that the task graph is finite, depth-first is better for
sequential execution because:

- **Strike when the cache is hot**. The deepest tasks are the most recently created tasks and therefore are the hottest in the cache.
  Also, if they can be completed, the tasks that depend on them can continue executing, and though not the hottest in the cache,
  they are still warmer than the older tasks deeper in the deque.

- **Minimize space**. Execution of the shallowest task leads to the breadth-first unfolding of a graph. It creates an exponential
  number of nodes that co-exist simultaneously. In contrast, depth-first execution creates the same number
  of nodes, but only a linear number can exist at the same time, since it creates a stack of other ready
  tasks.
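
The space argument can be checked with a small model. The sketch below is not oneTBB code. It assumes a complete binary task tree in which every non-leaf task creates two children, and it uses a hypothetical helper, ``peak_ready_tasks``, that unfolds the tree either youngest-first (depth-first) or oldest-first (breadth-first) and reports the largest number of ready tasks that existed at once.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical helper (not part of oneTBB): unfolds a complete binary task
// tree of the given depth and returns the peak number of coexisting ready tasks.
std::size_t peak_ready_tasks(int tree_depth, bool depth_first) {
    assert(tree_depth >= 0);
    std::deque<int> ready;            // each entry is the level of a ready task
    ready.push_back(0);               // the root task
    std::size_t peak = ready.size();
    while (!ready.empty()) {
        int level;
        if (depth_first) {            // run the youngest task first
            level = ready.back();
            ready.pop_back();
        } else {                      // run the oldest task first
            level = ready.front();
            ready.pop_front();
        }
        if (level < tree_depth) {     // a non-leaf task creates two children
            ready.push_back(level + 1);
            ready.push_back(level + 1);
        }
        peak = std::max(peak, ready.size());
    }
    return peak;
}
```

Under these assumptions, a tree of depth 10 keeps at most 11 tasks ready at a time in depth-first order, while breadth-first order keeps up to 1024.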

Each thread has its own deque of tasks that are ready to run. When a
thread spawns a task, it pushes it onto the bottom of its deque.

When a thread participates in the evaluation of tasks, it repeatedly executes
a task obtained by the first rule that applies from the following roughly equivalent ruleset:

1. Get the task returned by the previously executed task, if any.

2. Take a task from the bottom of its deque, if any.

3. Steal a task from the top of another randomly chosen deque. If the
   selected deque is empty, the thread tries again to execute this rule until it succeeds.

Rule 1 is described in :doc:`Task Scheduler Bypass <Task_Scheduler_Bypass>`.
The overall effect of rule 2 is to execute the *youngest* task spawned by the thread,
which causes depth-first execution until the thread runs out of work.
Then rule 3 applies. It steals the *oldest* task spawned by another thread,
which causes temporary breadth-first execution that converts potential parallelism
into actual parallelism.
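
The order in which the three rules are tried can be sketched as a single-threaded model. This is an illustration only, not the oneTBB implementation: ``Worker`` and ``next_task`` are hypothetical names, tasks are plain integers, and a single fixed victim replaces the randomly chosen one.

```cpp
#include <cassert>
#include <deque>
#include <optional>

// Hypothetical model of one worker thread; tasks are plain integers.
struct Worker {
    std::deque<int> ready;  // front = top (oldest), back = bottom (youngest)

    // Spawning pushes the task onto the bottom of the worker's own deque.
    void spawn(int task) { ready.push_back(task); }
};

// Picks the next task for `self` by trying the three rules in order.
// Returns std::nullopt when no rule applies; the ruleset described above
// would instead keep retrying rule 3 with another victim.
std::optional<int> next_task(std::optional<int> bypassed, Worker& self, Worker& victim) {
    assert(&self != &victim);          // a worker does not steal from itself
    if (bypassed) {                    // rule 1: task returned by the previous task
        return bypassed;
    }
    if (!self.ready.empty()) {         // rule 2: youngest task from own bottom
        int task = self.ready.back();
        self.ready.pop_back();
        return task;
    }
    if (!victim.ready.empty()) {       // rule 3: oldest task from the victim's top
        int task = victim.ready.front();
        victim.ready.pop_front();
        return task;
    }
    return std::nullopt;
}
```

In this model, a worker that spawned tasks 1, 2 and 3 runs task 3 next, while an idle worker stealing from it receives task 1.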
@@ -0,0 +1,20 @@
.. _Task_Scheduler_Bypass:

Task Scheduler Bypass
=====================
Scheduler bypass is an optimization where you directly specify the next task to run.
According to the rules of execution described in :doc:`How Task Scheduler Works <How_Task_Scheduler_Works>`,
spawning a new task to be executed by the current thread involves the following steps:

1. Push the new task onto the thread's deque.
2. Continue to execute the current task until it is completed.
3. Take a task from the thread's deque, unless it is stolen by another thread.

Steps 1 and 3 introduce unnecessary deque operations or, even worse, allow stealing that can hurt
locality without adding significant parallelism. These problems can be avoided by using the "Task Scheduler Bypass" technique to directly specify the preferred task to be executed next
instead of spawning it. As described in :doc:`How Task Scheduler Works <How_Task_Scheduler_Works>`,
the returned task becomes the first candidate for the next task to be executed by the thread. Furthermore, this approach almost guarantees that
the task is executed by the current thread and not by any other thread.

Please note that at the moment the only way to use this optimization is to use `preview feature of ``oneapi::tbb::task_group``
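
The saving in deque operations can be illustrated without the oneTBB API. The following is a minimal single-threaded model, not the preview ``task_group`` interface; ``Counters`` and ``run_chain`` are hypothetical names. It runs a chain of tasks in which each task creates exactly one successor, either by spawning it (a push followed by a pop) or by returning it as the next task to run.

```cpp
#include <cassert>
#include <deque>
#include <optional>

// Hypothetical bookkeeping for the model below.
struct Counters {
    int deque_ops = 0;  // pushes plus pops on the worker's deque
    int executed = 0;   // number of tasks that ran
};

// Runs a chain of `length` tasks; each task stands for "remaining chain length".
// With use_bypass == false the successor is spawned onto the deque;
// with use_bypass == true it is handed over directly as the next task.
Counters run_chain(int length, bool use_bypass) {
    assert(length >= 0);
    Counters counters;
    std::deque<int> ready;
    std::optional<int> next;
    if (length > 0) next = length;

    while (true) {
        if (!next) {                          // no bypassed task: consult the deque
            if (ready.empty()) break;
            next = ready.back();
            ready.pop_back();
            ++counters.deque_ops;
        }
        int remaining = *next;
        next.reset();
        ++counters.executed;                  // "execute" the task
        if (remaining > 1) {
            if (use_bypass) {
                next = remaining - 1;         // bypass: return the successor
            } else {
                ready.push_back(remaining - 1);  // spawn: push onto the deque
                ++counters.deque_ops;
            }
        }
    }
    return counters;
}
```

For a chain of five tasks, the spawning variant of this model performs eight deque operations (four pushes and four pops), while the bypass variant performs none. Since the model has a single thread, it does not show the second benefit described above, that a bypassed task cannot be stolen.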