TBB DOC : Dev Guide: Task Scheduler Bypass and How Does Task Scheduler Works #521
Works Signed-off-by: Anton Potapov <anton.potapov@intel.com>
=============================

The scheduler runs tasks in a way that tends to minimize both memory
demands and cross-thread communication. The intuition is that a balance
What do you mean by intuition?
The scheduler runs tasks in a way that tends to minimize both memory
demands and cross-thread communication. The intuition is that a balance
must be reached between depth-first and breadth-first execution.
Assuming that the tree is finite, depth-first is best for sequential
execution for the following reasons:
It seems to be tree-decomposition oriented. While the TBB scheduler and tasking approach were (almost) tree-structure oriented, the oneTBB scheduler is completely unaware of task dependencies, and we need to think about how to explain that better. Maybe we can first describe the organization of the queues and so on, and then explain how it maps onto tree decomposition, which is widely used in parallel algorithms but not in the flow graph.
changed
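To make the depth-first versus breadth-first trade-off quoted above concrete, here is a minimal sketch in standard C++. It is a toy model, not oneTBB code: a "task" is only the depth of a node in a complete binary tree, and the hypothetical function `peak_ready_tasks` counts how many ready tasks are held at once under each order.

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>

// Toy model (an assumption for illustration, not the oneTBB scheduler):
// a task is the depth of a node in a complete binary tree. Executing a
// node above the leaf level creates two child tasks.
// Returns the peak number of ready tasks held at any one time.
std::size_t peak_ready_tasks(int tree_depth, bool depth_first) {
    std::deque<int> ready;
    ready.push_back(0);  // root task
    std::size_t peak = ready.size();
    while (!ready.empty()) {
        int level;
        if (depth_first) {
            level = ready.back();   // newest task first (stack order)
            ready.pop_back();
        } else {
            level = ready.front();  // oldest task first (queue order)
            ready.pop_front();
        }
        if (level < tree_depth) {
            ready.push_back(level + 1);
            ready.push_back(level + 1);
        }
        peak = std::max(peak, ready.size());
    }
    return peak;
}
```

For a tree of depth 10, the depth-first order peaks at 11 ready tasks (linear in the depth), while the breadth-first order peaks at 1024 (all leaves at once). Breadth-first exposes more tasks that could run in parallel, which is the balance the quoted paragraph refers to.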
Co-authored-by: Alexandra <alexandra.epanchinzeva@intel.com>
of nodes, but only a linear number can exist at the same time, because it creates a stack of other ready
tasks.

Each thread has its own deque[8] of tasks that are ready to run. When a
It seems that the reference to the definition of deque is missing.
Do you mean it is missing in our documentation or in this particular topic?
[8] means that the reference was in the original documentation of "classic" TBB.
The current documentation does not have this reference either,
so I have just put a note so that we do not forget to add it.
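As a rough illustration of the per-thread deque discussed in this thread, the sketch below models one worker's deque in standard C++. It assumes the behavior described in the classic TBB documentation (the owner takes its most recently spawned task, a thief steals the oldest one); the quoted fragment is cut off before that point, so treat this as an assumption. The class `ReadyDeque` and its methods are hypothetical names, and the model is single-threaded, with none of the synchronization a real scheduler needs.

```cpp
#include <deque>
#include <string>
#include <utility>

// Toy, single-threaded model of one worker's ready-task deque.
// Hypothetical names; not the oneTBB implementation.
class ReadyDeque {
public:
    // The owner spawns a task by pushing it at the back.
    void spawn(std::string task) { tasks_.push_back(std::move(task)); }

    // The owner takes its newest task (back of the deque).
    // Returns false if the deque is empty.
    bool take_local(std::string& out) {
        if (tasks_.empty()) return false;
        out = std::move(tasks_.back());
        tasks_.pop_back();
        return true;
    }

    // A thief takes the oldest task (front of the deque).
    // Returns false if the deque is empty.
    bool steal(std::string& out) {
        if (tasks_.empty()) return false;
        out = std::move(tasks_.front());
        tasks_.pop_front();
        return true;
    }

private:
    std::deque<std::string> tasks_;
};
```

After spawning "A", "B", and "C" in that order, `take_local` yields "C" while `steal` yields "A": the owner works depth-first on recent, likely cache-hot tasks, and thieves take older tasks that tend to represent larger pieces of work.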
Works - Merged "How Scheduler Works" and "Scheduler Summary"
38957a6 to 0555ace
- Take a task from the thread's deque, unless it is stolen by another thread.

Steps 1 and 3 introduce unnecessary deque operations or, even worse, allow stealing that can hurt
locality without adding significant parallelism. These problems can be avoided by returning next task to execute
Suggested change:
locality without adding significant parallelism. These problems can be avoided by returning next task to execute
locality without adding significant parallelism. These problems can be avoided by returning the next task to execute
LGTM
Steps 1 and 3 introduce unnecessary deque operations or, even worse, allow stealing that can hurt
locality without adding significant parallelism. These problems can be avoided by returning next task to execute
instead of spawning it. When using the method shown in :doc:`How Task Scheduling Works <How_Does_Task_Scheduler_Works>`,
the returned task becomes the next task executed by the thread. Furthermore, this approach almost guarantees that
The task does not always become the next task, because of Flow Graph priorities. Maybe "the returned task
is considered first to be executed by the thread"?
Good point, thanks for the idea.
changed
While the task scheduler is not bound to any particular type of parallelism,
it was designed to work efficiently for fork-join parallelism with lots of forks
(this type of parallelism is typical of parallel algorithms such as parallel_for).
Need to convert the plain-text parallel_for into a reference.
Co-authored-by: Alexandra <alexandra.epanchinzeva@intel.com>
- Take a task from the thread's deque, unless it is stolen by another thread.

Steps 1 and 3 introduce unnecessary deque operations or, even worse, allow stealing that can hurt
locality without adding significant parallelism. These problems can be avoided by returning next task to execute
Need to mention "Bypass"
the returned task becomes the next task executed by the thread. Furthermore, this approach almost guarantees that
the task is executed by the current thread and not by any other thread.

Please note that at the moment the only way to use this optimization is to use the preview feature of ``oneapi::tbb::task_group``
Need to add a reference to the corresponding topic.
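The saving that bypass is meant to provide can be sketched with a toy worker loop in standard C++. This is not the ``oneapi::tbb::task_group`` API: `run_chain` and `Stats` are hypothetical names, a task is just the number of successors still to run, and the toy always runs a returned task next, whereas (as noted in the review above) the real scheduler only considers it first.

```cpp
#include <cstddef>
#include <deque>

// Counters collected by the toy worker loop (hypothetical names).
struct Stats {
    std::size_t executed = 0;   // tasks run
    std::size_t deque_ops = 0;  // pushes plus pops on the worker's deque
};

// Runs a chain of n tasks (n >= 1) in which each task creates its successor.
// With bypass == false the successor is spawned (pushed, later popped).
// With bypass == true the successor is returned and run directly.
Stats run_chain(int n, bool bypass) {
    Stats s;
    std::deque<int> ready;
    ready.push_back(n - 1);  // first task; value = successors remaining
    ++s.deque_ops;
    int next = -1;           // task handed over by bypass, or -1 for none
    while (next >= 0 || !ready.empty()) {
        int task;
        if (next >= 0) {
            task = next;     // bypassed task: no deque traffic
            next = -1;
        } else {
            task = ready.back();
            ready.pop_back();
            ++s.deque_ops;
        }
        ++s.executed;
        if (task > 0) {
            if (bypass) {
                next = task - 1;             // return the successor
            } else {
                ready.push_back(task - 1);   // spawn the successor
                ++s.deque_ops;
            }
        }
    }
    return s;
}
```

For a chain of 1000 tasks, spawning costs 2000 deque operations in this model, while bypass costs 2 (the initial push and pop). A bypassed task also never sits in the deque, so in a multi-threaded setting it could not be stolen, which matches the locality argument in the quoted text.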
Co-authored-by: Alexandra <alexandra.epanchinzeva@intel.com>
Let's consider the mapping of fork-join parallelism onto the task scheduler in more detail.

The scheduler runs tasks in a way that tries to achieve several targets simultaneously:
- utilize as many threads as possible, to achieve actual parallelism
Suggested change:
- utilize as many threads as possible, to achieve actual parallelism
- Utilize as many threads as possible, to achieve actual parallelism
38f593b to a774291
…r Works (uxlfoundation#521) * TBB DOC : Dev Guide: Task Scheduler Bypass and How Task Scheduler Works Signed-off-by: Anton Potapov <anton.potapov@intel.com> Co-authored-by: Alexandra <alexandra.epanchinzeva@intel.com> (cherry picked from commit ed9d4b5)
* Update pull_request_template.md (#751)
  Signed-off-by: Alexandra Epanchinzeva alexandra.epanchinzeva@intel.com
  (cherry picked from commit 4eda0f9)
* Update CONTRIBUTING.md (#765)
  (cherry picked from commit e274a9e)
* Documentation update for unpreview `task_handle` and related stuff (#755)
  * Unpreview task_handle and related stuff
  Signed-off-by: Anton Potapov <anton.potapov@intel.com>
  Co-authored-by: Alexandra <alexandra.epanchinzeva@intel.com>
  (cherry picked from commit fd76f45)
* Update conf.py (#774)
  (cherry picked from commit 6666292)
* Actualize documentation about proportional splitting constructor in Range (#728)
  Actualize documentation about proportional splitting
  Signed-off-by: Fedotov, Aleksei <aleksei.fedotov@intel.com>
  (cherry picked from commit e5cbe50)
* Update doc structure and add new files (#791)
  (cherry picked from commit ce0d258)
* Instruction for building the docs locally (#778)
  (cherry picked from commit e386960)
* Document a way to flow graph can be attached to arbitrary task_arena (#785)
  * Document a way to flow graph can be attached to arbitrary task_arena
  task_arena interface provides mechanisms to guide tasks execution within the arena by setting the preferred computation units or restricting part of computation units. In some cases, you may want to use mechanisms within a flow graph.
  Signed-off-by: Serov, Vladimir <vladimir.serov@intel.com>
  Co-authored-by: Aleksei Fedotov <aleksei.fedotov@intel.com>
  Co-authored-by: Alexandra Epanchinzeva <alexandra.epanchinzeva@intel.com>
  (cherry picked from commit a938322)
* Add topic about "Lazy Initiliazation" pattern to Design patterns (#790)
  New topic about Lazy initialization pattern and how it can be implemented using oneapi::tbb::collaborative_call_once has been added.
  Signed-off-by: Ilya Isaev <ilya.isaev@intel.com>
  (cherry picked from commit 1da8f0d)
* Update Get Started Guide (#803)
  (cherry picked from commit 0502372)
* Update intro_gsg.rst (#808)
  (cherry picked from commit 2c4f282)
* Update conf.py (#810)
  (cherry picked from commit 0a0a592)
* TBB DOC : Dev Guide: Task Scheduler Bypass and How Does Task Scheduler Works (#521)
  * TBB DOC : Dev Guide: Task Scheduler Bypass and How Task Scheduler Works
  Signed-off-by: Anton Potapov <anton.potapov@intel.com>
  Co-authored-by: Alexandra <alexandra.epanchinzeva@intel.com>
  (cherry picked from commit ed9d4b5)
* Update intro_gsg.rst (#811)
  (cherry picked from commit efea993)
* Update conf.py (#812)
  (cherry picked from commit 3859d11)
* Update examples.rst (#816)
  (cherry picked from commit 4aa0b0b)
* Update layout.html (#815)
  (cherry picked from commit 3e352b4)
* Update RELEASE_NOTES.md for oneTBB 2021.6 (#835)
  (cherry picked from commit faaf43c)

Co-authored-by: Alexandra <alexandra.epanchinzeva@intel.com>
Co-authored-by: Anton Potapov <potapov.slash.co@gmail.com>
Co-authored-by: Aleksei Fedotov <aleksei.fedotov@intel.com>
Co-authored-by: Vladimir Serov <vladimir.serov@intel.com>
Co-authored-by: Ilya Isaev <ilya.isaev@intel.com>
Co-authored-by: Anton Potapov <anton.potapov@intel.com>
The following chapters were partly restored from the classic TBB documentation: