TBB DOC : Dev Guide: Task Scheduler Bypass and How Does Task Scheduler Works #521

Contributor

@anton-potapov anton-potapov commented Aug 3, 2021

The following chapters were partly restored from the classic TBB documentation:

Works

Signed-off-by: Anton Potapov <anton.potapov@intel.com>
=============================

The scheduler runs tasks in a way that tends to minimize both memory
demands and cross-thread communication. The intuition is that a balance
Contributor

What do you mean by intuition?

doc/main/tbb_userguide/How_Does_Task_Scheduler_Works.rst
doc/main/tbb_userguide/Task_Scheduler_Bypass.rst
Comment on lines 6 to 10
The scheduler runs tasks in a way that tends to minimize both memory
demands and cross-thread communication. The intuition is that a balance
must be reached between depth-first and breadth-first execution.
Assuming that the tree is finite, depth-first is best for sequential
execution for the following reasons:
Contributor

@alexey-katranov alexey-katranov Aug 11, 2021

This seems to be oriented toward tree decomposition. While the TBB scheduler and tasking approach were (almost) tree-structure oriented, the oneTBB scheduler is completely unaware of task dependencies, so we need to think about how to explain this better. Maybe we can first describe the organization of the queues and so on, and then explain how it maps onto tree decomposition, which is widely used in parallel algorithms but not in the flow graph.

Contributor Author

changed
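
For readers following the depth-first versus breadth-first point in the hunk above, here is a minimal sketch of the memory argument. It is a toy model, not oneTBB code: the function name `peak_ready_tasks` and the idea of representing a task by its depth in a full binary tree are assumptions made only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>

// Toy model (hypothetical, not a oneTBB API): a task is represented only by
// its depth in a full binary tree. Running a task above max_depth creates
// two child tasks. The function returns the largest number of ready tasks
// that were pending at the same time.
std::size_t peak_ready_tasks(int max_depth, bool depth_first) {
    assert(max_depth >= 0);
    std::deque<int> ready{0};               // start with the root task
    std::size_t peak = ready.size();
    while (!ready.empty()) {
        int depth;
        if (depth_first) {                  // run the most recently created task
            depth = ready.back();
            ready.pop_back();
        } else {                            // run the oldest task
            depth = ready.front();
            ready.pop_front();
        }
        if (depth < max_depth) {            // inner node: create two children
            ready.push_back(depth + 1);
            ready.push_back(depth + 1);
        }
        peak = std::max(peak, ready.size());
    }
    return peak;
}
```

In this model, depth-first order keeps the number of pending tasks linear in the tree depth, while breadth-first order lets it grow to the width of the last level. Breadth-first order exposes more independent tasks at once, which is the balance the quoted paragraph refers to.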

anton-potapov and others added 2 commits March 9, 2022 17:00
Co-authored-by: Alexandra <alexandra.epanchinzeva@intel.com>
Co-authored-by: Alexandra <alexandra.epanchinzeva@intel.com>
of nodes, but only a linear number can exist at the same time, because it creates a stack of other ready
tasks.

Each thread has its own deque[8] of tasks that are ready to run. When a
Contributor Author

It seems that the reference to the definition of deque is missing.

Contributor

Do you mean it is missing in our documentation or in this particular topic?

Contributor Author

[8] means that the reference was in the original documentation of "classic" TBB.

The current documentation does not have this reference either.

So I have just added a note so that we do not forget to add it.
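
Since the definition of the deque is what is missing, a small sketch may help. The quoted hunk is cut off after "When a", so the access pattern below is taken from the classic TBB description (the owner takes the newest task, a thief takes the oldest) and should be read as an assumption. `ReadyDeque` and its members are hypothetical names, and a mutex stands in for whatever synchronization the real scheduler uses.

```cpp
#include <cassert>
#include <deque>
#include <mutex>

// Hypothetical toy type, not a oneTBB class: a per-thread double-ended
// queue of ready tasks, with tasks reduced to integer ids.
class ReadyDeque {
public:
    // The owning thread adds a newly created task at the back.
    void spawn(int task_id) {
        assert(task_id >= 0);               // ids are non-negative in this toy
        std::lock_guard<std::mutex> lock(mutex_);
        tasks_.push_back(task_id);
    }

    // The owning thread takes the newest task (back of the deque).
    bool take_local(int& task_id) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (tasks_.empty()) return false;
        task_id = tasks_.back();
        tasks_.pop_back();
        return true;
    }

    // Another thread steals the oldest task (front of the deque).
    bool steal(int& task_id) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (tasks_.empty()) return false;
        task_id = tasks_.front();
        tasks_.pop_front();
        return true;
    }

private:
    std::mutex mutex_;
    std::deque<int> tasks_;
};
```

Taking from both ends is the reason a deque is needed rather than a plain stack or queue: the owner works on the most recent task, and a thief removes the task that has been waiting longest.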

 Works

- Merged "How Scheduler Works" and "Scheduler Summary"
@anton-potapov anton-potapov force-pushed the task_group_extension_doc_scheduller_bypass branch from 38957a6 to 0555ace Compare March 10, 2022 09:19
- Take a task from the thread's deque, unless it is stolen by another thread.

Steps 1 and 3 introduce unnecessary deque operations or, even worse, allow stealing that can hurt
locality without adding significant parallelism. These problems can be avoided by returning next task to execute
Contributor

Suggested change
locality without adding significant parallelism. These problems can be avoided by returning next task to execute
locality without adding significant parallelism. These problems can be avoided by returning the next task to execute

Contributor

@alexey-katranov alexey-katranov left a comment

LGTM

Steps 1 and 3 introduce unnecessary deque operations or, even worse, allow stealing that can hurt
locality without adding significant parallelism. These problems can be avoided by returning next task to execute
instead of spawning it. When using the method shown in :doc:`How Task Scheduling Works <How_Does_Task_Scheduler_Works>`,
the returned task becomes the next task executed by the thread. Furthermore, this approach almost guarantees that
Contributor

The task does not always become the next task due to Flow Graph priorities. Maybe "the returned task is considered first to be executed by the thread"?

Contributor Author

Good point, thanks for the idea.

Contributor Author

changed


While the task scheduler is not bound to any particular type of parallelism,
it was designed to work efficiently for fork-join parallelism with lots of forks
(this type of parallelism is typical for parallel algorithms such as parallel_for).
Contributor Author

Need to convert plain text parallel_for into reference

Co-authored-by: Alexandra <alexandra.epanchinzeva@intel.com>
- Take a task from the thread's deque, unless it is stolen by another thread.

Steps 1 and 3 introduce unnecessary deque operations or, even worse, allow stealing that can hurt
locality without adding significant parallelism. These problems can be avoided by returning next task to execute
Contributor Author

Need to mention "Bypass"

the returned task becomes the next task executed by the thread. Furthermore, this approach almost guarantees that
the task is executed by the current thread and not by any other thread.

Please note that at the moment the only way to use this optimization is to use the preview feature of ``oneapi::tbb::task_group``
Contributor Author

Need to add a reference to the corresponding topic.
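
To make the bypass idea in the hunk above concrete, here is a minimal single-threaded sketch. It deliberately does not use the `oneapi::tbb::task_group` preview API. `ToyWorker`, `Task`, and `deque_ops_for_chain` are hypothetical names, and the model ignores stealing and Flow Graph priorities, so it only illustrates why returning the next task saves deque operations.

```cpp
#include <cassert>
#include <deque>
#include <functional>

struct Task;  // hypothetical toy type, defined below

// Hypothetical toy worker: one thread with its own deque of ready tasks.
struct ToyWorker {
    std::deque<Task*> ready;
    int deque_ops = 0;                      // counts pushes and pops

    void spawn(Task* t) { ready.push_back(t); ++deque_ops; }
    void run_all();
};

// A task body may return the next task to run, or nullptr for "no hint".
struct Task {
    std::function<Task*(ToyWorker&)> body;
};

void ToyWorker::run_all() {
    while (!ready.empty()) {
        Task* t = ready.back();
        ready.pop_back();
        ++deque_ops;
        // Bypass: a returned task runs immediately, without touching the deque.
        while (t != nullptr) t = t->body(*this);
    }
}

// Runs a two-task chain and reports how many deque operations it needed.
int deque_ops_for_chain(bool use_bypass) {
    ToyWorker worker;
    bool second_ran = false;
    Task second{[&](ToyWorker&) -> Task* { second_ran = true; return nullptr; }};
    Task first{[&](ToyWorker& w) -> Task* {
        if (use_bypass) return &second;     // hand the next task back directly
        w.spawn(&second);                   // otherwise push it onto the deque
        return nullptr;
    }};
    worker.spawn(&first);
    worker.run_all();
    assert(second_ran);
    return worker.deque_ops;
}
```

In this toy, spawning the second task costs one extra push and one extra pop, while returning it costs none. In a real multithreaded scheduler the spawned task would also be visible to thieves between the push and the pop, which is the locality concern the quoted text raises.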

anton-potapov and others added 2 commits March 17, 2022 19:14
Co-authored-by: Alexandra <alexandra.epanchinzeva@intel.com>
Co-authored-by: Alexandra <alexandra.epanchinzeva@intel.com>
Co-authored-by: Alexandra <alexandra.epanchinzeva@intel.com>
aepanchi
aepanchi previously approved these changes Mar 18, 2022
Let's consider the mapping of fork-join parallelism on the task scheduler in more detail.

The scheduler runs tasks in a way that tries to achieve several targets simultaneously:
- utilize as more threads as possible, to achieve actual parallelism
Contributor

Suggested change
- utilize as more threads as possible, to achieve actual parallelism
- Utilize as more threads as possible, to achieve actual parallelism

@anton-potapov anton-potapov force-pushed the task_group_extension_doc_scheduller_bypass branch from 38f593b to a774291 Compare March 18, 2022 17:40
@anton-potapov anton-potapov merged commit ed9d4b5 into uxlfoundation:master Mar 21, 2022
ValentinaKats pushed a commit to ValentinaKats/oneTBB that referenced this pull request May 20, 2022
…r Works (uxlfoundation#521)

* TBB DOC : Dev Guide: Task Scheduler Bypass and How Task Scheduler
Works

Signed-off-by: Anton Potapov <anton.potapov@intel.com>
Co-authored-by: Alexandra <alexandra.epanchinzeva@intel.com>

(cherry picked from commit ed9d4b5)
timmiesmith pushed a commit that referenced this pull request Aug 9, 2022
* Update pull_request_template.md (#751)

Signed-off-by: Alexandra Epanchinzeva alexandra.epanchinzeva@intel.com
(cherry picked from commit 4eda0f9)

* Update CONTRIBUTING.md (#765)

(cherry picked from commit e274a9e)

* Documentation update for unpreview `task_handle` and related stuff (#755)

* Unpreview task_handle and related stuff

Signed-off-by: Anton Potapov <anton.potapov@intel.com>
Co-authored-by: Alexandra <alexandra.epanchinzeva@intel.com>
(cherry picked from commit fd76f45)

* Update conf.py (#774)

(cherry picked from commit 6666292)

* Actualize documentation about proportional splitting constructor in Range (#728)

Actualize documentation about proportional splitting

Signed-off-by: Fedotov, Aleksei <aleksei.fedotov@intel.com>
(cherry picked from commit e5cbe50)

* Update doc structure and add new files (#791)

(cherry picked from commit ce0d258)

* Instruction for building the docs locally  (#778)

(cherry picked from commit e386960)

* Document a way to flow graph can be attached to arbitrary task_arena (#785)

* Document a way to flow graph can be attached to arbitrary task_arena

task_arena interface provides mechanisms to guide tasks execution within the arena by
setting the preferred computation units or restricting part of computation units. In some
cases, you may want to use mechanisms within a flow graph.

Signed-off-by: Serov, Vladimir <vladimir.serov@intel.com>
Co-authored-by: Aleksei Fedotov <aleksei.fedotov@intel.com>
Co-authored-by: Alexandra Epanchinzeva <alexandra.epanchinzeva@intel.com>
(cherry picked from commit a938322)

* Add topic about "Lazy Initiliazation" pattern to Design patterns (#790)

New topic about Lazy initialization pattern and how it can be implemented using oneapi::tbb::collaborative_call_once has been added.

Signed-off-by: Ilya Isaev <ilya.isaev@intel.com>
(cherry picked from commit 1da8f0d)

* Update Get Started Guide (#803)

(cherry picked from commit 0502372)

* Update intro_gsg.rst (#808)

(cherry picked from commit 2c4f282)

* Update conf.py (#810)

(cherry picked from commit 0a0a592)

* TBB DOC : Dev Guide: Task Scheduler Bypass and How Does Task Scheduler Works (#521)

* TBB DOC : Dev Guide: Task Scheduler Bypass and How Task Scheduler
Works

Signed-off-by: Anton Potapov <anton.potapov@intel.com>
Co-authored-by: Alexandra <alexandra.epanchinzeva@intel.com>

(cherry picked from commit ed9d4b5)

* Update intro_gsg.rst (#811)

(cherry picked from commit efea993)

* Update conf.py (#812)

(cherry picked from commit 3859d11)

* Update examples.rst (#816)

(cherry picked from commit 4aa0b0b)

* Update layout.html (#815)

(cherry picked from commit 3e352b4)

* Update RELEASE_NOTES.md for oneTBB 2021.6 (#835)

(cherry picked from commit faaf43c)

Co-authored-by: Alexandra <alexandra.epanchinzeva@intel.com>
Co-authored-by: Anton Potapov <potapov.slash.co@gmail.com>
Co-authored-by: Aleksei Fedotov <aleksei.fedotov@intel.com>
Co-authored-by: Vladimir Serov <vladimir.serov@intel.com>
Co-authored-by: Ilya Isaev <ilya.isaev@intel.com>
Co-authored-by: Anton Potapov <anton.potapov@intel.com>
4 participants