Add RFC for creation and use of NUMA-constrained arenas #1559

aleksei-fedotov · 2024-11-12T16:52:31Z

Description

Add sub-RFC to #1535 for creation and use of NUMA-constrained arenas.

rfcs/proposed/simplified_numa_support/numa-arenas-creation-and-use.org

vossmjp · 2024-11-13T15:38:50Z

rfcs/proposed/simplified_numa_support/numa-arenas-creation-and-use.org

+The example above requires new class named ~tbb::constrained_task_arena~. On one
+hand, it is a ~tbb::task_arena~ class that isolates the work execution from
+other parallel stuff executed by oneTBB. On the other hand, it is a constrained
+arena that represents an arena associated to a certain NUMA node and allows


Will there be additional functions added to create arenas constrained to core type too? Or will this be exclusively for NUMA?

I guess we need to decide whether this is needed right away or can be added later in future RFCs. Initially, I wanted to write an RFC that addresses the specific concern about the verbose and error-prone API for creating NUMA-constrained arenas.

akukanov · 2024-11-13T19:19:13Z

rfcs/proposed/simplified_numa_support/numa-arenas-creation-and-use.org

+  in the previous bullet, but since it is a synchronization point, usually the
+  blocking call is used.
+
+The proposal below addresses these issues.


I'd prefer it not to address all these issues :)

Specifically, I believe that [4] and [5] are really orthogonal to NUMA, and need to be resolved for task_arena in general rather than only for its derived class. And even for this subclass I am not sure that "hiding" a task_group inside is the right thing to do.

Keeping task_group and task_arena separate and merely simplifying their combined use with some "syntax sugar" allows creating independent "work queues" in the same arena. We could also consider whether a similar approach with a flow graph instead of task_group might be useful. Of course, all of that can be implemented with the hidden task_group, but likely at the expense of some overhead.

And it seems that if task_group is kept separate and e.g. task_arena::wait(task_group&) is added, the need for a derived class diminishes if not disappears.

The example code would be in between the "verbose" and "concise" variants shown in the document, like this:

    std::vector<tbb::task_arena> numa_arenas =
        tbb::initialize_constrained_arenas(/*maybe some arguments*/);
    std::vector<tbb::task_group> task_groups(numa_arenas.size());

    for (unsigned j = 0; j < numa_arenas.size(); j++) {
        numa_arenas[j].enqueue([](){/*some parallel stuff*/}, task_groups[j]);
    }

    for (unsigned j = 0; j < numa_arenas.size(); j++) {
        numa_arenas[j].wait(task_groups[j]);
    }

or, with modern C++ (C++23 is required for std::views::zip), like this:

    std::vector<tbb::task_arena> numa_arenas =
        tbb::initialize_constrained_arenas(/*maybe some arguments*/);
    std::vector<tbb::task_group> task_groups(numa_arenas.size());

    // auto&& is needed: zip yields a temporary tuple of references per element
    for (auto&& [arena, tg] : std::views::zip(numa_arenas, task_groups)) {
        arena.enqueue([](){/*some parallel stuff*/}, tg);
    }

    for (auto&& [arena, tg] : std::views::zip(numa_arenas, task_groups)) {
        arena.wait(tg);
    }

I agree that coupling of separate entities is rather a bad thing. Here we would like to improve usability of current interfaces without sacrificing their flexibility. Then the proposal boils down to:

Introduce an interface that simplifies the creation of arenas, each bound to its own NUMA node.

Introduce an interface that helps avoid common mistakes when submitting work to such arenas.

No objections to these two goals.

But I believe the second goal can be achieved without introducing a new class, by extending the existing methods of task_arena to better integrate with task_groups. That would potentially be useful beyond NUMA scenarios and would give users more control with a rather small increase in code complexity.

At the very least, it's an alternative to mention.

Updated: also regarding this:

Here we would like to improve usability of current interfaces without sacrificing their flexibility.

In fact, the proposal introduces a new arena interface with reduced flexibility :)
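
The alternative raised in this thread, keeping task_group and task_arena separate and adding only thin helpers for their combined use, could look roughly like the sketch below. The names `run_in`, `wait_in`, `mock_arena` and `mock_group` are hypothetical and are not part of oneTBB or of the RFC. The helpers assume only an arena type with a blocking `execute(F)` and a group type with `run(F)` and `wait()`. The mocks are single-threaded stand-ins so the sketch builds without oneTBB.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical helper: submit `f` to `group` from inside `arena`.
// Assumes Arena::execute blocks until the functor returns, so capturing
// `f` by reference is safe here.
template <typename Arena, typename Group, typename F>
void run_in(Arena& arena, Group& group, F f) {
    arena.execute([&group, &f] { group.run(std::move(f)); });
}

// Hypothetical helper: wait for `group` from inside `arena`.
template <typename Arena, typename Group>
void wait_in(Arena& arena, Group& group) {
    arena.execute([&group] { group.wait(); });
}

// Single-threaded stand-in for an arena: runs the functor immediately.
struct mock_arena {
    int entered = 0;
    template <typename F>
    void execute(F&& f) {
        ++entered;
        std::forward<F>(f)();
    }
};

// Single-threaded stand-in for a task group: stores work until wait().
struct mock_group {
    std::vector<std::function<void()>> pending;
    template <typename F>
    void run(F f) { pending.emplace_back(std::move(f)); }
    void wait() {
        for (auto& task : pending) task();
        pending.clear();
    }
};

// Submits two tasks and waits; encodes the result as tasks * 10 + arena entries.
int demo_run_and_wait() {
    mock_arena arena;
    mock_group group;
    int done = 0;
    run_in(arena, group, [&done] { ++done; });
    run_in(arena, group, [&done] { ++done; });
    wait_in(arena, group);
    return done * 10 + arena.entered;
}
```

Because the group stays a separate object, several groups can share one arena, which is the flexibility argued for above.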

akukanov · 2024-11-27T17:45:08Z

rfcs/proposed/numa_support/numa-arenas-creation-and-use.org

+- [2] - Separate step for instantiation the same number of ~tbb::task_group~
+  objects, in which the actual work is going to be submitted. Note that user
+  also needs to make sure the size of ~arenas~ matches the size of
+  ~task_groups~.


Here the second sentence sounds like a rephrasing of the first one, without new information or argumentation. I mean, I see no difference between "the same number of tbb::task_group objects" and "the size of arenas matches the size of task_groups".

I actually don't mind the repetition. It's easy to read over the "same number" without recognizing there's a potential error if the sizes of these two vectors don't match. So, in my opinion, the repetition highlights the potential danger here.

Fair enough, then I would rephrase as:

Suggested change

-- [2] - Separate step for instantiation the same number of ~tbb::task_group~
-  objects, in which the actual work is going to be submitted. Note that user
-  also needs to make sure the size of ~arenas~ matches the size of
-  ~task_groups~.
+- [2] - The necessity to instantiate the same number of ~tbb::task_group~
+  objects for the actual work to be submitted; that is, the size of ~task_groups~
+  must match the size of ~arenas~.
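
The size-mismatch hazard discussed in this thread can be made explicit with a small lockstep helper. This is a sketch only: `for_each_pair` is a hypothetical name that belongs to neither oneTBB nor the RFC, and plain `int` vectors stand in for the arena and task-group containers.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical helper: visit two containers pairwise and refuse to run
// when their sizes differ, instead of silently reading out of range.
template <typename A, typename B, typename F>
void for_each_pair(A& first, B& second, F f) {
    if (first.size() != second.size()) {
        throw std::invalid_argument("containers must have the same size");
    }
    auto it = second.begin();
    for (auto& a : first) {
        f(a, *it);
        ++it;
    }
}

// Returns true when a size mismatch is detected and reported.
bool mismatch_is_rejected() {
    std::vector<int> arenas(2);
    std::vector<int> groups(3);
    try {
        for_each_pair(arenas, groups, [](int&, int&) {});
    } catch (const std::invalid_argument&) {
        return true;
    }
    return false;
}

// Pairs element i of one container with element i of the other.
int paired_sum() {
    std::vector<int> arenas{1, 2, 3};
    std::vector<int> groups{10, 20, 30};
    int sum = 0;
    for_each_pair(arenas, groups, [&sum](int& a, int& g) { sum += a * g; });
    return sum;
}
```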

akukanov · 2024-11-27T18:01:42Z

rfcs/proposed/numa_support/numa-arenas-creation-and-use.org

+  nodes. Note that user needs to make sure the indices of ~tbb::task_arena~
+  objects match corresponding indices of NUMA nodes.


The point about the need to match the indices is kind of strange. A single loop that works with several arrays/vectors is a typical pattern, you just use the loop index consistently. Moreover, with modern C++ you can rewrite the loop to not have any indices at all, e.g.

    std::vector<tbb::numa_node_id> numa_indexes = tbb::info::numa_nodes();
    std::vector<tbb::task_arena> arenas;      // note that the size is not set
    std::vector<tbb::task_group> task_groups; // same for task groups

    for (auto idx: numa_indexes) {
        arenas.emplace_back( tbb::task_arena::constraints(idx) );
        task_groups.emplace_back();
        arenas.back().execute([&tg = task_groups.back()]{
            tg.run([]{/*some parallel stuff*/});
        });
    }

If you meant something else, perhaps try explaining it better.

I think the point here might not be that there's a safe way to do it and that a well-written piece of code will do it the safe way. I think the point is that a not-so-well-written piece of code might mess it up. But this specific example (with everything in a loop body) doesn't provide much room for mismatched indices, so it doesn't seem at all likely in this case. In the more general case of task_arenas with a matching number of task_groups, it could be possible to mismatch them.

The sentence says "the user needs to make sure ...", and my objection is - no, users do not necessarily need to. And we seem to agree that this specific pattern "doesn't provide much room for mismatched indices".

My point is that we should not paint the usage worse than it really is.

vossmjp

Some optional suggestions. Otherwise, looks good to me as a starting proposal.

vossmjp · 2024-12-06T19:35:49Z

rfcs/proposed/numa_support/numa-arenas-creation-and-use.org

+- [2] - Separate step for instantiation the same number of ~tbb::task_group~
+  objects, in which the actual work is going to be submitted. Note that user
+  also needs to make sure the size of ~arenas~ matches the size of
+  ~task_groups~.


I actually don't mind the repetition. It's easy to read over the "same number" without recognizing there's a potential error if the sizes of these two vectors don't match. So, in my opinion, the repetition highlights the potential danger here.

vossmjp · 2024-12-06T19:38:32Z

rfcs/proposed/numa_support/numa-arenas-creation-and-use.org

+  nodes. Note that user needs to make sure the indices of ~tbb::task_arena~
+  objects match corresponding indices of NUMA nodes.


I think the point here might not be that there's a safe way to do it and that a well-written piece of code will do it the safe way. I think the point is that a not-so-well-written piece of code might mess it up. But this specific example (with everything in a loop body) doesn't provide much room for mismatched indices, so it doesn't seem at all likely in this case. In the more general case of task_arenas with a matching number of task_groups, it could be possible to mismatch them.

vossmjp · 2024-12-06T19:49:18Z

rfcs/proposed/numa_support/numa-arenas-creation-and-use.org

+  but also the loop counter ~j~ can be mistakenly captured by reference, which
+  at least results in submission of the work into incorrect ~tbb::task_group~,
+  and at most a segmentation fault, since the loop counter might not exist by
+  the time the functor starts its execution.


Since we're highlighting possible points of failure, I suppose we don't want to bring up the safer enqueue deferred tasks pattern, right?

Well, I would really want to bring it up as the existing way to mitigate the issue.
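
The capture hazard described in the quoted hunk can be reproduced without oneTBB. In the sketch below, `deferred_queue` and `submit_by_value` are hypothetical names, and a vector of `std::function` stands in for work that starts only after the submitting loop has finished. The functor copies the loop counter `j`, so each deferred call still addresses its own slot.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Stand-in for per-node work queues; hypothetical, not a oneTBB type.
using deferred_queue = std::vector<std::function<void()>>;

// Queues one functor per slot and runs them only after the loop has ended,
// mimicking work that starts after the submitting loop has moved on.
std::vector<int> submit_by_value(std::size_t node_count) {
    std::vector<int> hits(node_count, 0);
    deferred_queue queue;
    for (std::size_t j = 0; j < node_count; ++j) {
        // `j` is copied into the closure. Capturing it as `&j` instead would
        // leave a dangling reference once the loop ends (undefined behavior).
        // `hits` outlives the queue, so capturing it by reference is safe.
        queue.emplace_back([j, &hits] { ++hits[j]; });
    }
    for (auto& work : queue) {
        work();
    }
    return hits;
}
```

With three slots, every slot is incremented exactly once, because each closure carries its own copy of the index.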

aleksei-fedotov requested review from vossmjp, akukanov and pavelkumbrasev November 12, 2024 16:52

github-actions bot added the documentation label Nov 12, 2024

vossmjp reviewed Nov 13, 2024


akukanov reviewed Nov 13, 2024


Add RFC for creation and use of NUMA arenas

3a2f55b

aleksei-fedotov force-pushed the dev/aleksei-fedotov/numa-constrained-arenas-rfc branch from c25e7b1 to 3a2f55b Compare November 14, 2024 09:45

Address Mike's remarks

a96d1b4

akukanov reviewed Nov 27, 2024


vossmjp approved these changes Dec 6, 2024
