Cannot bind to several memories #601
Comments
Hello. Fasten your seat belt, this is a bit complicated. There are two main ways to bind memory on Linux, MPOL_BIND and MPOL_PREFERRED (there's also MPOL_INTERLEAVE, but it doesn't matter here). numactl uses BIND by default (if the nodes you give are full, allocation fails). hwloc uses PREFERRED by default (if the nodes you give are full, allocation falls back to other nodes). You may pass the STRICT flag (or --strict on the command line) to switch hwloc to BIND instead of PREFERRED.

Strictly speaking, the hwloc default isn't wrong: it allocates memory inside the mask you've given. The capacity is indeed more limited than expected, but there is a fallback if that capacity is exceeded. The reason PREFERRED shows a single node is that the old implementation in Linux basically ignores all nodes but the first one in the mask you give. There's a new implementation called MPOL_PREFERRED_MANY in kernel 5.15 which would likely fix your report, but I guess it's not available in your Red Hat kernel.

If you try "numactl -p 1,3" instead of "numactl --membind 1,3", this tells numactl to use PREFERRED instead of BIND, and I guess it will fail because you're giving multiple nodes and the kernel doesn't support it.
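For readers who want to try the difference described above, here is a minimal sketch of the two hwloc-bind invocations. It assumes the `numa:1` and `numa:3` locations from the report and uses `hwloc-bind --get --membind` as a stand-in command to display the resulting binding. The variable names are only illustrative, and nothing here was run on the reporter's machine.

```shell
#!/bin/sh
# Sketch only: nodes 1 and 3 are taken from the report; adjust to your topology.
NODES="numa:1 numa:3"

# Default hwloc behavior: maps to Linux MPOL_PREFERRED, so allocations
# fall back to other nodes when the requested ones are full.
PREFERRED_ARGS="--membind $NODES"

# With --strict, hwloc uses MPOL_BIND instead, which is what
# "numactl --membind 1,3" does by default.
STRICT_ARGS="--strict --membind $NODES"

echo "default: hwloc-bind $PREFERRED_ARGS -- <command>"
echo "strict:  hwloc-bind $STRICT_ARGS -- <command>"

# Only attempt the strict binding when hwloc is actually installed.
if command -v hwloc-bind >/dev/null 2>&1; then
    # The inner hwloc-bind just reports the memory binding it inherited.
    hwloc-bind $STRICT_ARGS -- hwloc-bind --get --membind \
        || echo "hwloc-bind reported a failure on this machine"
fi
```

In the reporter's setup, the strict arguments would presumably take the place of `$BINDING_ARGS` in the `hwloc-bind $BINDING_ARGS lstopo-no-graphics ...` command from the report.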
Hello, thanks for the details. The Those binding modes (bind, preferred, interleave) are detailed in some documentation I have read recently (https://www.intel.com/content/www/us/en/content-details/769060/intel-xeon-cpu-max-series-configuration-and-tuning-guide.html?DocID=769060 page 27, section 6.2.1). This document mentions 4 modes for binding to memories using numactl:
I was expecting |
Because hwloc is not Linux specific :/ We try to keep the API portable (and simple). Other operating systems expose different policies, and finding some sort of common denominator was very difficult.
I see, thanks again for your time. I definitely agree the doc would greatly benefit from such additions 👍 My last 2 cents: if you are going to expose a Linux-specific flag (e.g.,
By the way, if there are some places in the doc that you already found unclear, please let me know. Usually these kinds of clarifications go in the hwloc-bind manpage and in the introduction of the "Memory binding" section in hwloc.h. I may add something in the doxygen text too ("CPU and Memory Binding Overview").
Refs #601. Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
- subdivide in sections
- add an introduction
- talk about portability and policies
- more cross-references

Refs #601
Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
I pushed several updates to the doc in master, v2.x and v2.9; hopefully that will help avoid the confusion between policies.
What version of hwloc are you using?
hwloc & lstopo (version 2.9.2)
LSTOPOARGS="--merge --no-legend --no-io --ignore pci --ignore net --of svg"
hwloc-bind $BINDING_ARGS lstopo-no-graphics $LSTOPOARGS --pid 0
Which operating system and hardware are you running on?
Linux 4.18.0-372.26.1.el8_6.x86_64 #1 SMP Sat Aug 27 02:44:20 EDT 2022 x86_64 x86_64 x86_64 GNU/Linux
Intel Xeon 8358 (Ice Lake). Topology:
Details of the problem
I am trying to bind processes so they use a dedicated set of memory banks (in this case, 1 and 3).
My wish would be to use something like this:
However, such commands result in only the first memory location being used (e.g., numa:1 only).
For instance :
Produces this output:
Whereas I was expecting all the odd NUMA memories to be used.
Just to make sure this is not a system limitation, I ran the same lstopo with numactl binding:
which gives me:
Here we can see that numactl was able to bind memory to both nodes, as I expected.
Did I miss some argument in the hwloc-bind call?

Best.