hwloc/base: fix opal proc locality wrt to NUMA nodes on hwloc 2.0 #7201
Conversation
Both opal_hwloc_base_get_relative_locality() and _get_locality_string() iterate over hwloc levels to build the proc locality information. Unfortunately, NUMA nodes are no longer in those normal levels since hwloc 2.0; we have to explicitly look at the special NUMA level to get that locality info. I am factorizing the core of the iterations into dedicated "_by_depth" functions and calling them again for the NUMA level at the end of the loops.

Thanks to Hatem Elshazly for reporting the NUMA communicator split failure at https://www.mail-archive.com/users@lists.open-mpi.org/msg33589.html

It looks like only the opal_hwloc_base_get_locality_string() part is needed to fix that split, but there's no reason not to fix get_relative_locality() as well.

Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
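The underlying hwloc 2.0 change can be illustrated with a short standalone sketch (not the actual Open MPI patch): a loop over the normal depths 0..topodepth-1 never sees NUMA nodes anymore, because they hang off a special virtual level whose depth is the negative constant HWLOC_TYPE_DEPTH_NUMANODE, so the NUMA level has to be queried explicitly.

```c
/* Illustrative sketch, assuming hwloc >= 2.0: NUMA nodes are on a special
 * virtual level, outside the normal depth range, so any code that only
 * iterates depths 0..topodepth-1 misses them. */
#include <hwloc.h>
#include <stdio.h>

int main(void)
{
    hwloc_topology_t topo;
    hwloc_topology_init(&topo);
    hwloc_topology_load(&topo);

    /* Normal CPU-side levels: Machine, Package, caches, Core, PU, ... */
    int topodepth = hwloc_topology_get_depth(topo);
    for (int depth = 0; depth < topodepth; depth++)
        printf("depth %d: %u x %s\n", depth,
               hwloc_get_nbobjs_by_depth(topo, depth),
               hwloc_obj_type_string(hwloc_get_depth_type(topo, depth)));

    /* NUMA nodes are NOT in the loop above on hwloc 2.x: their depth is
     * the special negative value HWLOC_TYPE_DEPTH_NUMANODE, which must be
     * handled separately (as this PR does with the "_by_depth" helpers). */
    int numadepth = hwloc_get_type_depth(topo, HWLOC_OBJ_NUMANODE);
    printf("NUMA level: depth %d, %u node(s)\n", numadepth,
           hwloc_get_nbobjs_by_depth(topo, numadepth));

    hwloc_topology_destroy(&topo);
    return 0;
}
```

This is why the pre-fix code silently produced no NUMA locality information: the iteration simply never reached the NUMA objects.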
@bgoglin should this be backported to 4.0.x?
Sorry, didn't read your commit message.
@bgoglin when I try to test this PR on an x86_64 dual-core Haswell system, it doesn't seem to be working:
@hppritcha Strange, I just tested again. Git master (last nightly snapshot) fails as expected; the patch from this PR works fine, using the internal hwloc. This is on a dual-socket 12-core Haswell with Cluster-on-Die (2 sockets, 2 NUMA nodes each, 6 cores each).
@bgoglin what is your mpirun command line, specifically the hostfile details? I'm trying on ppc64le as well.
There's nothing interesting in my command line; I am running on a single node with:

By the way, I added more tests to the test program to confirm that only the NUMA level was broken:
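The failure being tested here can be exercised with a minimal reproducer along these lines (my sketch, not the test program from the report), splitting MPI_COMM_WORLD with Open MPI's OMPI_COMM_TYPE_NUMA split type:

```c
/* Sketch of a reproducer for the NUMA communicator split failure.
 * OMPI_COMM_TYPE_NUMA is an Open MPI-specific split type extension. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int world_rank, numa_rank, numa_size;
    MPI_Comm numa_comm;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    /* Before this fix, on hwloc 2.0 the NUMA part of the locality string
     * was missing, so the split did not group ranks by NUMA node. */
    MPI_Comm_split_type(MPI_COMM_WORLD, OMPI_COMM_TYPE_NUMA, 0,
                        MPI_INFO_NULL, &numa_comm);
    MPI_Comm_rank(numa_comm, &numa_rank);
    MPI_Comm_size(numa_comm, &numa_size);

    printf("world rank %d -> numa-local rank %d of %d\n",
           world_rank, numa_rank, numa_size);

    MPI_Comm_free(&numa_comm);
    MPI_Finalize();
    return 0;
}
```

With the fix, the size of each numa_comm should match the number of ranks bound to the same NUMA node.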
I also tried with 4 of those haswell nodes in a single job with 96 ranks, works fine. |
Okay, not sure what was going on, but it now works for me, at least on Arm TX2 nodes.
not being defined. related to open-mpi#7201 Signed-off-by: Howard Pritchard <howardp@lanl.gov>
PR open-mpi#7201 broke use of the hwloc 1.x series. This patch gets hwloc 1.x working again with OMPI. Fixes open-mpi#7362 Signed-off-by: Howard Pritchard <howardp@lanl.gov>
This should be backported to 4.0.x and 3.1.x (I am not sure how you guys handle this)