Fix CPU binding on HCC Windows systems #1575
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Description
Current implementation of thread pinning with HWLOC usage on Windows is not scalable and already does not work on HCC systems like SPR with multiple NUMA nodes. It doesn't work because
hwloc_set_cpubind
uses SetThreadGroupAffinity on Windows and this API doesn't allow setting affinity across multiple processor groups.This patch fixes binding by shrinking the mask to fit a particular processor group based on slot number in arena.
Fixes # - issue number(s) if exists
Type of change
Choose one or multiple, leave empty if none of the other choices apply
Add a respective label(s) to PR if you have permissions
Tests
Documentation
Breaks backward compatibility
Notify the following users
List users with
@
to send notificationsOther information