-
Notifications
You must be signed in to change notification settings - Fork 2.7k
Add NUMA and thread affinity support for Unix #10938
Add NUMA and thread affinity support for Unix #10938
Conversation
Fixes #1986 |
40819cf
to
2b3318a
Compare
How are we going to test this? (The NUMA part in particular.) |
src/pal/src/numa/numa.cpp
Outdated
BOOL success = TRUE; | ||
|
||
LOGEXIT("GetNumaProcessorNodeEx returns BOOL %d\n", success); | ||
PERF_EXIT(GetNumaProcessorNodeEx); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: is this supposed to be GetNumaHighestNodeNumber?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for spotting that, I have fixed it.
@dotnet-bot test tizen armel cross debug build |
@jkotas, as for the individual functions, I have tested them locally on my native Linux box that has 8 cores in two NUMA nodes. Regarding different NUMA configurations testing, some can be done on a VM. As for performance testing, I don't have a clear plan yet, ideally, we could use some real world benchmark like the techempower running on a physical machine with more processors / NUMA nodes and measure performance with the NUMA stuff enabled and disabled. |
2b3318a
to
fb6129e
Compare
This change adds new PAL functions for NUMA and thread affinity support for Unix and also enables related code in GC and VM for FEATURE_PAL. It doesn't reflect the limits imposed by CGROUPS on systems with CGROUPS enables yet.
@dotnet-bot test Windows_NT x64 Debug Build and Test please |
@dotnet-bot test Ubuntu arm Cross Release Build please |
2 similar comments
@dotnet-bot test Ubuntu arm Cross Release Build please |
@dotnet-bot test Ubuntu arm Cross Release Build please |
@dotnet-bot test Ubuntu arm Cross Release Build |
@swgillespie, @adityamandaleeka could you please review? |
src/pal/src/numa/numa.cpp
Outdated
|
||
/*++ | ||
Function: | ||
FreeLookupArrays |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AllocateLookupArrays
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sigh. Thank you for spotting that.
|
||
for (int i = 0; i < numaNodesCount; i++) | ||
{ | ||
int st = numa_node_to_cpus(i, mask); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like we're not doing anything with this return value. Assert that it's 0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same thing below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding assert, the only failure of this function could be that the mask is not large enough, but since the mask is allocated by the numa_allocate_cpumask function, that cannot happen.
|
||
if (currentGroupCpus != 0) | ||
{ | ||
g_groupToCpuCount[currentGroup] = currentGroupCpus; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
g_groupToCpuCount[currentGroup]
is set here for whatever currentGroup
is when we get here. What if we filled a group and moved on to the next one in the for loop above (near line 175)?
Do we not need to set g_groupToCpuCount
for those groups (and also the affinity mask below)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we do. It is missing. Thank you for spotting it.
else | ||
{ | ||
// The process has affinity in more than one group, in such case | ||
// the function needs to return zero in both masks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment here says we need to return 0 in both masks, but unless I'm missing something we're not setting systemMask
to 0, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. Thank you!
@adityamandaleeka I've reflected your feedback. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
This change adds new PAL functions for NUMA and thread affinity support
for Unix and also enables related code in GC and VM for FEATURE_PAL.
It doesn't reflect the limits imposed by CGROUPS on systems with
CGROUPS enables yet.