Use JobObject to restore process binding on Windows in the x86 backend? #471

Open
bgoglin opened this issue Jun 3, 2021 · 0 comments
bgoglin commented Jun 3, 2021

Issue #158 (process binding not being properly restored) is significantly mitigated in 2.5, since there are now topology flags to avoid binding changes or to restrict them to the current process binding.

For the record, the issue is that moving a thread to a different processor group during x86 discovery causes the process to be attached to multiple groups (at least the new one and the previous one, even if no thread is left in the old one). This can be observed with GetProcessGroupAffinity():

USHORT n = 2, m[2] = { 0, 0 };
/* n is in/out: capacity of m on input, number of groups on output */
if (GetProcessGroupAffinity(GetCurrentProcess(), &n, m))
    printf("I am in %u groups %u %u\n", n, m[0], m[1]);

Once the process spans multiple groups, its binding cannot be set anymore because SetProcessAffinityMask() requires a single process group. One way to bring the process back to a single group is with a job object:

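/* anonymous job with default security attributes */
HANDLE job = CreateJobObject(NULL, NULL);
GROUP_AFFINITY aff[1];
memset(&aff, 0, sizeof(aff));
/* example target: logical processors 8-15 of group 1 */
aff[0].Mask = 0x00ff00;
aff[0].Group = 1;
/* restrict the job to that single group affinity */
if (SetInformationJobObject(job, JobObjectGroupInformationEx, &aff, sizeof(aff)))
    printf("SetInformationJobObject OK\n");
else
    printf("SetInformationJobObject Failed\n");
/* attaching the current process applies the job affinity to it */
if (AssignProcessToJobObject(job, GetCurrentProcess()))
    printf("Assign OK\n");
else
    printf("Assign Failed\n");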

Important things to note/discuss:

  • If we apply the above code twice with different groups (for instance when loading two hwloc topologies), only the first one applies: the second job is nested under the first one in the job hierarchy. If the two affinities are compatible, the smallest one is used; if they are not (different groups), only the first one is applied.
  • We cannot remove the process from the job afterwards, which might restrict future binding changes in the application or when loading another topology.
  • Job affinity can be changed after attaching a process, and the change affects the process affinity immediately (even if the new affinity is larger than the previous one).
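
To make the nesting rule from the first bullet concrete, here is a toy model in portable C. It does not call any Windows API: `struct job_affinity` and `nested_job_affinity()` are hypothetical names, and reading "the smallest one is used" as the intersection of the two masks is my assumption. Treating disjoint masks in the same group as incompatible is also an assumption, since that case was not tested.

```c
#include <stdint.h>

/* Hypothetical model of a single-group job affinity. */
struct job_affinity {
    unsigned short group; /* processor group index */
    uint64_t mask;        /* processors allowed inside that group */
};

/*
 * Effective affinity when a job with affinity `second` is nested under a
 * job with affinity `first`, following the observations listed above:
 *  - same group and overlapping masks: the smaller set wins
 *    (modelled here as the intersection, which is an assumption);
 *  - otherwise: only the first job's affinity applies.
 */
struct job_affinity nested_job_affinity(struct job_affinity first,
                                        struct job_affinity second)
{
    struct job_affinity effective = first;

    if (second.group == first.group && (second.mask & first.mask) != 0)
        effective.mask = first.mask & second.mask;
    return effective;
}
```

With `first = {1, 0xff00}`, a second job on `{1, 0x0f00}` narrows the binding to `0x0f00` in group 1, while a second job on `{0, 0x00ff}` leaves the process on `{1, 0xff00}`.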

It may mean that we'd have to remember the hwloc "job" for the entire process lifetime and reuse it later. But things might get messy when the process creates a child process, since the child will end up in the same job (without knowing it?).
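
A minimal sketch of what "remember the job and reuse it" could look like, assuming a single process-wide job whose affinity is retargeted on each call instead of creating a new job. `hwloc_job`, `hwloc_job_attached` and `hwloc_job_rebind()` are hypothetical names, not existing hwloc symbols. The sketch is not thread-safe and does nothing about child processes inheriting the job. The non-Windows stub exists only so the file compiles elsewhere.

```c
#ifdef _WIN32
#ifndef _WIN32_WINNT
#define _WIN32_WINNT 0x0602 /* JobObjectGroupInformationEx needs Windows 8+ */
#endif
#include <windows.h>
#include <string.h>

/* Hypothetical process-wide state, kept until the process exits. */
static HANDLE hwloc_job = NULL;
static int hwloc_job_attached = 0;

/* Bind the process to `mask` inside `group` through the shared job.
 * Returns 0 on success, -1 on failure. */
int hwloc_job_rebind(unsigned short group, unsigned long long mask)
{
    GROUP_AFFINITY aff;

    if (!hwloc_job) {
        hwloc_job = CreateJobObject(NULL, NULL);
        if (!hwloc_job)
            return -1;
    }

    memset(&aff, 0, sizeof(aff));
    aff.Group = group;
    aff.Mask = (KAFFINITY)mask;
    /* Changing the affinity of an existing job also moves the
     * processes that are already attached to it. */
    if (!SetInformationJobObject(hwloc_job, JobObjectGroupInformationEx,
                                 &aff, sizeof(aff)))
        return -1;

    /* Attach only once: the process cannot leave the job later. */
    if (!hwloc_job_attached) {
        if (!AssignProcessToJobObject(hwloc_job, GetCurrentProcess()))
            return -1;
        hwloc_job_attached = 1;
    }
    return 0;
}
#else
/* Stub for non-Windows builds: job objects do not exist there. */
int hwloc_job_rebind(unsigned short group, unsigned long long mask)
{
    (void)group;
    (void)mask;
    return -1;
}
#endif
```

Calling `hwloc_job_rebind(1, 0x00ff00)` after discovery, and again with another group or mask when a second topology is loaded, would avoid stacking a second job in the hierarchy.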
