BUG: handle segfault with hierarchical kernel when team size is larger than available threads #185
Comments
More specifically, if the number of threads set is smaller than the size of a single team, it seems that a segfault is likely.
Which also reminds me--we may need to use
I think Christian mentioned that even in the core C++ this should produce a proper runtime error, not a segfault.
Hmm, so far the debugger has not helped me much; it mostly drowns me in messages.
So, the debugger shows that Kokkos core errors out with
The control flow gets complex when the library we load calls abort and that propagates through the different layers of Python. To me this does not look transparent, and I do not know of anything we can do once abort has been called...
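One way to at least observe the abort from Python, rather than lose the whole interpreter, is to run the kernel in a child process and inspect its exit status. Below is a minimal sketch using only the standard library. It assumes a POSIX system, where `subprocess` reports death by signal as a negative return code. The helper name `run_isolated` is hypothetical and not part of PyKokkos.

```python
import signal
import subprocess
import sys


def run_isolated(argv, timeout=60):
    """Run argv in a child process and report how it ended.

    Returns (returncode, signal_name, stderr). signal_name is None unless
    the child was killed by a signal (negative returncode on POSIX).
    """
    proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    rc = proc.returncode
    sig_name = None
    if rc < 0:
        try:
            sig_name = signal.Signals(-rc).name
        except ValueError:
            # Signal number without a symbolic name on this platform.
            sig_name = f"signal {-rc}"
    return rc, sig_name, proc.stderr


if __name__ == "__main__":
    # Stand-in for a script that launches the crashing kernel:
    # os.abort() raises SIGABRT in the child, the parent survives.
    rc, sig_name, _ = run_isolated([sys.executable, "-c", "import os; os.abort()"])
    print(rc, sig_name)
```

This does not make the abort recoverable inside the process that called it. It only lets a test harness report which signal ended the child, instead of crashing itself.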
As noted here: #146 (comment)
If you write a PyKokkos kernel that uses a team barrier synchronization and probably other related hierarchical parallelism features, it seems that you can get a hard segfault if you have
OMP_NUM_THREADS=1
in your environment.
While Kokkos core probably has a case for not behaving so well here, since the code is already compiled, if we have ahead-of-compile-time knowledge of the number of threads that will be available, I wonder if we should do something more useful than segfaulting by default.
I checked that deleting the barrier syncs isn't sufficient to make the segfault go away, so something broader about the hierarchical kernel is likely to blame.
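As a rough illustration of "something more useful than segfaulting", here is a sketch of a pre-launch check that compares a requested team size against `OMP_NUM_THREADS` and raises a Python exception instead. The function names `available_omp_threads` and `check_team_size` are hypothetical and not part of the PyKokkos API. The sketch also assumes the environment variable is the only source of truth: it does nothing if the variable is unset, and it does not query Kokkos for the actual concurrency.

```python
import os


def available_omp_threads(env=None):
    """Return the thread count requested via OMP_NUM_THREADS.

    Returns None if the variable is unset, empty, or not a positive integer.
    """
    env = os.environ if env is None else env
    raw = env.get("OMP_NUM_THREADS", "").strip()
    if not raw:
        return None
    # OMP_NUM_THREADS may be a comma-separated list for nested parallelism;
    # the first entry applies to the outermost parallel region.
    first = raw.split(",")[0].strip()
    try:
        n = int(first)
    except ValueError:
        return None
    return n if n > 0 else None


def check_team_size(team_size, env=None):
    """Raise ValueError if team_size exceeds the OMP_NUM_THREADS setting."""
    threads = available_omp_threads(env)
    if threads is not None and team_size > threads:
        raise ValueError(
            f"team size {team_size} exceeds OMP_NUM_THREADS={threads}; "
            "reduce the team size or raise the thread count"
        )


if __name__ == "__main__":
    try:
        check_team_size(2, {"OMP_NUM_THREADS": "1"})
    except ValueError as exc:
        print(exc)
```

A check like this would have to run before the kernel is launched. Whether it belongs at translation time or at launch time is an open design question that the sketch does not settle.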
Copy of the crashing workunit below the fold, in case it gets mutated a lot in the matching PR: