Allow multiple OpenBLAS OpenMP calls to run in parallel #138
Checklist
- Reset the build number to `0` (if the version changed)
- Re-rendered with the latest `conda-smithy` (Use the phrase `@conda-forge-admin, please rerender` in a comment in this PR for automated rerendering)

This PR adds the required build flag to allow OpenMP-capable builds of OpenBLAS to run multiple OpenMP calls at the same time, see:
https://github.com/xianyi/OpenBLAS/blob/28a24a4d4fe3d9cd838c37504dee2493bc10a5e5/Makefile.rule#L93-L98
We would like this functionality in https://github.com/nv-legate/cunumeric, to allow us to instantiate a separate OpenMP group per NUMA domain.
`NUM_PARALLEL=32` is the setting we are currently using in cuNumeric (we are building OpenBLAS from source, but would much prefer to be using the conda package), so I just copied that. If you think that is too high, we would be happy with 8, or even 2 if need be. AFAICT the main effect of increasing this number is increased internal memory usage (unclear how much), see https://github.com/xianyi/OpenBLAS/blob/28a24a4d4fe3d9cd838c37504dee2493bc10a5e5/driver/others/blas_server_omp.c#L72-L77, but there is no change in behavior when at most one OpenMP call is active at any time.
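To make the last point concrete, here is a minimal sketch of a fixed-size slot pool, assuming the library keeps one set of working buffers per concurrent call and sizes that pool with `NUM_PARALLEL`. This is an illustrative model, not OpenBLAS's actual code: `slot_in_use`, `claim_slot`, and `release_slot` are hypothetical names, and the real bookkeeping in `blas_server_omp.c` may differ.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Pool size; 32 is the value proposed in this PR. */
#define NUM_PARALLEL 32

/* Hypothetical bookkeeping: one flag per buffer set.
 * Static storage is zero-initialized, so every slot starts free. */
static atomic_bool slot_in_use[NUM_PARALLEL];

/* Claim the lowest-numbered free slot.
 * Returns the slot index, or -1 if all NUM_PARALLEL slots are busy. */
static int claim_slot(void)
{
    for (int i = 0; i < NUM_PARALLEL; i++) {
        bool expected = false;
        /* Atomically flip false -> true; succeeds for exactly one caller. */
        if (atomic_compare_exchange_strong(&slot_in_use[i], &expected, true))
            return i;
    }
    return -1;
}

/* Return a slot to the pool once the call that claimed it has finished. */
static void release_slot(int i)
{
    assert(i >= 0 && i < NUM_PARALLEL);
    atomic_store(&slot_in_use[i], false);
}
```

In this model, calls that never overlap always claim and release slot 0, so a larger pool only costs the extra (unused) bookkeeping. Slots beyond the first are touched only when two or more calls are in flight at once, which is the case cuNumeric needs for one OpenMP group per NUMA domain.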
I don't know much about the build number versioning above; please let me know how this should be updated.