Hi OpenBLAS developers,
I recently came across a test failure in Python's scipy package, scipy/scipy#6422, which occurs when scipy is compiled against an OpenBLAS built with the configuration:
"FC=gfortran USE_OPENMP=0 USE_THREAD=1 MAJOR_VERSION=3 NO_LAPACK=0 BUILD_LAPACK_DEPRECATED=1"
The scipy test runs several matrix operations in 20 Python threads (within a single process, under Python's Global Interpreter Lock). It prints the error message
'BLAS : Program is Terminated. Because you tried to allocate too many memory regions'
multiple times and then segfaults.
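For illustration, here is a minimal sketch of the kind of workload involved. It is not the actual scipy test: the function names, matrix sizes, and the particular operations are my own assumptions, and it uses numpy rather than scipy to keep it short. It only shows 20 Python threads in one process each calling into BLAS/LAPACK.

```python
import threading

import numpy as np

N_THREADS = 20  # thread count taken from the description above
SIZE = 50       # hypothetical matrix size, chosen only to keep the sketch fast


def worker(index, results):
    """Run a couple of BLAS/LAPACK-backed operations in one thread."""
    rng = np.random.default_rng(index)
    a = rng.standard_normal((SIZE, SIZE))
    b = rng.standard_normal((SIZE, SIZE))
    product = a @ b  # matrix product, dispatched to BLAS gemm
    # Shift the diagonal so the system is comfortably non-singular.
    solution = np.linalg.solve(a + SIZE * np.eye(SIZE), b)
    results[index] = (product.shape, solution.shape)


def run():
    """Start all threads, wait for them, and return the collected shapes."""
    results = {}
    threads = [
        threading.Thread(target=worker, args=(i, results))
        for i in range(N_THREADS)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(len(run()), "threads finished")
```

Whether this particular sketch triggers the error will depend on the OpenBLAS build and the machine; the failing case I actually observed is the scipy test referenced above.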
Having read the OpenBLAS FAQ, I recompiled OpenBLAS with the additional configuration option 'NUM_THREADS=64', which seems to fix the problem. However, this looks to me like an OpenBLAS bug, based on a comment from @jeromerobert in issue #889, who said this about 'NUM_THREADS':
... It should be automatically detected by the build system. The only reason to manually set NUM_THREADS is to build OpenBLAS for an other machine which have more physical cores than the current machine.
Since I am compiling OpenBLAS myself on the same machine that runs scipy, this suggests to me that OpenBLAS is not automatically detecting the right value of NUM_THREADS for my CPU during the build. However, I find
...
CORE=SANDYBRIDGE
LIBCORE=sandybridge
NUM_CORES=8
...
written in Makefile.conf, which suggests that OpenBLAS does detect the correct number of logical cores on my CPU. So I don't understand why the 'NUM_THREADS=64' configuration is necessary. Could you please confirm whether this is an OpenBLAS bug or a "feature"?
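In case it helps anyone check the same thing, the comparison I did by eye can be sketched as below. The helper name `read_makefile_conf` is my own, and the sample text is just the excerpt quoted above, not a full Makefile.conf. It assumes the file consists of simple `KEY=value` lines.

```python
import os


def read_makefile_conf(text):
    """Parse simple KEY=value lines, skipping blanks, comments and '...'."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


# Only the lines quoted in this report; the real file contains more entries.
SAMPLE = """\
CORE=SANDYBRIDGE
LIBCORE=sandybridge
NUM_CORES=8
"""

if __name__ == "__main__":
    conf = read_makefile_conf(SAMPLE)
    print("NUM_CORES in Makefile.conf:", conf.get("NUM_CORES"))
    print("logical cores seen by Python:", os.cpu_count())
```

To run it against a real build, replace `SAMPLE` with the contents of the generated Makefile.conf.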
Thanks.