'BLAS : Program is Terminated...' Even When Not Cross-Compiled #938
Comments
As I read #889, jeromerobert (and theoractice later in the thread) was suggesting that running such a large number of threads is not expected to bring any performance benefit, hence OpenBLAS' automatic setup is for a much smaller limit.
If python creates 20 threads, that exceeds the number of cpu cores you have (four).
@brada4 Are you sure that the pthread build of openblas is unsafe to use with python threads (which are NOT the same as multiprocesses)? According to the comments in scipy's site.cfg.example file, the pthread build of openblas should be safe to use with python threading but not python multiprocessing. Also according to the comments in scipy's site.cfg.example file, openblas does not work with GNU openmp, as of gcc-4.9. Has this been fixed in recent gcc versions?
Hmm. That comment in site.cfg.example was added to numpy's version of the file two years ago, apparently in response to their ticket numpy/numpy#654, which makes reference to a mailing list discussion from three years ago. (And the discussion was not much of a discussion, more like "i have this problem" - "yeah i know, just run with one thread only".) Edit: it could be that the numpy comment is related to #85. The same user took part there and in the aforementioned discussion, though the timeframe is not quite right.
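For reference, the "run with one thread only" advice is usually applied through the `OPENBLAS_NUM_THREADS` environment variable. The sketch below is only an illustration: the helper name `limit_openblas_threads` is hypothetical, and it assumes the variable is set before numpy or scipy is first imported in the process. It caps OpenBLAS's own worker threads at run time; it does not change the compile-time `NUM_THREADS` limit, so it may or may not avoid the error reported in this issue.

```python
import os


def limit_openblas_threads(n=1):
    """Ask OpenBLAS to use at most n worker threads (hypothetical helper).

    Assumption: this runs before numpy/scipy is imported, because the
    variable is read when the OpenBLAS library is loaded.
    """
    os.environ["OPENBLAS_NUM_THREADS"] = str(n)
    return os.environ["OPENBLAS_NUM_THREADS"]


limit_openblas_threads(1)
# import numpy  # import numpy/scipy only after the variable is set
```

The same effect can be had by exporting the variable in the shell before starting python.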
As if your current numpy worked so well that I am suggesting some weird hack to break it.
Hi OpenBLAS developers,
I recently came across a test failure in python's scipy package, scipy/scipy#6422, when scipy is compiled with openblas using the configuration:
The scipy test runs several matrix math operations in 20 python threads (within a single process controlled by python's Global Interpreter Lock), produces the error message
'BLAS : Program is Terminated. Because you tried to allocate too many memory regions'
multiple times, and then segfaults.
Having read the FAQ for openblas, I recompiled openblas with the additional configuration option 'NUM_THREADS=64', which seems to fix the problem. However, this seems to me to be an openblas bug, based on a comment from @jeromerobert in issue #889, who said this about 'NUM_THREADS':
Since I am compiling openblas myself on the same machine that is using scipy, this suggests to me that openblas is not detecting the right value of 'NUM_THREADS' automatically for my CPU during the build process. However, I find written in Makefile.conf, which suggests that openblas is able to detect the correct number of logical cores of my CPU. So I don't understand why the 'NUM_THREADS=64' configuration is necessary. Can you please confirm if this is an openblas bug, or if it is a "feature"?
Thanks.
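The scenario described above (many python threads in one process, each calling into BLAS) can be sketched roughly as follows. This is not the scipy test itself, only a minimal illustration: the names `run_in_threads` and `blas_work` are hypothetical, and `blas_work` assumes numpy is installed and linked against openblas.

```python
import threading


def run_in_threads(work, n_threads=20):
    """Call work(i) from n_threads python threads in a single process."""
    results = [None] * n_threads
    errors = []

    def target(i):
        try:
            results[i] = work(i)
        except Exception as exc:  # collect python-level failures only
            errors.append((i, exc))

    threads = [threading.Thread(target=target, args=(i,))
               for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    if errors:
        raise RuntimeError(f"{len(errors)} thread(s) failed: {errors[0][1]!r}")
    return results


def blas_work(i, size=200):
    """One matrix product per thread (assumes numpy is backed by openblas)."""
    import numpy as np  # imported lazily so the helper above has no dependency
    rng = np.random.default_rng(i)
    a = rng.standard_normal((size, size))
    return float(np.linalg.norm(a @ a.T))
```

Calling `run_in_threads(blas_work)` starts 20 threads that each perform a matrix product. Note that a termination or segfault inside the BLAS library cannot be caught as a python exception, so the `errors` list only reports ordinary python failures.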