R parallel computing deadlock with openblas #75506
In my Homebrew discussions post, I thought this was resolved in a new release of R, but it came back later. After some research (OpenMathLib/OpenBLAS#294), I found that this is caused by OpenMP after linking OpenBLAS to R (strictly speaking, by libgomp, GNU's implementation of OpenMP). There are 3 ways to solve this issue, as tested on my machine:
The downside of this method is the extra dependency on libomp.
I just did more tests with method 1 above, and it actually does not work with the current OpenBLAS 0.3.13, giving the following compilation error:
But the pre-release 0.3.14 works just fine. So if we are to use clang in conjunction with gfortran, we may need to wait until the next release of OpenBLAS.
I think we shouldn't be using OpenMP at all on macOS or Linux for We should still build If you can test disabling One useful comparison you could also make is installing the cask version of R, which uses the Accelerate framework by default instead of
I'm a little unclear why you are calling This will end up spawning far more threads than your CPU can actually run in parallel, and it will put most of the threads to sleep (state S in
I am calling Sys.setenv(OMP_NUM_THREADS=1) does not work. Every time I call I have tested
require(Matrix)
set.seed(1)
a <- rnorm(2800 * 2800)
dim(a) <- c(2800, 2800)
timing <- numeric(10)
for (i in 1:10) {
invisible(gc())
timing[i] <- system.time(
b <- crossprod(a)
)[3]
}
sum(timing)
On my dual-core, four-thread CPU, the result for
On M1, the result for Another thing I just realized:
It's great to see that
To set the number of threads in OpenBLAS in R, you should check out RhpcBLASctl https://cran.r-project.org/web/packages/RhpcBLASctl/index.html. I've found it works great for me. Alternatively one could use
Usually the tradeoff is that you don't want to end up spawning more threads than you have cores. When there are more threads than cores, the CPU has to keep moving threads in and out of sleep, which incurs overhead. Handling multithreading with linear algebra libraries is tricky. I've usually found that the best approach is to use
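A minimal sketch of the thread-budget idea above, under some assumptions: `blas_threads_per_worker` is a hypothetical helper (not part of any package), and the worker and core counts are illustrative. It only shows how one might cap BLAS threads so that workers times threads stays within the core count, using `RhpcBLASctl::blas_set_num_threads` when the package is installed.

```r
# Hypothetical helper (not from any package): how many BLAS threads each
# worker may use so that n_workers * threads does not exceed n_cores.
blas_threads_per_worker <- function(n_workers,
                                    n_cores = parallel::detectCores()) {
  # detectCores() can return NA on some platforms; fall back to one core.
  if (is.na(n_cores)) n_cores <- 1L
  max(1L, as.integer(n_cores %/% n_workers))
}

# Illustrative numbers: 4 forked workers on an 8-core machine.
threads <- blas_threads_per_worker(n_workers = 4L, n_cores = 8L)

# Apply the cap only if RhpcBLASctl is available (assumed optional here).
if (requireNamespace("RhpcBLASctl", quietly = TRUE)) {
  RhpcBLASctl::blas_set_num_threads(threads)
}
```

With more workers than cores the helper falls back to one BLAS thread per worker, which matches the usual advice of keeping BLAS single-threaded inside forked children.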
Feel free to open a pull request to disable OpenMP for OpenBLAS. I think the testing you've done is sufficient to show that there is no performance difference when using
I don't think this is a good idea. It could have an impact on the way other formulae use OpenBLAS, or on the way current users expect to be able to use OpenBLAS. I'm also not convinced that one microbenchmark on one machine is enough to establish that this is a good idea to do everywhere. Since this bug appears to be fixed in the next version of OpenBLAS, I think the correct thing to do here would be to apply the patches that fix it.
Point well taken. My understanding is that calls to OpenBLAS are generally agnostic to which parallel backend is used, but I'll defer to your experience with this type of issue!
@carlocab Is using
We'd need a
Updating OpenBLAS in #76474. Unfortunately, it fails to build on Mojave.
@carlocab You might have misread my post. Upgrading OpenBLAS alone does not solve the OpenMP issue. It only solves the issue of building OpenBLAS with clang. The core problem with OpenMP is that libgomp is not fork-safe, while libomp is. It is actually possible to use gcc+libomp (as they are doing in Anaconda, see #50252 (comment)), but that requires some hacking. Per the comment by @fxcoudert you mentioned, the switch to libomp may cause other problems and inconvenience, so I don't really see any viable solution at this point. This will probably change when Flang is out. But for now, you may close this issue if you wish.
You're right, I did misread your post. My apologies. (I have a bad habit of only skimming posts in Homebrew/core -- it's the only way I'm able to manage the number of things there are to review. 😄) Ok, it seems to me that:
Using One concern with this approach is that we'll probably have to change all the formulae that depend on OpenBLAS to also use Of course, if @fxcoudert still disagrees with my assessment here, then I'm deferring to them. Thoughts, @Homebrew/core?
These issues really demonstrate why OpenMP support is such a pain. As I mentioned above, I believe it was only created to provide a unified parallel interface that would also work on Windows. From what I understand about how OpenBLAS is used, the software actually doing the BLAS/LAPACK calls doesn't even know if the backend is multithreaded or not, let alone whether it is using OpenMP or As I see it, this is no more speculative than switching to
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
brew gist-logs <formula> link OR brew config AND brew doctor output
brew update and am still able to reproduce my issue.
brew doctor and that did not fix my problem.
What were you trying to do (and why)?
Use the mclapply command in R to do parallel computing, which essentially calls fork(2).
What happened (include all command output)?
The 4 child processes forked by R use 0% CPU and nothing is returned to the R console.
What did you expect to happen?
R should do the computation properly and print the returned value to console.
Step-by-step reproduction instructions (by running brew commands)
then in R console: