Add IReduce! and IAllreduce! #827
base: master
Conversation
force-pushed from 36530c4 to fafe00e
Mmm... only the CUDA tests fail. I feared this was a Julia-side issue, but no: it is simply that […]. This is a known issue (open-mpi/ompi#9845), and it also happens with […]. It is very surprising to me that the ROCm support apparently covers all non-blocking ops, but not CUDA. What would be the best course of action? Merge anyway and let users stumble upon an unhelpful segfault? Or would a warning (if OpenMPI is used and CUDA is loaded) be enough?
Yeah, we don't currently have a good mechanism to declare which operations can and cannot take GPU memory. We certainly need to branch in the tests, but I don't think we have prior art for this. @simonbyrne, any ideas?
Unfortunately it is probably implementation- (and configuration-) dependent, so I don't think we can provide a complete solution. My best suggestion would be to make the test suite able to soft-fail and report which operations are supported. If you want something easy that does work, the simplest option is to use the regular blocking operation spawned on a separate thread:

```julia
task = Threads.@spawn MPI.Allreduce(....)
# other work
wait(task)
```

If your other work involves MPI ops, you will also need to […]
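A fuller sketch of this blocking-on-a-thread workaround (the buffer names and the in-place `Allreduce!` call are illustrative, not taken from the comment above):

```julia
using MPI

MPI.Init()
comm = MPI.COMM_WORLD

sendbuf = ones(Float64, 4)
recvbuf = similar(sendbuf)

# Run the blocking collective on a separate thread so the main
# thread can keep doing independent work while the reduction
# is in flight.
task = Threads.@spawn MPI.Allreduce!(sendbuf, recvbuf, +, comm)

# ... other (non-MPI) work here ...

wait(task)  # recvbuf now holds the element-wise sum across ranks
```

Note that this requires starting Julia with more than one thread (e.g. `julia -t 2`), and if the "other work" also makes MPI calls, MPI must have been initialized with a thread level of `MPI.THREAD_MULTIPLE`.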
force-pushed from fafe00e to 311629f
Thank you @PetrKryslUCSD for reminding me about this PR. My original solution to detect support for […]. The main problem is that MPI raises a SIGSEGV when […]. My proposed solution is to call those problematic functions at the beginning of the test suite in a separate process, and set env vars appropriately when they work. Since some RMA functions and other non-blocking collectives suffer from the same problem, if we were to add them to MPI.jl it would be easy to extend this approach to detect whether they work.
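A minimal sketch of the separate-process probe described above (the probe script, the in-place `IAllreduce!` call, and the env-var name are all hypothetical, not taken from the PR):

```julia
# Run a GPU nonblocking reduction in a throwaway child process.
# If the MPI implementation segfaults on device buffers, only the
# child dies, and the exit status tells us whether to run the
# corresponding tests in the parent.
probe = """
using MPI, CUDA
MPI.Init()
buf = CUDA.ones(Float64, 4)
req = MPI.IAllreduce!(buf, +, MPI.COMM_WORLD)
MPI.Wait(req)
"""
ok = success(run(ignorestatus(`$(Base.julia_cmd()) --project -e $probe`)))
ENV["MPI_TEST_CUDA_IALLREDUCE"] = string(ok)  # hypothetical variable name
```

The test suite can then branch on the env var and skip (or soft-fail) the GPU nonblocking-collective tests when the probe failed.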
The CUDA and AMDGPU tests on Julia 1.10 are all passing; however, it detects […]. The tests failing with Julia nightly on CPU and for Julia 1.6-1.8 with CUDA seem unrelated to this PR.
I just realized that the tests on AMDGPU have […]. Is it related to the fact that those operations are unsupported on ROCm? Because if they are, we could detect this using my method and exclude those tests automatically.
Added basic nonblocking reductions, along with some tests.
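Assuming the new functions mirror the blocking `Reduce!`/`Allreduce!` signatures but return an `MPI.Request` (a sketch under that assumption, not taken from the PR diff):

```julia
using MPI

MPI.Init()
comm = MPI.COMM_WORLD

sendbuf = fill(Float64(MPI.Comm_rank(comm)), 4)
recvbuf = similar(sendbuf)

# Nonblocking all-reduce: returns immediately with a request handle
# instead of blocking until the reduction completes.
req = MPI.IAllreduce!(sendbuf, recvbuf, +, comm)

# ... overlap independent computation here ...

MPI.Wait(req)  # recvbuf now holds the element-wise sum over all ranks
```

As with the blocking variants, the buffers must not be touched between starting the operation and the corresponding `MPI.Wait`.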