Deadlock when calling CUDA.jl in an adopted thread while blocking the main thread #2449
Comments
Unrelated to CUDA.jl. MWE:

#include <julia.h>
#include <pthread.h>
typedef void (*julia_callback)();
void call_directly(julia_callback callback) {
printf("Calling Julia directly\n");
callback();
}
void *thread_function(void* callback) {
printf("Calling Julia from thread\n");
((julia_callback)callback)();
return NULL;
}
void call_on_thread(julia_callback callback) {
printf("Creating thread\n");
pthread_t thread;
pthread_create(&thread, NULL, thread_function, callback);
pthread_join(thread, NULL);
}

function callback()::Cvoid
println(Core.stdout, "Calling the GC")
GC.gc()
println(Core.stdout, "GC call done")
end
callback_ptr = @cfunction(callback, Cvoid, ())
ccall((:call_directly, "./wip.so"), Cvoid, (Ptr{Cvoid},), callback_ptr)
println()
gc_state = @ccall(jl_gc_safe_enter()::Int8)
ccall((:call_on_thread, "./wip.so"), Cvoid, (Ptr{Cvoid},), callback_ptr)
@ccall(jl_gc_safe_leave(gc_state::Int8)::Cvoid)
println("Done")

This blocks waiting for the GC. The reason is that you're joining the thread immediately after creating it, resulting in the main thread blocking inside the ccall, so the GC cannot proceed. The solution is either not to have your ccall block (e.g. by not joining the thread), or to indicate that the blocking ccall is safe to transition into GC from. This requires two modifications: surrounding the ccall with jl_gc_safe_enter and jl_gc_safe_leave, and putting the code in a function:

function main()
callback_ptr = @cfunction(callback, Cvoid, ())
ccall((:call_directly, "./wip.so"), Cvoid, (Ptr{Cvoid},), callback_ptr)
println()
gc_state = @ccall(jl_gc_safe_enter()::Int8)
ccall((:call_on_thread, "./wip.so"), Cvoid, (Ptr{Cvoid},), callback_ptr)
@ccall(jl_gc_safe_leave(gc_state::Int8)::Cvoid)
println("Done")
end
isinteractive() || main() |
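To make the explanation above concrete, here is a toy C model of the handshake. It is only a sketch and does not use the Julia runtime: `main_state`, `collected`, `toy_collect`, and `run_scenario` are hypothetical names, and the 200 ms timeout exists only so the demo can report the stall instead of hanging (a real collector would keep waiting).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>

enum { GC_UNSAFE = 0, GC_SAFE = 1 };

/* Toy stand-in for the main thread's GC state (hypothetical, not Julia's). */
static atomic_int main_state = GC_UNSAFE;
/* Set to 1 once the toy collection was allowed to run. */
static atomic_int collected = 0;

/* Toy "GC.gc()" running on the new thread: it may only proceed once the
 * main thread is marked GC-safe. Gives up after roughly 200 ms. */
static void *toy_collect(void *arg) {
    (void)arg;
    struct timespec pause = {0, 1000000}; /* 1 ms */
    for (int i = 0; i < 200; i++) {
        if (atomic_load(&main_state) == GC_SAFE) {
            atomic_store(&collected, 1);
            break;
        }
        nanosleep(&pause, NULL);
    }
    return NULL;
}

/* Returns 1 if the toy collection completed, 0 if it stalled. */
int run_scenario(int mark_safe) {
    pthread_t t;
    atomic_store(&collected, 0);
    /* Analogue of jl_gc_safe_enter() before the blocking call. */
    atomic_store(&main_state, mark_safe ? GC_SAFE : GC_UNSAFE);
    int rc = pthread_create(&t, NULL, toy_collect, NULL);
    assert(rc == 0);
    rc = pthread_join(t, NULL); /* the blocking call made by the main thread */
    assert(rc == 0);
    /* Analogue of jl_gc_safe_leave() after the blocking call. */
    atomic_store(&main_state, GC_UNSAFE);
    return atomic_load(&collected);
}
```

In this model, `run_scenario(1)` lets the collection finish, while `run_scenario(0)` stalls until the timeout, which corresponds to the hang described above.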
Thank you for the response. I have tried the solution. However, when I have CUDA.@sync inside call_on_thread, execution hangs. I attached gdb to the process: it shows many Julia threads and a single pthread. The gdb backtrace for roughly 35 of the Julia threads is given below.
Another thread backtrace
Any suggestions for resolving this error? |
Works fine here; putting |
I ran the MWE as it is without any CUDA statements. c_mwe.c:
julia_cuda_mwe.jl:
Here are the compilation and run commands in makefile.
Output:
It hangs at GC.gc(). |
I have replaced the Julia code as given below to resolve the hanging issue with GC.
It is working. I am still wondering why it didn't work without a function. However, CUDA still has an issue. I will upload an MWE with minimal CUDA code in my next post. |
Here is the MWE with CUDA code to illustrate the new findings. c_cuda.c:
julia_cuda.jl:
In the above code, it works well if I set disable = false in the Julia main() function. Here is the makefile to build and run.
|
What do you mean by that? Please always include relevant output to explain your issue. |
Here is the output for two scenarios.
Output:
Output:
|
I would like to reopen this issue as it fails for Scenario-2. |
When I debugged the code further, I found that execution hangs in GPUCompiler when it calls "
This issue exists even when I use AMDGPU instead of CUDA. I have now reported this issue in the GPUCompiler repository. Please collaborate with us to fix this issue. |
Reduced to:

#include <julia.h>
#include <pthread.h>
typedef void (*julia_callback)();
void call_directly(julia_callback callback) {
printf("Calling Julia directly\n");
callback();
}
void *thread_function(void* callback) {
printf("Calling Julia from thread\n");
((julia_callback)callback)();
return NULL;
}
void call_on_thread(julia_callback callback) {
printf("Creating thread\n");
pthread_t thread;
pthread_create(&thread, NULL, thread_function, callback);
pthread_join(thread, NULL);
}

function callback()::Cvoid
println("Running a command")
run(`echo 42`)
return
end
function main()
callback_ptr = @cfunction(callback, Cvoid, ())
gc_state = @ccall(jl_gc_safe_enter()::Int8)
ccall((:call_on_thread, "./wip.so"), Cvoid, (Ptr{Cvoid},), callback_ptr)
@ccall(jl_gc_safe_leave(gc_state::Int8)::Cvoid)
println("Done")
end
main()

I'm pretty sure this is not guaranteed to work. Julia code can work on a foreign thread, as you're doing here, but you're also concurrently blocking the main thread of execution by calling
If you really want this to work, I'd advise filing an issue on the main Julia repository. In the meantime, I would try only calling
Hope this helps you resolve the issue! In any case, there isn't much we can do from the CUDA.jl side about this... |
Ah, I missed the GC lock interaction when I read this issue initially. When we enter C we shouldn't hold any locks (except the ones the user holds). So marking the thread as "GCSafe" ought to be enough. |
In that case, let's file this on the Julia repo. |
Filed upstream: JuliaLang/julia#55525

I think we can close this then, as there's nothing actionable on the CUDA.jl side. And again, as a potential workaround, don't have the main thread block. For example, using the non-portable API:

#define _GNU_SOURCE
#include <julia.h>
#include <pthread.h>
typedef void (*julia_callback)();
void call_directly(julia_callback callback) {
printf("Calling Julia directly\n");
callback();
}
void *thread_function(void* callback) {
printf("Calling Julia from thread\n");
((julia_callback)callback)();
return NULL;
}
pthread_t thread;
void call_on_thread(julia_callback callback) {
printf("Creating thread\n");
pthread_create(&thread, NULL, thread_function, callback);
}
int wait_for_thread() {
return pthread_tryjoin_np(thread, NULL);
}

function callback()::Cvoid
println("Running a command")
run(`echo 42`)
return
end
function main()
callback_ptr = @cfunction(callback, Cvoid, ())
ccall((:call_on_thread, "./wip.so"), Cvoid, (Ptr{Cvoid},), callback_ptr)
ret = -1
while ret != 0
ret = ccall((:wait_for_thread, "./wip.so"), Cint, ())
yield()
end
println("Done")
end
main() |
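The workaround above relies on pthread_tryjoin_np, which is a GNU extension. A portable variant, sketched here under the assumption that C11 atomics are available, has the worker publish a completion flag that the caller polls without blocking. The names `start_worker`, `worker_finished`, `reap_worker`, `demo_callback`, and `run_demo` are hypothetical and not part of the original MWE.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

typedef void (*callback_fn)(void);

static pthread_t worker;
static callback_fn worker_cb;
/* 0 while the callback is running, 1 once it has returned. */
static atomic_int worker_done = 0;

static void *worker_main(void *arg) {
    (void)arg;
    worker_cb();
    atomic_store(&worker_done, 1); /* publish completion */
    return NULL;
}

/* Start the worker and return immediately. Returns 0 on success. */
int start_worker(callback_fn cb) {
    assert(cb != NULL);
    worker_cb = cb;
    atomic_store(&worker_done, 0);
    return pthread_create(&worker, NULL, worker_main, NULL);
}

/* Non-blocking poll: 1 once the callback has finished, 0 otherwise. */
int worker_finished(void) {
    return atomic_load(&worker_done);
}

/* Reap the thread. Call only after worker_finished() returned 1, so the
 * join only waits for the thread's final return. */
int reap_worker(void) {
    return pthread_join(worker, NULL);
}

/* Demo callback and driver, standing in for the Julia side. */
static atomic_int demo_calls = 0;

static void demo_callback(void) {
    atomic_fetch_add(&demo_calls, 1);
}

/* Returns the number of times the callback ran, or -1 on error. */
int run_demo(void) {
    atomic_store(&demo_calls, 0);
    if (start_worker(demo_callback) != 0)
        return -1;
    while (!worker_finished())
        sched_yield(); /* the Julia caller would call yield() here instead */
    if (reap_worker() != 0)
        return -1;
    return atomic_load(&demo_calls);
}
```

From Julia, the polling loop would call `worker_finished` through ccall and yield() between polls, mirroring the `wait_for_thread` loop above.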
Describe the bug
I am facing a weird problem in our application. We have a Julia function calling a C function, which creates a pthread and calls a Julia CUDA kernel. I have created a small example to illustrate and reproduce the problem. Unfortunately, we have not been able to make it any simpler; this is the simplest example that reproduces the issue.
The Julia function "call_c_function_direct" calls the C function "c_function_direct", which calls the Julia CUDA kernel via "ccall_saxpy() -> saxpy_kernel()". It works without any issue.

However, when I create a pthread inside the C function and call the Julia CUDA kernel from it, execution hangs and no useful stack trace is available. The Julia function "call_c_function_pthread" calls the C function "c_function_pthread", which creates a pthread and calls the Julia CUDA kernel via "ccall_saxpy() -> saxpy_kernel()". Execution hangs when it reaches "@cuda saxpy_kernel".

To select whether the Julia CUDA kernel is called directly or through a pthread, the file "julia_cuda.jl" defines the Julia variable "direct = true". You can set it to false to run using the pthread.

To reproduce
The Minimal Working Example (MWE) for this bug:
File: julia_cuda.jl
C File: c_cuda.c
Build command and run
```
$ gcc -g -O0 -fPIC -shared -o libcfunction.so c_cuda.c -I$(JULIA)/include/julia -L$(JULIA)/lib -ljulia -lpthread -I$(NVHPC_ROOT)/cuda/include -L$(NVHPC_ROOT)/cuda/lib64 -lcuda -lcudart
$ julia julia_cuda.jl
```
Expected behavior
A clear and concise description of what you expected to happen.
Version info
Details on Julia: 1.10.4
Details on CUDA: