tasks-debugging: make it possible to get the backtrace of a task #32283
As an aside, I would love to be able to send
We've usually done that on SIGINFO (or SIGUSR1 for Linux)
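The signal convention mentioned here can be sketched in C. This is a minimal illustration, assuming the goal is to request a diagnostic dump from outside the process (the object of "send" is cut off in this capture). The names `dump_requested`, `install_dump_handler`, and `take_dump_request` are hypothetical and are not Julia runtime functions. The handler only sets a flag; the actual dump would happen later, at a safe point.

```c
#include <signal.h>

/* Hypothetical flag: set by the signal handler, polled from a safe point. */
static volatile sig_atomic_t dump_requested = 0;

static void on_dump_signal(int sig)
{
    (void)sig;
    dump_requested = 1; /* only async-signal-safe work belongs in a handler */
}

/* Use SIGINFO where the platform defines it (BSD, macOS), else SIGUSR1. */
static int dump_signal_number(void)
{
#ifdef SIGINFO
    return SIGINFO;
#else
    return SIGUSR1;
#endif
}

/* Returns 0 on success, -1 if the handler could not be installed. */
static int install_dump_handler(void)
{
    return signal(dump_signal_number(), on_dump_signal) == SIG_ERR ? -1 : 0;
}

/* Returns 1 and clears the flag if a dump was requested, else returns 0. */
static int take_dump_request(void)
{
    if (!dump_requested)
        return 0;
    dump_requested = 0;
    return 1;
}
```

A real runtime would more likely use `sigaction` and a dedicated signal-handling thread; plain `signal` is used here only to keep the sketch short.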
The nice thing about
This looks useful. I won't claim to follow what's really going on in jl_backtrace_fiber 😬
What needs to be done here to get this merged? I would really like to be able to debug my tasks to see whether tasks are escaping, taking up memory, etc.
Bump. Having this would be really useful for debugging Pkg server issues, which are fairly urgent for 1.5. (Yes, this won't be in 1.5, but the server doesn't have to run 1.5. What makes this urgent is that once 1.5 is released, all Pkg clients will talk to the server by default.)
Okay, I took a slightly different approach and fixed the JL_HAVE_UNW_CONTEXT backend, and made that the default for most platforms. The downside is that it's not quite as fast:
Compare with #13099 (comment): this gives up most of the speedup we got from avoiding COPY_STACKS. But compare also to:
which we do on (nearly) every task switch, and the impact should not be too bad overall. With additional per-platform effort (already done on Windows, where libunwind doesn't exist), we can resume using the current setjmp.
a31d43b to c720128
What constitutes a 'live' task?

julia> ccall(:jl_live_tasks, Vector, ())
3-element Vector{Any}:
 Task (runnable) @0x00007f8905200010
 Task (runnable) @0x00007f8905200450
 Task (runnable) @0x00007f890599fb90

julia> @async nothing
Task (done) @0x00007f8908950450

julia> ccall(:jl_live_tasks, Vector, ())
4-element Vector{Any}:
 Task (runnable) @0x00007f8905200010
 Task (runnable) @0x00007f8905200450
 Task (runnable) @0x00007f890599fb90
 Task (done) @0x00007f8908950450

I was kind of expecting that 4th task to not be returned.
Okay, I've fixed that (removed dead tasks), though the performance cost of making this the default on Linux looks unacceptable (the numbers above were for macOS):
The profile shows that libunwind is doing a lot of dumb stuff inside unw_resume.
This should work for any non-copy stack task. To make it work better, this now switches to the JL_HAVE_UNW_CONTEXT by default for some platforms. Also export the list of all live (currently running or suspended) tasks which have real stacks (the non-copy-stack tasks) which were started by the current thread.
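The selection rule described above (live, real stack, started by the current thread) can be illustrated with a toy filter. This is only a sketch under stated assumptions: `toy_task_t`, its fields, and `collect_live_tasks` are hypothetical names that model the description; they are not the runtime's actual structures or the implementation of `jl_live_tasks`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a task; not Julia's jl_task_t. */
typedef enum { TASK_RUNNABLE, TASK_DONE, TASK_FAILED } task_state_t;

typedef struct {
    int id;
    task_state_t state;
    int copy_stack; /* nonzero: the task uses a copied stack, not its own */
    int owner_tid;  /* thread that started the task */
} toy_task_t;

/* Copies pointers to the qualifying tasks into `out` (at most `out_cap`)
 * and returns how many were written. */
static size_t collect_live_tasks(const toy_task_t *all, size_t n, int tid,
                                 const toy_task_t **out, size_t out_cap)
{
    assert(all != NULL && out != NULL);
    size_t count = 0;
    for (size_t i = 0; i < n && count < out_cap; i++) {
        const toy_task_t *t = &all[i];
        if (t->state != TASK_RUNNABLE)
            continue; /* finished or failed tasks are not live */
        if (t->copy_stack)
            continue; /* no real stack of its own to walk */
        if (t->owner_tid != tid)
            continue; /* started by a different thread */
        out[count++] = t;
    }
    return count;
}
```

In this model a task that has already finished, like the `@async nothing` task in the question above, is skipped by the first check.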
__tsan_destroy_fiber(ctx->tsan_state);
static inline void tsan_destroy_ctx(jl_ptls_t ptls, void *state) {
    if (state != &ptls->root_task->state) {
        __tsan_destroy_fiber(ctx->state);
This line, and line 60, refer to ctx, but that variable is no longer defined, so this cannot possibly compile. What's supposed to happen here?
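For illustration, here is one self-consistent shape such a helper could take, where the body uses only its own parameters. This is a hedged sketch, not the fix that was applied: `toy_ctx_t`, `toy_ptls_t`, and `toy_destroy_fiber` are hypothetical stand-ins (the last one for `__tsan_destroy_fiber`), and the assumption that the root task's fiber must be skipped is taken from the comparison in the snippet under review.

```c
#include <stddef.h>

/* Hypothetical stand-ins so the sketch compiles on its own. */
typedef struct { void *tsan_state; } toy_ctx_t;
typedef struct { toy_ctx_t root_ctx; } toy_ptls_t;

static int destroyed_count = 0;

/* Stands in for __tsan_destroy_fiber; only counts calls. */
static void toy_destroy_fiber(void *fiber)
{
    (void)fiber;
    destroyed_count++;
}

/* Every name in the body is a parameter, so nothing is left undefined. */
static void toy_tsan_destroy_ctx(toy_ptls_t *ptls, void **state)
{
    if (state != &ptls->root_ctx.tsan_state) {
        toy_destroy_fiber(*state); /* never destroy the root task's fiber */
    }
    *state = NULL; /* addition in this sketch: guard against reuse */
}
```

Passing a pointer to the state slot lets the helper both compare against the root slot's address and clear the slot afterwards.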
debugging-only, this adds some internal utility functions for introspecting task state while the process is stopped
this should work for any non-copy stack task,
although gdb puts in a special hook to the longjmp
which causes it to crash when gdb returns :/
to work around that gdb issue, you may want to
use the more compatible task switching backend:
This also exports the list of all live (currently running or suspended)
tasks which have real stacks (the non-copy-stack tasks) on the current
thread, so that they can be read from gdb or Julia.