Allow for :foreigncall to transition to GC safe automatically #49933

Draft
wants to merge 4 commits into
base: master
from

vchuravy
Member

This has been bouncing around as an idea for a while.
One of the challenges around time-to-safepoint has been Julia code
that calls into foreign libraries.

Since foreign code does not include safepoints, we see increased latency
when one thread is running a foreign call and another wants to trigger GC.

The open design question here is:

  • Do we expose this as an option the user must opt into, e.g. via a
    keyword argument to @ccall or a specific calling convention?
  • Or do we turn this on for all ccalls, except for Julia runtime calls?

There is relatively little code outside the Julia runtime that needs to be "GC unsafe";
the exceptions are programs that directly use the Julia C API. Incidentally, jl_adopt_thread
and @cfunction/@ccallable do the right thing and transition to "GC unsafe", regardless
of what state the thread is currently in.
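
As an illustration of that last point, here is a minimal toy model (not Julia's runtime code) of a callback entry point that forces the thread into "GC unsafe" whatever state it was in, and restores the previous state on exit. `ToyThreadState`, `GcUnsafeScope`, `toy_callback`, and the numeric state values are all assumptions made for this sketch.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical state values for this sketch only.
constexpr int8_t kGcUnsafe = 0; // thread may touch GC-managed objects
constexpr int8_t kGcSafe   = 2; // thread promises not to touch them

// Hypothetical stand-in for the per-thread runtime state.
struct ToyThreadState {
    std::atomic<int8_t> gc_state{kGcUnsafe};
};

// RAII scope modelling a callback entry: switch to "GC unsafe" no matter
// what the current state is, and restore the previous state on exit.
// A real runtime would presumably also have to reach a safepoint when
// leaving the "GC safe" state; that part is omitted here.
class GcUnsafeScope {
public:
    explicit GcUnsafeScope(ToyThreadState &ts)
        : ts_(ts),
          saved_(ts.gc_state.exchange(kGcUnsafe, std::memory_order_acq_rel)) {}
    ~GcUnsafeScope() { ts_.gc_state.store(saved_, std::memory_order_release); }
    GcUnsafeScope(const GcUnsafeScope &) = delete;
    GcUnsafeScope &operator=(const GcUnsafeScope &) = delete;

private:
    ToyThreadState &ts_;
    int8_t saved_;
};

// Models a callback invoked from foreign code. Returns the state that the
// callback body observes while it runs.
int8_t toy_callback(ToyThreadState &ts) {
    GcUnsafeScope scope(ts);
    return ts.gc_state.load(std::memory_order_acquire);
}
```

With a thread that starts out in the "GC safe" state, `toy_callback` observes "GC unsafe" inside the scope, and the thread is back in "GC safe" once the callback returns.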

I still need to figure out how to reliably detect Julia runtime calls, but I think we can
switch all other calls to "GC safe". We should also consider optimizations that mark large
regions of code without Julia runtime interactions as "GC safe", in particular numeric
for-loops.

@gbaraldi
Member

Since we already have the trampoline stuff, what about adding a safepoint into every call into the julia runtime? Similar to the safepoint in JIT function calls?

@Taaitaaiger
Contributor

How would this affect packages that use CxxWrap or jlrs to provide access to C++ and Rust libraries, which call back into Julia from foreign functions?

@vchuravy vchuravy mentioned this pull request Oct 3, 2023
@fingolfin
Member

but I think we can
switch all other calls to "GC safe"

All in Julia itself? Perhaps.

All in all Julia packages? Definitely not! That would be a breaking change.

@kpamnany
Contributor

kpamnany commented Oct 4, 2023

All in all Julia packages? Definitely not! That would be a breaking change.

Can you point at a for-instance @fingolfin?

@gbaraldi
Member

gbaraldi commented Oct 4, 2023

The uncommon but existing case is Julia code calling C code that in turn calls Julia runtime functions.

@fingolfin
Member

@kpamnany any package using a JLL linking against libjulia would be a first candidate, that includes all packages using CxxWrap (I count 29 in the general registry). For myself, GAP.jl.

I am not saying this is an impossible change, but only if there is a well thought out migration strategy that is coordinated with all stakeholders. But just saying "it probably won't affect that many packages, let's just do it", without a full analysis and without involving the community, repeats the "main" debacle. Let's not go there.

Overall, I prefer "code that is not as fast as it could be but is correct" over "code that is perhaps a bit faster, or not, but will occasionally crash or produce garbage".

A more viable alternative would be an opt-in solution -- but then of course many packages will miss this opportunity. A pity, but it wouldn't leave us worse than we are right now, new packages could benefit from it.

(If we had something like Rust editions, then for "new" packages this could become the default, I guess...?)

src/ccall.cpp (outdated review thread)
@vchuravy
Member Author

@eval function put(msg)
    cmsg = Base.cconvert(Cstring, msg)
    ptr = Base.unsafe_convert(Cstring, cmsg)
    $(Expr(:foreigncall, QuoteNode(:puts), Cint, Core.svec(Cstring), 0, true, QuoteNode(:ccall), :ptr, :cmsg))
end

Now is:

L28:                                              ; preds = %L15
; └
;  @ /home/vchuravy/src/julia2/gc_safe.jl:4 within `put`
  %ptls_field18 = getelementptr inbounds {}**, {}*** %tls_pgcstack, i64 2
  %ptls_load19 = load {}**, {}*** %ptls_field18, align 8
  %12 = bitcast {}** %ptls_load19 to i8*
  %gc_state = getelementptr inbounds i8, i8* %12, i64 25
  %13 = load atomic i8, i8* %gc_state monotonic, align 1
  store atomic i8 2, i8* %gc_state release, align 8
  %14 = call i32 bitcast (void ()* @jlplt_puts_689_got.jit to i32 (i64)*)(i64 %string_ptr) #10
  store atomic i8 %13, i8* %gc_state release, align 8
  %15 = getelementptr inbounds {}*, {}** %ptls_load19, i64 2
  %16 = bitcast {}** %15 to i64**
  %safepoint = load i64*, i64** %16, align 8
  fence syncscope("singlethread") seq_cst
  %17 = load volatile i64, i64* %safepoint, align 8
  fence syncscope("singlethread") seq_cst
  %frame.prev14 = load {}*, {}** %frame.prev, align 8
  %18 = bitcast {}*** %tls_pgcstack to {}**
  store {}* %frame.prev14, {}** %18, align 8
  ret i32 %14

The safepoint emission at that point needs to be cleaner.
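
For readers less used to LLVM IR, the sequence above can be modelled in plain C++ roughly as follows. This is a toy sketch, not the code this PR emits: `ToyPtls`, `gc_safe_foreign_call`, and the `safepoint_polls` counter are hypothetical names, and the counter merely stands in for the volatile load from the safepoint page. The constant 2 is the value the IR stores into `gc_state`.

```cpp
#include <atomic>
#include <cstdint>
#include <utility>

constexpr int8_t kGcSafe = 2; // same constant the IR stores into gc_state

// Hypothetical stand-in for the per-thread state reached through ptls.
struct ToyPtls {
    std::atomic<int8_t> gc_state{0};
    int safepoint_polls = 0; // stands in for the load from the safepoint page
};

// Wraps a foreign call in the save / set / restore sequence from the IR.
// Assumes the foreign function returns a value (non-void).
template <typename F>
auto gc_safe_foreign_call(ToyPtls &ptls, F &&foreign) {
    // Remember the current state (the `load atomic ... monotonic`).
    int8_t saved = ptls.gc_state.load(std::memory_order_relaxed);
    // Announce that this thread will not touch managed memory.
    ptls.gc_state.store(kGcSafe, std::memory_order_release);
    // The foreign call itself; a collection may run concurrently with it.
    auto result = std::forward<F>(foreign)();
    // Restore whatever state we had before the call.
    ptls.gc_state.store(saved, std::memory_order_release);
    // Poll the safepoint so a pending collection can stop this thread.
    ++ptls.safepoint_polls;
    return result;
}
```

In this model, a call such as `gc_safe_foreign_call(ptls, [] { return 42; })` runs the lambda while `gc_state` is 2, then restores the previous state and polls once.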

@vtjnash that's probably closer to what it needs to be?

Value *ptls = get_current_ptls_from_task(builder, T_size, get_current_task_from_pgcstack(builder, T_size, pgcstack), tbaa_gcframe);
Value *last_gc_state = emit_gc_safe_enter(builder, T_size, ptls, false);
builder.SetInsertPoint(CI->getNextNode());
// Can't use `emit_gc_safe_exit` since that wants to emit some branches...
Member

I am guessing you would need to make a new block for this specific emit call, then fix up the IR by calling SplitBlockAndInsertIfThenElse, splice the instructions back into their respective parts of the IfThenElse blocks, and delete the new blocks.

Member Author

@vchuravy vchuravy Oct 19, 2023

Yeah, I was trying that and hit an infinite loop in the iteration here, so I am not sure if this is the right place.

I would also like to support a more region-based approach, since it would be interesting for long-running for-loops.

@kpamnany
Contributor

kpamnany commented Nov 2, 2023

Note for when this lands: we should redo JuliaLang/MbedTLS.jl#265.

maleadt added a commit to JuliaGPU/CUDA.jl that referenced this pull request Feb 14, 2024
This allows the GC to run while potentially blocking in a CUDA library.
To make this safe, callbacks into Julia should again transition to
GC-unsafe mode.

It should be reimplemented when JuliaLang/julia#49933 lands.

Co-authored-by: Tim Besard <tim.besard@gmail.com>
6 participants