Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Once type for pthread_once like functionality #55439

Closed
wants to merge 1 commit into from

Conversation

gbaraldi
Copy link
Member

@gbaraldi gbaraldi commented Aug 9, 2024

This is a way to start the discussion. It uses a very pthread_once API. We might want to go fancier and also maybe move the callback to inside the struct. I'm not sure what do people prefer

!!! compat "Julia 1.12"
This functionality requires at least Julia 1.12
"""
@inline function call_once!(once::Once, f::F) where {F <: Function} #TODO: Do we want this specialization
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
@inline function call_once!(once::Once, f::F) where {F <: Function} #TODO: Do we want this specialization
@inline function call_once!(f::F, once::Once) where {F}

"""
mutable struct Once
@atomic done::Ptr{Nothing} # Ptr is used so it resets during precompilation
lock::Threads.Event #Ideally would be a Futex
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this should be a Condition variable, since being an Event is redundant with @done

Suggested change
lock::Threads.Event #Ideally would be a Futex
lock::Threads.Condition

Copy link
Member Author

@gbaraldi gbaraldi Aug 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does Condition actually do what we want here? Event is here serving just as something for threads to sleep in if they enter the slow path. Specifically I want something that if somehow enters the slow path after the function has happened to go right through it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to call it with @lock cond while !once.done; wait(cond); end

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. But does it allow anyone to go through once it's being notified? Because that's the behaviour I want that Event has.

@giordano giordano added the multithreading Base.Threads and related functionality label Aug 9, 2024
@nsajko nsajko added the feature Indicates new feature / enhancement requests label Aug 9, 2024
@nsajko
Copy link
Contributor

nsajko commented Aug 9, 2024

Fixes #54042

@gbaraldi
Copy link
Member Author

One important design question still left is. What happens if we error here. IMO we should set a flag that this failed and error. The flag would also make anyone else that calls this also error. Hopefully avoiding executing code in a bad state.

@quinnj
Copy link
Member

quinnj commented Aug 10, 2024

One important design question still left is. What happens if we error here. IMO we should set a flag that this failed and error. The flag would also make anyone else that calls this also error. Hopefully avoiding executing code in a bad state.

Perhaps we actually store the error? So anybody who comes along later can be thrown the same error instead of just a generic thing?

This functionality requires at least Julia 1.12
"""
mutable struct Once
@atomic done::Ptr{Nothing} # Ptr is used so it resets during precompilation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Off. Does this need to be the semantics? Remember precompilation is an optimization, normally Julia would execute the top-level code in the same session and precompilation is just "in the infinentisimal time window before init".

Now the argument for it resetting during precompilation would be that I am guarding a data structure that is being reset as well, but one might be making the opposing statement as well.

I kinda want to discourage people from running their precompilation workload inside the module while it is still open, rather we should run it outside the module and save the state as if the precompilation workload hasn't run yet (Nathan opened a PR, iirc)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So there's two use cases here. One where I think this should just error if hit during precompilation. But you might be running a precompile workload that needs a library to be loaded/initialized and the only thing that would have the right semantics here is Ptr

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think one might reasonably want both behavior Once and OncePtr. The challenge is that currently we don't know what the user is referring to. This kinda makes me wonder if we want this to be a ref to the data produces by F

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the scenario I am thinking about is:

mutable struct Cache
    once::Once
    data
end

When Cache is being serialized it depends on what data is if I want to reset once or not. If data does not contain foreign pointers then once ought to not be reset.

Precompilation is equivalent to top-level execution in the target process, but we cache the effect. So IMO Once ought to serialized like any other Julia struct and have a reset function that can be used to reset it at the end of precompilation if it is guarding a foreign pointer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 that this is the really semantically trick part

When Cache is being serialized it depends on what data is if I want to reset once or not. If data does not contain foreign pointers then once ought to not be reset.

I think the rule is more complicated than that... In many cases Once is intended to replace __init__ functionality, in which case you might have cache sizes, local library paths, etc. which are selected - not just foreign pointers

None of those are supposed to affect the outcome of the computations at pre-compilation (even though they are generally different than what you'd compute with at run-time) so they are "sound" to compute twice, but you have to re-compute them again after pre-compilation or they're no longer valid.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. I think have a reset! for Once would make sense so that the user can choose their behavior. It is IMO easier to explain what to do when your once computation needs to be to reset due to being run during precompilation, than to tell someone "it ran twice since we didn't think you would notice the first time"

But that is also the reason why __init__ is a separate thing.

Copy link
Member

@topolarity topolarity Aug 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... Yeah, that's reasonable

Although, where would you put the reset! for a sysimage workflow? It'd have to go after all of your dependers in the sysimage, right? It's also a bit awkward that we already do some pretty terrible mutation tearing, since in the pkgimage workflow you get an automatic "reset" from us dropping mutations on the floor if you are run during pre-compilation for another package.

Oh, and the sysimage workflow is unfortunately more important now than before, since it's required for #55047.

It'd be ideal for Once to support both workflows, esp. since __init__ really does not (it's very easy to call dependent code during a sysimage build that is not initialized).

vtjnash added a commit that referenced this pull request Aug 14, 2024
ThreadSynchronizer is only for things that are very trivial, as there
are a lot of things they are forbidden from doing (such as waiting for a
Task to set it).

Happened to notice while reviewing
#55439 (review)
that this was still using the pre-v1.2 style lock, which makes this
mostly useless in v1.4+
lazarusA pushed a commit to lazarusA/julia that referenced this pull request Aug 17, 2024
ThreadSynchronizer is only for things that are very trivial, as there
are a lot of things they are forbidden from doing (such as waiting for a
Task to set it).

Happened to notice while reviewing
JuliaLang#55439 (review)
that this was still using the pre-v1.2 style lock, which makes this
mostly useless in v1.4+
KristofferC pushed a commit that referenced this pull request Aug 19, 2024
ThreadSynchronizer is only for things that are very trivial, as there
are a lot of things they are forbidden from doing (such as waiting for a
Task to set it).

Happened to notice while reviewing
#55439 (review)
that this was still using the pre-v1.2 style lock, which makes this
mostly useless in v1.4+

(cherry picked from commit 2a4e2b1)
@atomic x::Int
end
const cnt = MyRandomCounter(0)
const once = Once()
Copy link
Member

@vtjnash vtjnash Aug 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some API sketch ideas, via macros:

function foo()
    @once global hasrun1 = true
    @lazycall :per_thread global hasrun2 = true
    @assert hasrun && hasrun2
end

Or via global objects:

const startup1 = LazyCall(:onlyonce) do # errors if runs during precompile
    global hasrun1 = true
end
const startup2 = LazyCall(:concurrently) do # may run on all threads concurrently, as long as none have finished
    startup()
    global hasrun2
    @atomic @__MODULE__.hasrun2 = true
end
const startup3 = LazyCall(:per_process) do # runs once per process, reset on serialization
    startup2()
    global hasrun3 = true
end
global hasrun4 = Lockable(Dict{Int,Bool}())
const startup4 = LazyCall(:per_thread, () -> @lock hasrun4 empty!(hasrun4[])) do tid # runs once per thread, runs the reset on serialization
    startup3()
    @lock hasrun4 hasrun4[][tid]  = true
end
global hasrun5 = Lockable(WeakRefDict{Task,Bool}())
const startup5 = LazyCall(:per_task) do
    startup3()
    @lock hasrun5 hasrun5[][current_task()]  = true
    finalize(current_task()) do t
        @lock hasrun5 pop!(hasrun5[], t)
    end
    nothing
end
function foo()
    startup4()
    @assert hasrun && hasrun2 && hasrun3
    @assert @lock hasrun4 hasrun4[][Threads.threadid()]
    @assert @lock hasrun5 hasrun5[][current_task()]
end

Very minimal implementation sketch:

const Lock = ReentrantLock
mutable struct LazyCall
    @atomic hasrun::Ptr{Cvoid}
    const allow_compile_time::Bool
    const initializer
    const lock::Lock
    LazyCall(initializer; allow_compile_time::Bool) = new(C_NULL, allow_compile_time, initializer, Lock())
end
function (once::LazyCall)()
    (@atomic :acquire once.hasrun) != C_NULL && return
    (@noinline function()
        Base.__precompile__(once.allow_compile_time)
        @lock once.lock begin
            (@atomic :acquire once.hasrun) != C_NULL && return
            once.initializer()
            (@atomic :release once.hasrun = C_NULL + 1)
            nothing
        end
     end)()
     nothing
end

@gbaraldi gbaraldi closed this Nov 5, 2024
@maleadt
Copy link
Member

maleadt commented Nov 6, 2024

Superseded by #55793

@maleadt maleadt deleted the gb/once branch November 6, 2024 12:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Indicates new feature / enhancement requests multithreading Base.Threads and related functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants