upgrade threads from experimental to stable-with-caveats #35686

Merged: 2 commits into master, May 5, 2020

Conversation

@JeffBezanson (Member, PR author):

Any more warnings we should list?

@JeffBezanson added the `docs` (This change adds or pertains to documentation) and `multithreading` (Base.Threads and related functionality) labels on May 1, 2020.
@ViralBShah (Member):

Darn, the failures look like they are from the old doc PRs. I'm looking into fixing them.

Comment on lines +232 to +235
* After a task starts running on a certain thread (e.g. via `@spawn`), it
will always be restarted on the same thread after blocking. In the future
this limitation will be removed, and tasks will migrate between threads.
@tkf (Member), May 1, 2020:

Can you avoid thread migration by `@spawn @async f()`? If so, maybe mention that this is the forward-compatible way to rely on this property? (Not sure if we should be encouraging this pattern, though.)

Reply (Member):

Hm, curious. Probably not; instead it seems like `@async` should inherit the sticky bit from the parent in the future.

Reply (Member):

A function using `@async` internally may be called via `@spawn`. If `@async` inherits the sticky bit, this internal use of `@async` effectively becomes equivalent to `@spawn`, right?

I think it's possible to use `@async` in a way that is concurrency-safe but not thread-safe. If so, isn't it dangerous to inherit the sticky bit?
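
For concreteness, a minimal sketch of the pattern under discussion (not from the thread); it assumes the undocumented `sticky` field on `Task` keeps its current meaning, and `work` is a placeholder:

    using Base.Threads: @spawn, threadid

    work() = threadid()

    # @spawn creates a non-sticky task (free to migrate between threads in a
    # future Julia version); wrapping the body in @async creates a sticky task
    # that stays on whichever thread the enclosing @spawn task started on.
    t = @spawn begin
        inner = @async work()
        wait(inner)
        (inner.sticky, threadid())   # (true, id of the thread the work ran on)
    end
    fetch(t)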

@tkf (Member) commented May 1, 2020:

I had hoped that we would get structured concurrency (#33248) before `@spawn` becomes stable...

@JeffBezanson (Member, PR author):

The purpose here is to define what "stable" means in this context. It obviously can't mean that we will never change or add anything ever again. `@spawn` is a primitive, and we want to say that you can use it, and we will not delete it in 1.x. Is it possible to document what it means in a way that is useful but allows the changes you want? For example, would we entirely disallow `@spawn` with no `@sync`? If so, I really think the value of that is dubious. To me it seems like we have "goto" now, and might add "while" loops in the future. What's the problem?
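
Not part of the PR text, but to make the `@sync`/`@spawn` shape being debated concrete, a minimal sketch:

    using Base.Threads: @spawn

    # Structured use: @sync waits for every lexically enclosed @spawn task
    # before returning, so no task outlives the function.
    function sum_of_squares(xs)
        partial = zeros(Int, length(xs))
        @sync for (i, x) in enumerate(xs)
            @spawn begin
                partial[i] = x^2   # each task writes only its own slot
            end
        end
        return sum(partial)
    end

    sum_of_squares(1:10)   # 385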

@JeffBezanson (Member, PR author):

Also, I believe structured concurrency applies as soon as you have Tasks. The only thing `@spawn` adds is that Tasks can run on multiple threads, and structured concurrency is not about threads (i.e. it makes sense even on 1 thread). We've had Tasks since v0.1, so if you need structured concurrency before Tasks we are definitely a bit late.

@tkf (Member) commented May 2, 2020:

Sorry, my comment was a bit too rant-y and not good for starting a constructive discussion.

I hate to say this, but, realistically speaking, I don't think introducing structured concurrency with/before stabilization of `@spawn` is doable or appropriate. I think there is simply too much work to do to introduce structured concurrency even in a non-strict form. Also, we can't have a strict form of structured concurrency without breaking `@async`/`schedule` anyway.

I'll reply to the technical parts of your comments in #33248.

See also [Synchronization](@ref lib-task-sync).

## Atomic operations

@vtjnash (Member), May 2, 2020:

I think we should mark these as subject to probable change.

@JeffBezanson (Member, PR author):

Ok. Do you think we'd need to delete them entirely, or just add a better interface?

Reply (Member):

Yeah, just don’t want to commit exactly to these yet
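
For context (not part of the diff), a minimal example of the `Threads.Atomic` interface these comments refer to, as it exists at the time of this PR and subject to the change discussed:

    using Base.Threads: Atomic, atomic_add!, @threads

    # A shared counter updated atomically from several threads; without
    # Atomic, the increments would be a data race.
    acc = Atomic{Int}(0)
    @threads for i in 1:1000
        atomic_add!(acc, 1)
    end
    acc[]   # 1000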

## Caveats

At this time, most operations in the Julia runtime and standard libraries
are thread safe.
Reviewer (Member):

Suggested change:
- are thread safe.
+ are thread safe, except for `eval`.

Reply (Member):

If we’re going to specifically call out one thing here, I think it should be the data structures. `eval` is covered by the top-level bullet below (but could probably be specifically called out).

@JeffBezanson (Member, PR author):

Right, the idea is for the subsequent list to cover all the exceptions to this. I called out `include` because that will generally involve many kinds of top-level expressions and so probably isn't safe, but I don't think it's useful to say that `eval` is unsafe. Is `eval(:(1+2))` unsafe?

Reply (Member):

I think the general idea is that if it can't be done without `eval`, it's unsafe right now, whereas something that could have been written as a closure or similar is fine.

@JeffBezanson (Member, PR author):

Ok, that's a clearer rule. Maybe we could be even more specific and list some things like method definitions, global assignments, ... ?
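
As a rough illustration of the rule described above (an added sketch, not wording from the thread):

    # Evaluating a pure expression touches no shared global state:
    eval(:(1 + 2))

    # These are the kinds of top-level effects the caveat is about, and the
    # cases to avoid running concurrently from multiple threads:
    eval(:(global counter = 0))   # global assignment
    eval(:(double(x) = 2x))       # method definition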

running in other threads that are allocating memory. In these cases it may
be necessary to insert a manual call to `GC.safepoint()` to allow GC to run.
This limitation will be removed in the future.
* Avoid using finalizers in conjunction with threads, particularly if they
Reviewer (Member):

This item feels like it sticks out in the list as being unclear and unreasonable to state. I think we need to be more explicit here about the fact that finalizers run concurrently with other code (and kind of always have). I can work on some wording and sample code if that helps.

@JeffBezanson (Member, PR author):

Yes, that would be good, thanks.

Reviewer (Member):

Suggested change:
- * Avoid using finalizers in conjunction with threads, particularly if they
+ * Be aware that some libraries' finalizers may sometimes break badly if
+   threads are enabled (though they are likely already broken). This may require
+   some transitional work across the ecosystem before threading can be widely
+   adopted with confidence. Further details on this are below.

## Safe use of Finalizers

Because finalizers can interrupt any code, they must be very careful in how
they interact with any global state. Unfortunately, the main reason finalizers
are used is to update global state (a pure function is generally rather
pointless as a finalizer). This leads to a bit of a conundrum, but there are a
few approaches to dealing with it:

1. When single-threaded, code could call the internal
`jl_gc_enable_finalizers` function to prevent finalizers from being scheduled
inside a critical region. Internally, this is used inside some functions (such
as our C locks) to prevent recursion when doing certain operations (incremental
package loading, codegen, etc.). Note that the combination of an (inefficient)
spinlock and setting this flag can be used to make finalizers safe.

2. A second strategy, employed by Base in a couple of places, is to explicitly
try to acquire the lock non-recursively. For example, `WeakKeyDict` and
`Distributed.finalize_ref` take this approach (though both only handle the
single-threaded case correctly). The example below is updated to work in the
multi-threaded case:

        function finalize_ref(r::AbstractRemoteRef)
            if r.where > 0 # Nothing to do if cleanup has already run
                if islocked(client_refs) || !trylock(client_refs)
                    # delay finalizer for later if we aren't free to acquire the lock
                    finalizer(finalize_ref, r)
                    return nothing
                end
                try # `lock` should always be followed by `try` in correct code
                    if r.where > 0 # Must check again here
                        # Do actual cleanup here
                        r.where = 0
                    end
                finally
                    unlock(client_refs)
                end
            end
            nothing
        end

3. A related third strategy is to use a yield-free queue. We don't currently
have a lock-free queue implemented in Base, but
`Base.InvasiveLinkedListSynchronized{T}` is suitable. This can frequently be a
good strategy to use for code with event loops. For example, this strategy is
employed by `Gtk.jl` to manage lifetime ref-counting. In this approach, we
don't do any explicit work inside the `finalizer`, and merely add it to a queue
to be dealt with at a safer time. In fact, `schedule` already works this way,
so defining the finalizer as `x -> @spawn do_cleanup(x)` is one example of this
approach. Note however that this doesn't control which thread `do_cleanup`
runs on, so `do_cleanup` would still need to acquire a lock. That doesn't need
to be true if you implement your own queue, since you can explicitly drain that
queue only from your own thread. Additionally, instead of the `@spawn`
convenience macro, you could manually schedule it on a specific thread:

        t = @task do_cleanup(x)
        ccall(:jl_set_task_tid, Cvoid, (Any, Cint), t, 0) # pin the task to thread 1 (the C API numbers threads from 0)
        schedule(t)

@JeffBezanson (Member, PR author):

That's excellent, thanks. I'm not sure we want to mention `jl_set_task_tid`, since then it might be considered "documented" and we will have to keep it.

Reply (Member):

I’m okay with that, though we might want to make it look less like a hack
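
To make item 3 of the suggestion above concrete, here is a sketch of the `x -> @spawn do_cleanup(x)` style; `LibHandle`, `release_handle`, and `do_cleanup` are placeholders for illustration, not APIs in Base:

    using Base.Threads: @spawn

    # Stand-ins for a real external resource and its cleanup routine.
    release_handle(ptr) = nothing
    do_cleanup(h) = release_handle(h.ptr)

    mutable struct LibHandle
        ptr::Ptr{Cvoid}
        function LibHandle(ptr)
            h = new(ptr)
            # The finalizer does no real work itself; it only schedules a task
            # (safe, since `schedule` uses a yield-free queue), and the actual
            # cleanup runs later, acquiring any locks it needs at that point.
            finalizer(x -> @spawn(do_cleanup(x)), h)
            return h
        end
    end

    LibHandle(C_NULL)   # cleanup will run via a spawned task after GC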

NEWS.md: review comment marked outdated and resolved.
@JeffBezanson merged commit d07fadf into master on May 5, 2020, and deleted the jb/threadstatus branch.