Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add scoped threads to the standard library #2647

Closed
wants to merge 7 commits into from
Closed

Add scoped threads to the standard library #2647

wants to merge 7 commits into from

Conversation

ghost
Copy link

@ghost ghost commented Feb 26, 2019

Rendered

Add scoped threads to the standard library that allow one to spawn threads borrowing variables from the parent thread. Example:

let var = String::from("foo");

thread::scope(|s| {
    s.spawn(|_| println!("borrowed from thread #1: {}", var));
    s.spawn(|_| println!("borrowed from thread #2: {}", var));
})
.unwrap();

@pitdicker
Copy link
Contributor

Scoped threads in Crossbeam have matured through years of experience and today we have a design that feels solid enough

I am a little curious about this. Can you say something about the history of scoped threads in Crossbeam? In what way did it change to mature over the years?

@ghost
Copy link
Author

ghost commented Feb 26, 2019

@pitdicker Sure!

There are two designs scoped threads went through. The old one is from the time thread::scoped() got removed and we wanted a sound alternative for the Rust 1.0 era. The new one is from the last year's big revamp.

Old: https://docs.rs/crossbeam/0.2.12/crossbeam/fn.scope.html
New: https://docs.rs/crossbeam/0.7.1/crossbeam/fn.scope.html

There are several differences between the old and new scoped threads:

  1. scope() now returns a thread::Result<R> rather than R. This is because panics in the old design were just silently ignored, which is not good. By returning a Result, the user can handle panics in whatever way they want.

  2. The closure passed to Scope::spawn() now takes a &Scope<'env> argument that allows you to spawn nested threads, which was not possible with the old design. Rayon similarly passes a reference to child tasks.

  3. We removed Scope::defer() because it's not really useful, had bugs, and had a non-obvious behavior.

  4. Some soundness bugs around lifetimes got fixed. EDIT: Those could've just been bugs in the borrow check.

@jonas-schievink jonas-schievink added T-libs-api Relevant to the library API team, which will review and decide on the RFC. A-threads Threading related proposals & ideas labels Feb 26, 2019
@Ixrec
Copy link
Contributor

Ixrec commented Feb 26, 2019

As currently written, the motivation's definitely lacking for me. All the arguments given in the RFC could be made for any other widely used crate (except the historical precedent that this used to be in std, which imo isn't a compelling argument on its own). Unfortunately, I don't know anything about crossbeam, but I strongly suspect we have better arguments that we could put in the RFC.

The usual arguments I find compelling for putting something in std are:

  1. It needs a separate implementation for every platform, so providing an official, standard implementation makes the whole ecosystem more "portable by default".

This is why things like std::thread belong in std, which I hope is uncontroversial. But I have no idea if this also applies to scoped threads. Do crossbeam's scoped threads involve abstracting over any OS-specific primitives that regular threads don't?

  1. It cannot be implemented "properly" (whether that's for performance or correctness or ergonomic reasons) without using unsafe code.

I'm guessing this applies to crossbeam, but I really don't know. How much non-trivial unsafety is there is crossbeam's scoped threads implementation?

Another argument that's not always compelling but usually moves the needle for me is an analogy with stuff that's already in std. The fact that we recent-ish-ly adopted parking_lot impls of std's concurrency primitives seems like it might be analogous to adopting crossbeam for scoped threads, assuming crossbeam scoped threads are not abstracting over OS primitives and there simply are no OS primitives for scoped threads (the latter feels like a safe assumption since "scope" is not a thing at the OS level, but again I really don't know enough about crossbeam (or parking_lot) to tell how much of this would form a cogent argument).

@ghost
Copy link
Author

ghost commented Feb 26, 2019

@Ixrec Thank you for the thoughtful response!

Do crossbeam's scoped threads involve abstracting over any OS-specific primitives that regular threads don't?

They don't.

How much non-trivial unsafety is there is crossbeam's scoped threads implementation?

There's some crazy unsafety in there that was incredibly tricky to get right. Suffice to say that even after the leakpocalypse, Crossbeam had unsound scoped threads for three years! That's pretty non-trivial. :) EDIT: Perhaps that was a soundness hole in the borrow checker?

Another argument that's not always compelling but usually moves the needle for me is an analogy with stuff that's already in std.

I see the lack of scoped threads more as a gaping hole in std. If one of Rust's selling points is that it erases the boundary between borrowing and threads, why does spawn() still require F: 'static?

We even have this unstable and unsafe std::thread::Builder::spawn_unchecked() function, which is supposed to ease the pain of building scoped thread abstractions. But we could instead have a simple and proper abstraction in std... shrug

@danielhenrymantilla
Copy link

danielhenrymantilla commented Feb 26, 2019

I agree that the fact the default example of concurrency in Rust, the language that maximizes zero-cost abstractions, uses Arc to share data around, does look surprising and disappointing to me. That's why I think scoped threads deserve a special treatment that maybe other popular crates don't.

@Centril
Copy link
Contributor

Centril commented Feb 26, 2019

There's some crazy unsafety in there that was incredibly tricky to get right. Suffice to say that even after the leakpocalypse, Crossbeam had unsound scoped threads for three years! That's pretty non-trivial. :)

@stjepang Given this, my only comment is that I would like there to be an equal crazy amount of comments justifying the usage of every unsafe impl and unsafe { ... } in the code that we add. Also, if @RalfJung and/or @jhjourdan et al. can have a look at the implementation that would be nice.

@mark-i-m
Copy link
Member

Personally, I've wanted scoped threads a few times, and having them in libstd seems appropriate to me. I agree that restricting threads to 'static feels oddly underpowered and restrictive given the awesomeness and flexibility of the rest of libstd.

That said, I also don't mind adding crossbeam as a dependency when I need this. Often enough, I'm already using crossbeam for something else in the same project.

Also, @stjepang thanks for all your work on crossbeam! I'm continually impressed by the high quality of the concurrency primitives in there... one of my favorite crates.

@Diggsey
Copy link
Contributor

Diggsey commented Feb 27, 2019

Presumably if a scoped thread returns a value, but is never joined, the return value is dropped at the end of the scope? Or is it dropped when the scoped thread handle is dropped? Might be worth specifying that just for completeness.

Motivation-wise, the fact that it improves the basic concurrency examples so much is the argument I find most convincing for it to be in std. Especially with how important those examples are for showing off the power of the language.

@matklad
Copy link
Member

matklad commented Feb 27, 2019

Another reason to have this in std is that scoped threads are arguably a better default than non-scoped threads. With std::Thread, it is possible (I did this in a blog post 😭) to forget to call .join, and that is problematic because then the thread might be abruptly terminated when main thread exits, which prevents drops from occurring. Scoped threads guarantee that the thread is joined.

EDIT: obligatory link to the structured concurrency blog post.

Crossbeam has had scoped threads since Rust 1.0.

I believe this was something like 1.2 though :) crossbeam-rs/crossbeam@c0714a5

@RalfJung
Copy link
Member

There's some crazy unsafety in there that was incredibly tricky to get right. Suffice to say that even after the leakpocalypse, Crossbeam had unsound scoped threads for three years! That's pretty non-trivial. :)

Seems I forgot about that or missed it... do you have a pointer for the bug that was only discovered after 3 years?

@RalfJung
Copy link
Member

The closure passed to Scope::spawn() now takes a &Scope<'env> argument that allows you to spawn nested threads, which was not possible with the old design. Rayon similarly passes a reference to child tasks.

I am a bit surprised that the thread's closure cannot just use the outer spawn handle. I suppose that is because the closure must outlive 'env but we only have the handle for the anonymous lifetime in &'_ Scope<'env>.

That makes me wonder why you let the user choose 'env. The user might pick a 'env that is bigger than what one might expect, i.e., I could get a &Scope<'static>. There's nothing unsound about that from what I can see (it just means that all spawned closures must outlive 'static), but it does mean that your diagram

'variables_outside: 'env: 'scope: 'variables_inside

is somewhat misleading. 'env can actually be arbitrarily bigger than what is indicated here.

What I would have expected is something more like

fn scope<F, T>(f: F) -> Result<T>
where
    F: for<'env> FnOnce(&'env Scope<'env>) -> T;

impl<'env> Scope<'env> {
    fn spawn<'scope, F, T>(&'scope self, f: F) -> ScopedJoinHandle<'scope, T>
    where
        F: FnOnce() -> T + Send + 'env,
        T: Send + 'env;
}

Wouldn't that also let the spawned thread use the outer Scope, because now it would outlive 'env? In fact at this point one should probably have some kind of type alias for &'env Scope<'env> to avoid the repetition. I suppose that doesn't work but I'd like to see more documentation about why.

@ghost
Copy link
Author

ghost commented Feb 27, 2019

Replying to @Centril, @Diggsey, and @RalfJung

Given this, my only comment is that I would like there to be an equal crazy amount of comments justifying the usage of every unsafe impl and unsafe { ... } in the code that we add.

Absolutely! We have plenty of comments already, but could be even more thorough 👍

Presumably if a scoped thread returns a value, but is never joined, the return value is dropped at the end of the scope? Or is it dropped when the scoped thread handle is dropped?

The return value cannot be dropped when the handle is dropped because at that point the thread may not be finished yet. :) So it gets dropped whenever the thread is joined, which could be automatically at the end of the scope.

Seems I forgot about that or missed it... do you have a pointer for the bug that was only discovered after 3 years?

It's here: crossbeam-rs/crossbeam-utils#36

The issue is that we need to paremetrize ScopedJoinHandle over 'scope so that it doesn't escape the scope.

Reproducible example:

fn main() {
    let mut greeting = "Hello world!".to_string();
    let res = crossbeam::scope(|s| s.spawn(|| &greeting)).join();

    greeting = "DEALLOCATED".to_string();
    drop(greeting);

    println!("thread result: {:?}", res);
}

Output: thread result: pjQ7VTED

However, today the compiler emits warnings, so maybe that was just an issue with the old borrowck and we didn't know back then:

  = warning: this error has been downgraded to a warning for backwards compatibility with previous releases
  = warning: this represents potential undefined behavior in your code and this warning will become a hard error in the future

Now I have a feeling we probably don't need to parametrize ScopedJoinHandle over 'scope anyway and might even use JoinHandle instead of ScopedJoinHandle, making the API even smaller!

I am a bit surprised that the thread's closure cannot just use the outer spawn handle. I suppose that is because the closure must outlive 'env but we only have the handle for the anonymous lifetime in &'_ Scope<'env>.

Yes, you got that right.

The user might pick a 'env that is bigger than what one might expect, i.e., I could get a &Scope<'static>.

Correct.

I believe the problem with

fn scope<F, T>(f: F) -> Result<T>
where
    F: for<'env> FnOnce(&'env Scope<'env>) -> T;

impl<'env> Scope<'env> {
    fn spawn<'scope, F, T>(&'scope self, f: F) -> ScopedJoinHandle<'scope, T>
    where
        F: FnOnce() -> T + Send + 'env,
        T: Send + 'env;
}

is that, while in theory it should work, the borrowck only does local reasoning and cannot infer all the relationships between lifetimes.

More concretely, in a spawn() invocation, 'env could be literally anything and the borrowck cannot figure out whether F: 'env is true. If you try scoped threads with this signature on a simple example, it won't compile:

error[E0373]: closure may outlive the current function, but it borrows `*counter`, which is owned by the current function
  --> crossbeam-utils/tests/thread.rs:17:34
   |
17 |         let handle = scope.spawn(|| {
   |                                  ^^ may outlive borrowed value `*counter`
18 |             counter.store(1, Ordering::Relaxed);
   |             ------- `*counter` is borrowed here
help: to force the closure to take ownership of `*counter` (and any other referenced variables), use the `move` keyword
   |
17 |         let handle = scope.spawn(move || {
   |                                  ^^^^^^^

But if we introduce a new lifetime with fn scope<'env, F, T>, then borrowck knows that 'env is a concrete lifetime that outlives this scope() invocation. I'm not an expert at borrow checking, so apologies for the handwavy explanation.

@RalfJung
Copy link
Member

Reproducible example:

So, what this does is it joins after the scope has ended. And why is that a problem? Is it because the scope has already joined, so now we are double-joining?

Output: thread result: pjQ7VTED

However, today the compiler emits warnings, so maybe that was just an issue with the old borrowck and we didn't know back then:

Oh, so it forgets that greeting is borrowed while res is live? Yeah that looks like a rustc bug. Seems that can lead to UB even in safe code.

However I have not managed to actually get the warning you were talking about, can you reproduce that in a self-contained example?

More concretely, in a spawn() invocation, 'env could be literally anything and the borrowck cannot figure out whether F: 'env is true. If you try scoped threads with this signature on a simple example, it won't compile:

Oh, good point. With 'env quantified like that, it is our choice as the implementer of spawn which lifetime we use there -- and there's no upper bound, so nothing will outlive that lifetime.

Looks like indeed quantifying it the way you did is expressing the right thing. Ideally we would communicate to the borrow checker that 'env is equal to the lifetime of the scope closure, but that does not seem possible.

@RalfJung
Copy link
Member

Oh, turns out my code does show a migration warning, but playground doesn't show it because there's non-UTF-8 data.

@stjepang but you said you had an example that would actually compile with Rust 2015? Mine does not.

@ghost
Copy link
Author

ghost commented Feb 27, 2019

@RalfJung

So, what this does is it joins after the scope has ended. And why is that a problem? Is it because the scope has already joined, so now we are double-joining?

This is not a problem if we ensure the scope joins the thread and saves the returned result in an Arc<Mutex<_>> and join_handle.join() simply extracts the saved result. Do you agree?

I honestly don't remember anymore why I wanted to prevent join handles from escaping the scope - maybe UB was possible in an older rustc version because the borrowck had a soundness hole? Or maybe I was just overly cautious and conservatively made ScopedJoinHandle generic over 'scope.

Oh, turns out my code does show a migration warning, but playground doesn't show it because there's non-UTF-8 data.

Haha, that's the magic of UB! ✨ If you click "..." next to the "RUN" button and choose "Build", the playground will show warnings.

@ghost
Copy link
Author

ghost commented Feb 27, 2019

All right, I've merged JoinHandle and ScopedJoinHandle together, which makes the API smaller and simplifies the lifetime soup. Also expanded the section on rationale and drawbacks with some additional motivation.

@anp
Copy link
Member

anp commented Feb 27, 2019

@pitdicker Sure!

There are two designs scoped threads went through. The old one is from the time thread::scoped() got removed and we wanted a sound alternative for the Rust 1.0 era. The new one is from the last year's big revamp.

Old: https://docs.rs/crossbeam/0.2.12/crossbeam/fn.scope.html
New: https://docs.rs/crossbeam/0.7.1/crossbeam/fn.scope.html

There are several differences between the old and new scoped threads:

  1. scope() now returns a thread::Result<R> rather than R. This is because panics in the old design were just silently ignored, which is not good. By returning a Result, the user can handle panics in whatever way they want.
  2. The closure passed to Scope::spawn() now takes a &Scope<'env> argument that allows you to spawn nested threads, which was not possible with the old design. Rayon similarly passes a reference to child tasks.
  3. We removed Scope::defer() because it's not really useful, had bugs, and had a non-obvious behavior.
  4. Some soundness bugs around lifetimes got fixed.

This is a really interesting and useful summary! Would it be reasonable to include this in the prior art section? Having an official doc summarizing the path taken here would be awesome.

@Lokathor
Copy link
Contributor

Neat, but I deeply object to the "None" under Future Possibilities: The obvious next step is to have a ScopedThreadPool be put into std on top of this.

@oliver-giersch
Copy link

It's been a while since I've had anything to do with scoped threads, so I implemented a minimal version without nested spawns the way I remembered things:

https://gist.github.com/oliver-giersch/8878d769e47bd97b96aa6833f01d91eb

I don't see how you could use the regular JoinHandle type for scoped threads. What exactly prevents join handles from being leaked outside of the closure and how would you express the fact, that either the user or the scope function can call join on the handle?

@ghost
Copy link
Author

ghost commented Mar 2, 2019

I don't see how you could use the regular JoinHandle type for scoped threads

A JoinHandle could be represented as:

pub struct JoinHandle<T>(Inner<T>);

enum Inner<T> {
    NormalHandle(/* ... */),
    ScopedHandle(/* ... */),
}

What exactly prevents join handles from being leaked outside of the closure

There's nothing wrong with leaking join handles outside of the closure as long as the thread is joined before the scope is exited.

how would you express the fact, that either the user or the scope function can call join on the handle?

Both JoinHandle and scope() would hold a reference to the actual join handle Arc<Mutex<ActualJoinHandle<()>> and an Arc<Mutex<io::Result<T>>> that gets filled with the join result. The thread will be joined either when JoinHandle is manually joined or when scope() exits, whatever happens first.

Note that join_handle.join() will join the thread if called inside the scope, or return the result saved by scope() if called outside the scope.

@oliver-giersch
Copy link

oliver-giersch commented Mar 3, 2019

I see, I wasn't aware that this RFC proposes to change to current structure/implementation of the regular JoinHandle, maybe this part could be made a little clearer.

However, I am doubtful that this could work without breaking the current API. Currently, JoinHandle has two methods:

pub fn thread(&self) -> &Thread { ... }
pub fn join(self) -> Result<T, Box<dyn Any + Send + 'static>> { ... }

I am not certain a 'leaked' scoped join handle could return a &Thread reference, after its thread has already been joined, for instance. The return type would have to be changed to Option<&Thread> I suppose.
Another problematic case is the Err case for the join result. I came across this when I tried implementing something like proposed change, even when using separate ScopedJoinHandle type.
Consider the case of a panicking scoped thread and a leaked join handle:
At the end of the scope, the 'inner' join handle is definitely joined and an Err(Box<Any + Send + 'static>) is returned due to the panic. This boxed error is moved into a Vec with potential other errors and the vector returned by the scope function.
When join() is called on a leaked scoped join handle, it can not return the same boxed error, since it has already been returned by the scope function.

I suppose it would be possible to split the Result<T, Box<Any + Send + 'static>> into two separate results, one to be returned by the scope function (with Ok(()) in case of a successful join) and one to be returned by the join handle (with Err(())), but this would also be incompatible with the current API.

EDIT
Perhaps it would be best to let the scope function to return a different error in this case and allow the 'leaked' (scoped) join handle to return the actual error.

@bjorn3
Copy link
Member

bjorn3 commented Mar 3, 2019

I am not certain a 'leaked' scoped join handle could return a &Thread reference, after its thread has already been joined, for instance. The return type would have to be changed to Option<&Thread> I suppose.

A thread could already have exited before calling .thread() on a JoinHandle. As far as I know that doesn't crash or panic either.

@oliver-giersch
Copy link

I am not certain a 'leaked' scoped join handle could return a &Thread reference, after its thread has already been joined, for instance. The return type would have to be changed to Option<&Thread> I suppose.

A thread could already have exited before calling .thread() on a JoinHandle. As far as I know that doesn't crash or panic either.

You could be correct, I am not sure. However, there may be a difference between a thread having already exited and it being explicitly joined. This is probably the only (potential) issue, that might impede using JoinHandle for scoped threads, the other point I've brought up is likely an implementation detail that could be solved later.

@ghost
Copy link
Author

ghost commented Mar 3, 2019

@oliver-giersch I believe you're right in that it's not possible to extract the panic (the Err result returned by join()) from a JoinHandle that has escaped the scope. For that reason, I've reverted the last change and am now convinced it's best not to allow join handles to escape the scope.

Alternatively, we could change behavior of scope() to return a T rather than Result<T> so that the scope only waits for threads to finish rather than joins threads. However, with that API, one would have to manually join all threads and unwrap the results to propagate panics into the parent thread, which is less ergonomic and more prone to errors.

@oliver-giersch
Copy link

@oliver-giersch I believe you're right in that it's not possible to extract the panic (the Err result returned by join()) from a JoinHandle that has escaped the scope. For that reason, I've reverted the last change and am now convinced it's best not to allow join handles to escape the scope.

Alternatively, we could change behavior of scope() to return a T rather than Result<T> so that the scope only waits for threads to finish rather than joins threads. However, with that API, one would have to manually join all threads and unwrap the results to propagate panics into the parent thread, which is less ergonomic and more prone to errors.

I do not know how relevant this implementation stuff is for the RFC at this point, so reign me in if this goes to far at this point ;)
It is possible to extract the error, but it is rather complicated and not necessarily obvious behaviour.
When all threads are joined at the end of the scope, e.g. Arc::strong_count can be used to check if there exists a leaked join handle. If so, the result can be stored so the leaked handle can extract it later and a boxed dummy error returned by the scope.

The overall benefit from being able to leak join handles soundly does not outweigh this, so I'd definitely agree with scrapping this idea and keeping a dedicated and unleakable ScopedJoinHandle<'scope, T>.

@oliver-giersch
Copy link

I would also suggest changing the return type of the scope function to Result<T, Box<[Box<dyn Any + Send + 'static>]>>, since I don't think there is a good reason (semantically) why this function should return a thread::Result. The return value of the scope itself is not necessarily thread related and can in fact be almost anything (the only bound is T: 'env).
Also, the alternative signature makes it clearer that actually the collected failure results of (potentially) several threads are returned. With Box<Any> you need to know (from reading the documentation), which type is actually returned and then make the correct downcast (which is unergonomic). I'd assume most people would at most want to know the number of panicked threads (i.e. the length of the vector/slice) and returning a boxed slice allows you to get this value easily. Using a boxed slice also communicates intent of the type better, since there is no reason why anybody would want to e.g. add more elements to a vector of returned errors. Last but not least, it also allows us to avoid the awkward boxing of a Vec just to shoehorn it into this specific type alias.

@bstrie
Copy link
Contributor

bstrie commented Jan 23, 2020

@stjepang Been a while since there's been any activity here, do you still feel good about this RFC? If so, perhaps we could consider pinging the libs team to proceed to the next steps.

@ghost
Copy link
Author

ghost commented Jan 23, 2020

@bstrie I still feel good about it -- let's ping the libs team.

@blkerby
Copy link

blkerby commented Jun 19, 2020

This is echoing some earlier arguments in favor of scoped threads in std, but I've noticed that the documentation for std::sync module is filled with examples involving Arc (in PoisonError, Mutex, WaitTimeoutResult, Barrier, RwLock, and Condvar), and if I'm not mistaken, every one of these uses of Arc is unnecessary and could be eliminated if scoped threads were available. A similar thing could be said for the way that Arc is used in introductory Rust materials outside of the docs. The current situation gives new Rust users a misleading impression that Arc is indispensible in almost every multithreaded scenario. There's also an example in std::atomic where an AtomicUsize is put behind an Arc in order to share across threads; this one is amusing because in order to use an AtomicUsize one has to create two more AtomicUsizes inside the Arc, which of course (with scoped threads) isn't needed at all.

@steveklabnik
Copy link
Member

I would like to register strong support here. I can't count the number of examples and documentation that would make use of scoped threads, and be way better for it.

Furthermore, at least in the kind of code that I write, often, scoped threads are what I actually want, rather than thread::spawn. thread::spawn's signature is more general, but that means that it can be used in less situations, and I rarely need the power of thread::spawn.

@richardjharris
Copy link

As a beginner to Rust, I spent a fair amount of time implementing a TCP daemon with multiple threads using standard threading (as covered in the "The Rust Programming Language" book). But actually the workers need to live and die within the scope of a single run method. Had I known I could use scoped threads earlier, I would have implemented it that way, and not had to wrap everything in Arc.

If it doesn't end up in the standard library, I think it would be nice for the docs to at least mention that scoped threads are a thing.

@KodrAus KodrAus added the Libs-Tracked Libs issues that are tracked on the team's project board. label Jul 29, 2020
@yaahc
Copy link
Member

yaahc commented Jun 9, 2021

We're likely going to discuss this RFC in today's libs team meeting. To prepare I want to summarize the existing comments to give an idea of the issues and questions that have been raised so far.

  • comment: does an API like this belong in std and more generally what is our metric for determining what belongs in std?
  • comment: we need to ensure the unsafe code in the implementation code is carefully audited
  • comments [1] [2]: Ideally we'd be able to use the outer spawn handle from inner scopes but we're unclear if we can, and it would require new language / borrow checker features
  • comment: there's no particular reason we should restrict ourselves to using std::thread::Result as the return type

@danielhenrymantilla
Copy link

danielhenrymantilla commented Jun 9, 2021

FWIW, I've looked into that third bullet point of yours, @yaahc, and I believe I've been able to come with a sound design that allows to implicitly capture scope. The trick was twofold:

  • bind the inner closure with 'scope rather than 'env (this allows them to capture scope (issue that [1] attempted to tackle), although it does now require the closure to be move);

  • but in order to be sound we then need to prevent 'scope from shrinking, which scope: &'scope …<'env> does not achieve. So I made the following changes:

    • Scope<'env> -> Inner<'env>,

    • and &'scope Inner<'env> -> Scope<'scope, 'env>,
      with this new Scope being a Copy wrapper around &Inner which makes 'scope invariant.

At that point the outer closure (the one handling the whole scope) takes a Scope parameter with a higher order 'scope lifetime (but one which is still upper-bounded by 'env, this avoids the issues of the comment [2]).

  • Playground

  • Some extra remarks

    'env is thus still caller-chosen, but this is fine, the idea is that an unspecified 'env will "lead the borrowck to make 'env as small as possible in order to maximize flexibility", that is, 'env will exactly span around the scope(…) call and nothing more (nor less!). 'scope is higher-order (w.r.t. F), so F is necessarily unable to capture or return anything 'scope-infected whatsoever. But it is also, thanks to the invariance, guaranteed to be bigger than any span inside the provided closure (hence why it is sound to loosen the : 'env requirement on the inner threads down to 'scope).

    TL,DR / Summary

    'scope thus achieves the following two key properties:

    • w.r.t. inner closures, 'scope might be as big as 'env, so besides allowing them to capture the scope handle itself, they are as restricted as with an 'env bound.

    • w.r.t. the outer closure, it can't depend on 'scope, due to it being a higher-order lifetime, so the scope handles can't escape that closure.

      • w.r.t. the locals outside the scope(…) call that get captured by that outer closure, we do have that 'env ≥ 'scope bound that prevents the higher-order-ness of 'scope from messing things up.

So || inner closures are possible, but at the cost of requiring them to be move, which can be a bit annoying w.r.t. capturing 'env locals by reference. That being said, I do believe that explicit closure captures (move(scope) || { … }) could soothe that. But at that point, are we really worse than with |scope| { … }? 😄)

@joshtriplett
Copy link
Member

We discussed this in today's @rust-lang/libs meeting.

We have consensus that we want this in the standard library, for a variety of reasons, not least of which being able to write tests, examples, and similar.

We talked about panic handling behavior, and came to the consensus that we don't want thread::scope to return a Result; rather, we'd like it to just panic if any of the threads panic. We don't want to build a collection of panic results and provide mechanisms to walk that collection. If people want to handle panics in some way other than propagating them, they can call join on the handles, which provides flexibility on how precisely to handle those panics, in what order, and so on.

Since the original author of this RFC is no longer on GitHub, we'd appreciate it if someone would pick this up, update it based on the feedback in this comment, and re-submit it.

@nikomatsakis
Copy link
Contributor

I'm enthusiastic about the concept of scoped threads being in the standard library, but I want to add a concern: I'd like to make the same concepts work in async, or at least explore what that will require. It's been a busy couple of weeks, but that's been part of the work that I've been thinking might come out of the vision doc (and, if we do so, I was expecting to think also about the sync case, as I would like to always be moving in the direction of more analogous behavior).

@joshtriplett
Copy link
Member

I would love to see the same concept work for async. Is there any particular design constraint that would necessitate blocking one on the other, though? If the function was to return some special error type that collected multiple panics, it might make sense to make sure the design allows us to reuse that type for both sync and async. However, in the absence of that, is there any part of the design that would be shared between the two?

@danielhenrymantilla
Copy link

danielhenrymantilla commented Jun 11, 2021

The lifetimes and most of the API would be similar, but there is a huge detail in async which seems non-trivial: making sure the wait_group.wait() point is reached, or that proper cleanup is else ensured.

That is, consider:

fn main ()
{
     let ref a = (|| 42)(); // avoid static promotion
     let async_main = async {
         let ref b = (|| 27)();
         let scope_fut = task::scope(move |scope| async move {
             scope.spawn(async move { println!("{}, {}", *a, *b); }); // potentially in another thread
             fut_that_yields_pending_at_least_once().await;
             /* inserted by `task::scope`'s "runtime":
             ….spawned_tasks.await; // */
         });
         pin_mut!(scope_fut);
         ::futures::future::poll_fn(|cx| {
             // poll it until reaching a point where `scope.spawn(…)` has been called,
             // so that the subtask gets enqueued onto a background thread
             match scope_fut.as_mut().poll(cx) {}
             Poll::Ready(())
         }).await;
     });
     ::futures::executor::block_on(async_main);
}

Correct me if I am wrong, but I believe such code could suffer from a race condition, whereby the other thread may start spinning the async { println!("{}, {}", *a, *b); } task, and before trying to read the values of *a and *b, the async_main future completes and gets dropped (making b dangle), and then even *a gets dropped.

So, I simply fail to see a foolproof task::scope(…) API that would yield a future and that would be able to ensure a wait on all the spawned tasks.

On the other hand, we can imagine a futures::executor::block_on-like API, but yielding a scope handle with which to capture locals outside that block_on call. I guess that could already be quite useful:

fn main ()
{
     let ref a = (|| 42)();
     some::executor::scoped_block_on(move |scope| async move {
         let ref b = (|| 27)();
         let handle = scope.spawn(async move {
             // println!("{}, {}", *a, *b); /* Error, cannot refer to `b` */
             println!("{}", *a);
         });
         /* stuff */
         // optional:
         handle.await;
     });
}

But this means that the implementation of such a function is now tied to an executor 😕


All in all, I find the async variant way more subtle due to code paths not being guaranteed to be run, and thus agree with @joshlf that it may be better to keep the sync/threaded attempt and the async one untangled.

@bstrie
Copy link
Contributor

bstrie commented Jul 19, 2021

I've filed a fresh PR for scoped threads here: #3151

@liigo
Copy link
Contributor

liigo commented Jul 20, 2021

Since the original author of this RFC is no longer on GitHub

May be a bit off topic, but who knows what happened to Stjepan Glavina (stjepang), the original author of this RFC, Smol and many other great projects. He also deleted his twitter account a few months ago. I miss him.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-threads Threading related proposals & ideas Libs-Tracked Libs issues that are tracked on the team's project board. T-libs-api Relevant to the library API team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.