Add scoped threads to the standard library #2647
I am a little curious about this. Can you say something about the history of scoped threads in Crossbeam? In what way did it change to mature over the years? |
@pitdicker Sure! There are two designs scoped threads went through. The old one is from the time Old: https://docs.rs/crossbeam/0.2.12/crossbeam/fn.scope.html There are several differences between the old and new scoped threads:
|
As currently written, the motivation's definitely lacking for me. All the arguments given in the RFC could be made for any other widely used crate (except the historical precedent that this used to be in std, which imo isn't a compelling argument on its own). Unfortunately, I don't know anything about crossbeam, but I strongly suspect we have better arguments that we could put in the RFC. The usual arguments I find compelling for putting something in
This is why things like
I'm guessing this applies to crossbeam, but I really don't know. How much non-trivial unsafety is there in crossbeam's scoped threads implementation? Another argument that's not always compelling but usually moves the needle for me is an analogy with stuff that's already in |
@Ixrec Thank you for the thoughtful response!
They don't.
There's some crazy unsafety in there that was incredibly tricky to get right.
I see the lack of scoped threads more as a gaping hole in We even have this unstable and unsafe |
I agree that the fact the default example of concurrency in Rust, the language that maximizes zero-cost abstractions, uses |
@stjepang Given this, my only comment is that I would like there to be an equally crazy amount of comments justifying the usage of every |
Personally, I've wanted scoped threads a few times, and having them in That said, I also don't mind adding Also, @stjepang thanks for all your work on crossbeam! I'm continually impressed by the high quality of the concurrency primitives in there... one of my favorite crates. |
Presumably if a scoped thread returns a value, but is never joined, the return value is dropped at the end of the scope? Or is it dropped when the scoped thread handle is dropped? Might be worth specifying that just for completeness. Motivation-wise, the fact that it improves the basic concurrency examples so much is the argument I find most convincing for it to be in std. Especially with how important those examples are for showing off the power of the language. |
Another reason to have this in std is that scoped threads are arguably a better default than non-scoped threads. With EDIT: obligatory link to the structured concurrency blog post.
I believe this was something like 1.2 though :) crossbeam-rs/crossbeam@c0714a5 |
Seems I forgot about that or missed it... do you have a pointer for the bug that was only discovered after 3 years? |
I am a bit surprised that the thread's closure cannot just use the outer spawn handle. I suppose that is because the closure must outlive That makes me wonder why you let the user choose
is somewhat misleading. What I would have expected is something more like
fn scope<F, T>(f: F) -> Result<T>
where
    F: for<'env> FnOnce(&'env Scope<'env>) -> T;

impl<'env> Scope<'env> {
    fn spawn<'scope, F, T>(&'scope self, f: F) -> ScopedJoinHandle<'scope, T>
    where
        F: FnOnce() -> T + Send + 'env,
        T: Send + 'env;
}
Wouldn't that also let the spawned thread use the outer |
Replying to @Centril, @Diggsey, and @RalfJung
Absolutely! We have plenty of comments already, but could be even more thorough 👍
The return value cannot be dropped when the handle is dropped because at that point the thread may not be finished yet. :) So it gets dropped whenever the thread is joined, which could be automatically at the end of the scope.
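To make the automatic join concrete, here is a minimal sketch. It assumes the `std::thread::scope` API that was stabilized later (Rust 1.63), not the crossbeam API under discussion here, and the function name `count_with_unjoined_threads` is made up for illustration. None of the handles are joined by hand, yet every thread has finished by the time the scope returns:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Spawns `n` scoped threads that each bump a shared counter, never joins
/// any of them explicitly, and returns the count observed after the scope.
/// (Hypothetical helper, written against `std::thread::scope`.)
fn count_with_unjoined_threads(n: usize) -> usize {
    let counter = AtomicUsize::new(0);
    thread::scope(|s| {
        for _ in 0..n {
            // The returned handle is dropped right away; nothing joins it by hand.
            s.spawn(|| {
                counter.fetch_add(1, Ordering::Relaxed);
            });
        }
        // `thread::scope` joins every thread that is still running before it returns.
    });
    counter.load(Ordering::Relaxed)
}
```

Calling `count_with_unjoined_threads(4)` yields `4`, since the scope does not return until all four threads are done.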
It's here: crossbeam-rs/crossbeam-utils#36 The issue is that we need to parametrize Reproducible example:
fn main() {
    let mut greeting = "Hello world!".to_string();
    let res = crossbeam::scope(|s| s.spawn(|| &greeting)).join();
    greeting = "DEALLOCATED".to_string();
    drop(greeting);
    println!("thread result: {:?}", res);
}
Output:
However, today the compiler emits warnings, so maybe that was just an issue with the old borrowck and we didn't know back then:
Now I have a feeling we probably don't need to parametrize
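For contrast with the reproducer above, a minimal sketch of the pattern that keeps the handle inside the scope. It assumes the later `std::thread::scope` API (Rust 1.63), where the handle type `ScopedJoinHandle<'scope, T>` carries the scope's lifetime; `borrow_via_thread` is a hypothetical name. The thread still returns a reference into the borrowed string, but it is joined before the scope ends and only the joined result leaves:

```rust
use std::thread;

/// Runs a scoped thread that returns a reference into `greeting`, joining it
/// inside the scope so that only the result (not the handle) escapes.
/// (Hypothetical helper, written against `std::thread::scope`.)
fn borrow_via_thread(greeting: &str) -> &str {
    thread::scope(|s| {
        // `move` copies the `&str` into the closure; the string data stays borrowed.
        let handle = s.spawn(move || greeting);
        // Joining here means the thread is finished before the scope returns.
        handle.join().expect("scoped thread panicked")
    })
}
```

The returned `&str` keeps the caller's borrow alive, so reassigning or dropping the original `String` while the result is still in use is rejected by the borrow checker.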
Yes, you got that right.
Correct. I believe the problem with
fn scope<F, T>(f: F) -> Result<T>
where
    F: for<'env> FnOnce(&'env Scope<'env>) -> T;

impl<'env> Scope<'env> {
    fn spawn<'scope, F, T>(&'scope self, f: F) -> ScopedJoinHandle<'scope, T>
    where
        F: FnOnce() -> T + Send + 'env,
        T: Send + 'env;
}
is that, while in theory it should work, the borrowck only does local reasoning and cannot infer all the relationships between lifetimes. More concretely, in a
error[E0373]: closure may outlive the current function, but it borrows `*counter`, which is owned by the current function
  --> crossbeam-utils/tests/thread.rs:17:34
   |
17 |         let handle = scope.spawn(|| {
   |                                  ^^ may outlive borrowed value `*counter`
18 |             counter.store(1, Ordering::Relaxed);
   |             ------- `*counter` is borrowed here
help: to force the closure to take ownership of `*counter` (and any other referenced variables), use the `move` keyword
   |
17 |         let handle = scope.spawn(move || {
   |                                  ^^^^^^^
But if we introduce a new lifetime with |
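As a point of comparison for the lifetime discussion, here is a minimal sketch of a nested spawn. It assumes the `std::thread::scope` API as later stabilized (Rust 1.63), which keeps separate `'scope` and `'env` lifetimes on `Scope`; `nested_spawn_count` is a hypothetical name. A spawned thread reuses the outer scope handle to spawn a second thread, and both borrow a counter from the enclosing stack frame:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Spawns a scoped thread that itself spawns a second scoped thread through
/// the same scope handle; both increment a counter borrowed from this frame.
/// (Hypothetical helper, written against `std::thread::scope`.)
fn nested_spawn_count() -> usize {
    let counter = AtomicUsize::new(0);
    // Take the reference up front so the `move` closures copy the reference
    // instead of trying to move the counter itself.
    let counter_ref = &counter;
    thread::scope(|s| {
        // `s` is a shared reference to the scope, so `move` hands each
        // closure its own copy of that reference.
        s.spawn(move || {
            counter_ref.fetch_add(1, Ordering::Relaxed);
            s.spawn(move || {
                counter_ref.fetch_add(1, Ordering::Relaxed);
            });
        });
    });
    // Both threads, including the nested one, are joined by now.
    counter.load(Ordering::Relaxed)
}
```

`nested_spawn_count()` returns `2`: the scope waits for the nested thread as well as the one that spawned it.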
So, what this does is it joins after the scope has ended. And why is that a problem? Is it because the scope has already joined, so now we are double-joining?
Oh, so it forgets that However I have not managed to actually get the warning you were talking about, can you reproduce that in a self-contained example?
Oh, good point. With Looks like indeed quantifying it the way you did is expressing the right thing. Ideally we would communicate to the borrow checker that |
Oh, turns out my code does show a migration warning, but playground doesn't show it because there's non-UTF-8 data. @stjepang but you said you had an example that would actually compile with Rust 2015? Mine does not. |
This is not a problem if we ensure the scope joins the thread and saves the returned result in an I honestly don't remember anymore why I wanted to prevent join handles from escaping the scope - maybe UB was possible in an older rustc version because the borrowck had a soundness hole? Or maybe I was just overly cautious and conservatively made
Haha, that's the magic of UB! ✨ If you click "..." next to the "RUN" button and choose "Build", the playground will show warnings. |
All right, I've merged |
This is a really interesting and useful summary! Would it be reasonable to include this in the prior art section? Having an official doc summarizing the path taken here would be awesome. |
Neat, but I deeply object to the "None" under Future Possibilities: The obvious next step is to have a ScopedThreadPool be put into std on top of this. |
It's been a while since I've had anything to do with scoped threads, so I implemented a minimal version without nested spawns the way I remembered things: https://gist.github.com/oliver-giersch/8878d769e47bd97b96aa6833f01d91eb I don't see how you could use the regular |
A pub struct JoinHandle<T>(Inner<T>);
enum Inner<T> {
    NormalHandle(/* ... */),
    ScopedHandle(/* ... */),
}
There's nothing wrong with leaking join handles outside of the closure as long as the thread is joined before the scope is exited.
Both Note that |
I see, I wasn't aware that this RFC proposes to change the current structure/implementation of the regular However, I am doubtful that this could work without breaking the current API. Currently,
pub fn thread(&self) -> &Thread { ... }
pub fn join(self) -> Result<T, Box<dyn Any + Send + 'static>> { ... }
I am not certain a 'leaked' scoped join handle could return a I suppose it would be possible to split the EDIT |
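For readers unfamiliar with the `join` signature quoted above, a minimal sketch of how the panic payload comes back through it today. This uses the regular `std::thread::spawn` and `JoinHandle::join`, not any scoped handle; the helper name `panic_message_from_join` and the panic message are made up for illustration:

```rust
use std::thread;

/// Spawns a thread that panics and returns the panic message recovered from
/// `JoinHandle::join`, if the payload is a string.
/// (Hypothetical helper for illustration.)
fn panic_message_from_join() -> Option<String> {
    let handle = thread::spawn(|| {
        panic!("worker failed");
    });
    match handle.join() {
        // The thread finished normally; there is no payload to report.
        Ok(()) => None,
        // The `Err` variant carries the boxed panic payload.
        Err(payload) => {
            if let Some(msg) = payload.downcast_ref::<&'static str>() {
                Some((*msg).to_string())
            } else if let Some(msg) = payload.downcast_ref::<String>() {
                Some(msg.clone())
            } else {
                None
            }
        }
    }
}
```

The payload is handed to whoever calls `join`, which is why it matters who joins first when a scope and a leaked handle could both try to.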
A thread could already have exited before calling |
You could be correct, I am not sure. However, there may be a difference between a thread having already exited and it being explicitly joined. This is probably the only (potential) issue, that might impede using |
@oliver-giersch I believe you're right in that it's not possible to extract the panic (the Alternatively, we could change behavior of |
I do not know how relevant this implementation stuff is for the RFC at this point, so rein me in if this goes too far ;) The overall benefit from being able to leak join handles soundly does not outweigh this, so I'd definitely agree with scrapping this idea and keeping a dedicated and unleakable |
I would also suggest changing the return type of the |
@stjepang Been a while since there's been any activity here, do you still feel good about this RFC? If so, perhaps we could consider pinging the libs team to proceed to the next steps. |
@bstrie I still feel good about it -- let's ping the libs team. |
This is echoing some earlier arguments in favor of scoped threads in |
I would like to register strong support here. I can't count the number of examples and documentation that would make use of scoped threads, and be way better for it. Furthermore, at least in the kind of code that I write, often, scoped threads are what I actually want, rather than thread::spawn. thread::spawn's signature is more general, but that means that it can be used in fewer situations, and I rarely need the power of thread::spawn. |
As a beginner to Rust, I spent a fair amount of time implementing a TCP daemon with multiple threads using standard threading (as covered in "The Rust Programming Language" book). But actually the workers need to live and die within the scope of a single If it doesn't end up in the standard library, I think it would be nice for the docs to at least mention that scoped threads are a thing. |
We're likely going to discuss this RFC in today's libs team meeting. To prepare I want to summarize the existing comments to give an idea of the issues and questions that have been raised so far.
|
FWIW, I've looked into that third bullet point of yours, @yaahc, and I believe I've been able to come up with a sound design that allows one to implicitly capture
At that point the outer closure (the one handling the whole scope) takes a
So |
We discussed this in today's @rust-lang/libs meeting. We have consensus that we want this in the standard library, for a variety of reasons, not least of which being able to write tests, examples, and similar. We talked about panic handling behavior, and came to the consensus that we don't want Since the original author of this RFC is no longer on GitHub, we'd appreciate it if someone would pick this up, update it based on the feedback in this comment, and re-submit it. |
I'm enthusiastic about the concept of scoped threads being in the standard library, but I want to add a concern: I'd like to make the same concepts work in async, or at least explore what that will require. It's been a busy couple of weeks, but that's been part of the work that I've been thinking might come out of the vision doc (and, if we do so, I was expecting to think also about the sync case, as I would like to always be moving in the direction of more analogous behavior). |
I would love to see the same concept work for async. Is there any particular design constraint that would necessitate blocking one on the other, though? If the function was to return some special error type that collected multiple panics, it might make sense to make sure the design allows us to reuse that type for both sync and async. However, in the absence of that, is there any part of the design that would be shared between the two? |
The lifetimes and most of the API would be similar, but there is a huge detail in That is, consider:
fn main ()
{
    let ref a = (|| 42)(); // avoid static promotion
    let async_main = async {
        let ref b = (|| 27)();
        let scope_fut = task::scope(move |scope| async move {
            scope.spawn(async move { println!("{}, {}", *a, *b); }); // potentially in another thread
            fut_that_yields_pending_at_least_once().await;
            /* inserted by `task::scope`'s "runtime":
            ….spawned_tasks.await; // */
        });
        pin_mut!(scope_fut);
        ::futures::future::poll_fn(|cx| {
            // poll it until reaching a point where `scope.spawn(…)` has been called,
            // so that the subtask gets enqueued onto a background thread
            match scope_fut.as_mut().poll(cx) { … }
            Poll::Ready(())
        }).await;
    };
    ::futures::executor::block_on(async_main);
}
Correct me if I am wrong, but I believe such code could suffer from a race condition, whereby the other thread may start spinning the So, I simply fail to see a foolproof On the other hand, we can imagine a
fn main ()
{
    let ref a = (|| 42)();
    some::executor::scoped_block_on(move |scope| async move {
        let ref b = (|| 27)();
        let handle = scope.spawn(async move {
            // println!("{}, {}", *a, *b); /* Error, cannot refer to `b` */
            println!("{}", *a);
        });
        /* stuff */
        // optional:
        handle.await;
    });
}
But this means that the implementation of such a function is now tied to an executor 😕 All in all, I find the |
I've filed a fresh PR for scoped threads here: #3151 |
Rendered
Add scoped threads to the standard library that allow one to spawn threads borrowing variables from the parent thread. Example:
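As a rough illustration of borrowing from the parent thread (not the RFC's own example), here is a minimal sketch. It assumes the `std::thread::scope` API as it was eventually stabilized (Rust 1.63), whose details differ from the API proposed in this RFC; `parallel_sum` is a hypothetical helper:

```rust
use std::thread;

/// Sums a slice by splitting it in two and summing each half on its own
/// scoped thread. Both threads only borrow the caller's data.
/// (Hypothetical helper, written against `std::thread::scope`.)
fn parallel_sum(values: &[u64]) -> u64 {
    let (left, right) = values.split_at(values.len() / 2);
    thread::scope(|s| {
        // `move` copies the slice references into the closures; no data is cloned.
        let a = s.spawn(move || left.iter().sum::<u64>());
        let b = s.spawn(move || right.iter().sum::<u64>());
        a.join().expect("left half panicked") + b.join().expect("right half panicked")
    })
}
```

No `Arc` or `'static` bound is needed here, because the scope guarantees both threads finish before `values` can go out of scope.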