RFC: Scoped threads, take 2 #1084

aturon · 2015-04-23T00:12:48Z

This RFC proposes an alternative to the thread::scoped API that does not rely
on RAII, and is therefore memory safe even if destructors can be leaked.

This RFC was inspired by ideas from @nikomatsakis and @arielb1.

Rendered

reem · 2015-04-23T00:26:55Z

text/0000-scoped-take-2.md

+Note that this is just a sketch: the use of a linked list here, and `RefCell`
+internally, could both be switched for something more interesting.
+
+# Drawbacks


More than the ergonomics, the fundamental drawback of this approach is that it is an inversion of the normal RAII flow: functions acquiring some resource return a guard (representing the resource itself) which can be dropped to release the resource. If you acquire an RAII resource internally, you can provide the same API by forwarding the guard.

With APIs like the above, this is reversed - to use an RAII resource, you accept a guard as an argument rather than returning it, and you cannot acquire the resource internally and return it, you must have it passed to you.

This is not necessarily a death blow, just a significant drawback in terms of consistency. For instance, the original thread::scoped API meant that thread::spawn was really just a special case of scoped where 'a is 'static and the default behavior of the guard is different. This API is fundamentally different, and encourages totally different patterns.

bstrie · 2015-04-23T01:40:43Z

I think we should consider this even if Leak or something analogous were accepted. There's a discussion to be had here over whether this API should be different from thread::spawn, rather than automatically assuming that RAII guards are necessarily the best approach.

Diggsey · 2015-04-23T02:17:27Z

This pattern seems more general than just "thread::scoped": there's no reason why the "scope" functionality should be part of "thread". What if there was a general purpose "scope" method, and then a "thread::spawn_scoped" method which took the scope as the first parameter. That way you can reuse the functionality for other APIs/attach your own callbacks to run at the end of the scope.

Stebalien · 2015-04-23T02:36:03Z

text/0000-scoped-take-2.md

+
+# Unresolved questions
+
+Is there any reason that `Scope` should be `Send` or `Sync`?


Recursion without defining new scopes and unnecessarily tying up system resources by keeping old threads around.

I can't see any reason for Send (we never get an owned Scope so we can't possibly send it anywhere) but we definitely want Sync, so we can pass the Scope to a fork-join thread in case it needs to conditionally spawn more threads (or otherwise register "finalizers" that work on shared stack data).

I would add that the correct question should be "is there a reason not to make Scope Send and/or Sync." Rust should deliver maximum flexibility for concurrency, so long as it can do so safely.

@pythonesque Fair point, although if making it Send and/or Sync requires using more expensive operations (e.g. it's internally mutable through a &-reference so any mutations would need to be thread-safe), then it's a valid question to ask.

tomaka · 2015-04-23T04:55:35Z

The main drawback of this approach is that it is arguably less ergonomic (and less pretty) than the original thread::scoped API.

I don't like being stuck on an ugly API like this one forever just because versioon 1.0 didn't provide a way to have a good API.

Imagine in five years someone asking "why is thread::scoped so weird?" and having to answer that it's because of some language features were missing at that time.

sfackler · 2015-04-23T05:13:54Z

Imagine in five years someone asking "why is thread::scoped so weird?" and having to answer that it's because of some language features were missing at that time.

Languages evolve. There are enormous portions of the current standard library that would look different if they had been designed in the presence of future language features. That doesn't mean we're going to delete e.g. Vec until it's possible to define its iterator types as associated items or whatever.

We already have many "destined to be deprecated" APIs like thread::sleep_ms and Condvar::wait_timeout_ms that exist because the cost of not being able to sleep or wait with a timeout is way higher than the cost of spending 10 seconds to explain why both thread::sleep and thread::sleep_ms exist at some point in the future.

lilyball · 2015-04-23T05:47:39Z

A general 👍 to this. I hope someday we can recover the more "straightforward" RAII style API, but this provides a good solution in the short term.

terpstra · 2015-04-23T07:34:07Z

I really like how the same guard can be reused by any library which needs hard RAII requirements.

aidancully · 2015-04-23T07:52:00Z

I really like the design, just have two small comments:

I agree with @Diggsey and @terpstra that this is more general than threads, and think the basic mechanism (for enforcing that an RAII destructor must be invoked) belongs somewhere else, probably in libcore.
The spawn function then wouldn't be part of Scoped, and should be defined under std::thread.

nagisa · 2015-04-23T16:21:36Z

I prefer this proposal to other two. It might be because coming from Haskell, such a pattern is very common in API which manage resources with a limited lifetime. Haskell’s syntax makes that pattern very viable too, but I digress.

Several notes in response to both comments and RFC:

While the pattern is itself general, generalising Scope makes little sense, because each use of this pattern usually has wildly different requirements. Applications of this pattern where multiple destructors/guards must be kept track of is very rare, for example.

I also don’t think this is an ugly/unergonomic API. It is just a pattern which is uncommon in the Rust standard library (at the moment).

Also, I’d want that thread::Builder this still works for scoped thread creation. So far RFC only discusses top-level thread::scoped. For example, how would one create each scoped thread with a distinct name, but not bother setting the same custom stack size for each one?

// this would make some sense from API standpoint, not sure about implementation
thread::Builder::new().stack_size(1024).scoped(|s|{
    for i in 0..10 {
        s.name(format!("worker-{}", i)).spawn(||{ … });
    }
});

glaebhoerl · 2015-04-23T19:58:20Z

I think being able to use more-natural RAII patterns rather than with_resource-style HOFs (which invert control and result in various limitations, especially in a natively imperative language like Rust) for resource management is an advantage of Rust over Haskell. Anyone else remember how Rust used to use HOFs for ("internal") iteration, and later moved to the ("external") Iterator model? Similar reasoning applies. So while I don't think having to revert to the HOF solution for particular cases is horrible, I do think it's a bit unfortunate.

krdln · 2015-04-23T22:32:31Z

Additionally to feeling more natural and simpler (but that might be subjective), RAII style scoped threads have one another advantage – it's really straightforward to return values from from threads, check their exit status, etc. I guess it would be a lot more bloated with this approach. And I'm pretty sure we'll find some major limitations to the closure based approach as we found in internal iterators.

liigo · 2015-04-24T10:43:51Z

this is a much more complex, non-straightforward and uncommon api design, than the old thread::scoped. while Files still using RAII style to clean up its handle using destructors, which are not guaranteed to be run. i'm worried.

arielb1 · 2015-04-24T12:17:00Z

@nagisa

This pattern is actually more similar to Haskell's ST monad than to the with-pattern, despite the actual use not really needing HRTB.

arielb1 · 2015-04-24T12:19:11Z

@krdin

The scope guard is only needed here to ensure the lifetime bound - you create a single scope guard, and can use it multiple times.

nagisa · 2015-04-24T12:51:59Z

this is a much more complex, non-straightforward and uncommon api design, than the old thread::scoped

Alternative is having neither.

while Files still using RAII style to clean up its handle using destructors, which are not guaranteed to be run. i'm worried.

“Forgetting” a File is not a memory-safety hazard, though. You could always forget, too.

theemathas · 2015-04-25T02:24:31Z

What happens if I do this?

use std::thread::{self, JoinHandle};
let wrong_handle: JoinHandle<&i32> = thread::scope(|s| {
    let inner_handle: JoinHandle<&i32> = s.spawn(|| { 42 });
    inner_handle
    // the spawned thread should be joined here
};
// but I can still access the spawned thread here!

drop(wrong_handle);
// Oops. I detached a thread that is supposedly scoped.

// Or alternatively, this:
// wrong_handle.join().unwrap();
// Oops. I joined a thread twice.

My suggestion is to

Add a lifetime parameter to JoinHandle
Make thread::spawn() return JoinHandle<T, 'static>
Make thread::Scope::spawn have the following signature:

impl<'defer> Scope<'defer> {
    pub fn spawn<'body, F, T>(&self, f: F) -> thread::JoinHandle<T, 'body> where
        'defer: 'body,   // anything borrowed by the body must outlive the Scope
        F: FnOnce() -> T + Send + 'body,
        T: Send;
}

reem · 2015-04-25T02:59:40Z

@nagisa the alternative is not neither, and it's a false dilemma to phrase the problem that way. Leak solves all of the memory safety issues with the old API, and allows us to continue to write RAII guards which must have their destructors run in their lifetime for memory safety.

stevenblenkinsop · 2015-04-25T04:43:09Z

@theemathas The whole point of this proposal is that you only get a handle, not a guard, so what you do with it isn't important. The scope ensures that the thread has joined by the time it exits, with no involvement from the handle. The handle is just a way to access the return value of the thread if you want. Whether that happens before the scope exits or after can hardly make a difference. The thread could've exited already once you access it either way.

aturon · 2015-05-01T16:31:26Z

I agree with @kballard and others that having a distinct return type that does not use Result is the right way to go; when I wrote JoinHandle in the RFC I had neglected to think about this distinction.

FWIW, I am also leaning toward @nikomatsakis's suggestion (made elsewhere) that we should probably not stabilize any of this soon, in any case -- we should iterate on it in crates.io before bringing it into std. We may want to explicitly deprecate thread::scoped in favor of such an external crate (assuming that consensus is to go in this direction rather than one of the proposals that makes the existing scoped API valid).

stevenblenkinsop · 2015-05-01T18:21:42Z

@kballard Yes, that makes sense. I hadn't thought about it in terms of introducing panicking into the JoinHandle API. Having separate types did seem like the better option. I was trying to show that there might be a tradeoff involved, but that makes it more clear cut.

RalfJung · 2015-05-09T10:06:44Z

Would the following be allowed by the proposed API?

// using the proposed thread::scope API
fn new_trpl_example2() {
    thread::scope(|s| {
            s.spawn(|| {
                s.spawn(|| { /* some code */ });
                /* some more code */
            });
        }
    })
}

This is trying to use the "outer" scope within an "inner" thread. The result is that the innermost thread could potentially outlive the "middle" thread.

The reason I am asking is that if this is not allowed, then it seems like the scoped-spawn API restricts concurrency to "series-parallel" concurrency (which, I believe, the old scoped API did). That would have some very nice theoretical benefits and simply some analysis (e.g., https://cims.nyu.edu/~ejk/papers/crd.pdf).

theemathas · 2015-05-10T02:30:49Z

@RalfJung That code would be invalid if Scope is neither Sync nor Send.

RalfJung · 2015-05-10T09:20:32Z

Which is a question that the RFC leaves unanswered. So I guess we'll have to wait till that point settles.

lilyball · 2015-05-14T07:23:36Z

@RalfJung That's why I said earlier that it should be Sync.

gkoz · 2015-05-14T12:54:15Z

I might have an application of this API in a single threaded async case.

Imagine you're writing bindings to a foreign library (e.g. GTK) that supports callbacks like this:

extern fn connect(instance: *mut Object,
                  callback: extern fn(receiver: *mut Object, data: *mut ()),
                  data: *mut (),
                  destroy_data: extern fn(data: *mut())) -> usize;
extern fn disconnect(handle: usize);

You could support using a Fn() + 'a as a callback by giving the library a pointer to Box<Fn() + 'a> as data. A Scope would let you provide a guarantee that the closure doesn't outlive 'a.

To support Fn() + 'static it could also be useful to have a static STATIC_SCOPE: Scope<'static>, maybe even a STATIC_SCOPE_FORGET where defer would be a no-op. Or is this even the only way a static Scope could work because nothing runs outside of main?

theemathas · 2015-05-17T05:03:26Z

I would like to propose an amendment to this RFC in order to reduce the difference between thread::Scope::spawn() and thread::spawn().

In thread::Scope make Scope::spawn return a ScopedJoinHandle<'a, T>. New signature:

impl<'a> Scope<'a> {
    pub fn spawn<F, T>(&self, f: F) -> thread::ScopedJoinHandle<'a, T> where
        F: FnOnce() -> T + Send + 'a,
        T: Send + 'a;
}

Threads are joined when one of the following happens
- thread::scope() ends and ScopedJoinHandle::detach() is not called for that thread (panics propagated)
- ScopedJoinHandle::join() is called (panics propagated)
- JoinHandle::join() is called (panics not propagated)
JoinHandle::join() returns a Result, but ScopedJoinHandle::join() doesn't.
Add a method to ScopedJoinHandle:

impl<T: Send + 'static> ScopedJoinHandle<'static, T> {
    pub fn detach(self) -> JoinHandle<T> {...}
}

(optional) Add a method to Scope:

impl<'a> Scope<'a> {
    pub fn attach<T: Send + 'static>(&self, handle: JoinHandle<T>) -> ScopedJoinHandle<'static, T> {...}
}

(optional) In mod thread, add pub static STATIC_SCOPE: Scope<'static> = ... (requires that Scope is Sync)

gkoz · 2015-05-17T17:28:56Z

pub static STATIC_SCOPE: Scope<'static> = ... (requires that Scope is Sync)

Oh, I hadn't considered that limitation. Adding a pub fn static_scope() -> &Scope<'static> may not require Sync though.

llogiq · 2015-05-28T07:36:09Z

@theemathas what would attaching a JoinHandle to a scope mean? That it's automatically joined once the scope exits?

@aturon I'm quite happy with the API, apart from the part where we require allocation. Would it be possible to add an implementation of FixedScope that can only spawn a limited number of threads (possibly giving up with panic! when more than that number are spawned)? This could greatly benefit high performance applications.

Note that it may even be possible to create a scope! macro via a compiler plugin that would create a FixedScope if the contained code can be guaranteed to only spawn a fixed number of threads by analyzing it (a naive version would bail once a loop is detected or any method gets applied to any contained closure in which threads are spawned), otherwise it could create a normal Scope.

Kimundi · 2015-05-28T09:46:59Z

@llogiq: Alternatively, there could be a library that allows caching the threads and allocations, maybe thats good enough?

llogiq · 2015-05-28T09:57:43Z

@Kimundi: Probably yes. FixedScope would be incompatible to Scope anyway, so we're not losing any extensibility (that is, code using Scope would have to be changed to use FixedScope wherever needed).

I don't expect Scope to be used much in fn arguments, so type inference should mostly take care of this anyway.

bluss · 2015-05-28T11:39:42Z

@llogiq I don't immediately buy that the allocation is a major impact. Not compared to creating new OS threads. But it does bring to mind -- to what extent does this RFC support scoped thread pools, how can they be built upon this?

llogiq · 2015-05-28T12:10:07Z

@bluss: Good point (though it may somewhat depend on the OS). A ScopedThreadPool would limit the number of concurrently spawned threads by construction (and thus make the limitation of a theoretical FixedScope a moot point).

Kimundi · 2015-05-28T14:27:55Z

I actually have a prototype implementation of scoped with a thread pool: https://github.com/Kimundi/scoped_thread_pool_cache/blob/master/src/lib.rs.

Right now its memory unsafe though due to a bug mentioned further up in the comments that makes the scoped API as proposed here not work on todays Rust.

pythonesque · 2015-06-16T23:22:54Z

If you can't afford allocation, thread::scoped is already too slow because it uses Arc internally.

NawfelBgh · 2015-06-27T00:45:30Z

I don't understand why you build spawn on top of a general defer method which works with boxed closures and than complain that this new API requires allocations. Why don't you just put the (statically sized) join guards in the scope's linked list (or vector) directly?

struct DtorChain {
    jgs: Vec<JoinGuard>
}

llogiq · 2015-06-27T16:46:03Z

Because statically sizing that Vec may not be feasible in all Casey; you'd need to prove an upper bound at compile time.

For simple cases, it could be quite useful though, and there is a design space in conjunction with a thread pool that people currently explore out of tree if I'm correctly informed.

NawfelBgh · 2015-06-28T11:19:40Z

@llogiq, notice that i don't put OnceFn() -> () in the drop chain. I put the join guards directly (Vec<JoinGuard>). And join guards are statically sized.

Am i missing something obvious?

llogiq · 2015-06-28T21:50:11Z

Yes. The JoinGuards may be statically sized. But a Vec usually isn't. To avoid heap allocation, you'd need a statically sized Array instead.

NawfelBgh · 2015-06-29T11:42:28Z

True. I should have phrased it better. I was talking about the excess of scattered allocations caused by Box<FnOnce>.

aturon · 2015-07-01T16:18:47Z

I'm going to close this RFC for the time being. We need to solve the soundness issue pointed out by @kballard at the language level first, and I'd also like to prototype this in an external crate before landing in std.

kevinmehall · 2015-08-16T18:02:55Z

The borrowck bug making this unsound seems to have been fixed by rust-lang/rust#27641.

@kballard's prototype now fails to compile with the expected error:

% rustc -V
rustc 1.4.0-nightly (f05b22efb 2015-08-15)
% rustc safe-scoped.rs 
safe-scoped.rs:98:21: 101:14 error: `i` does not live long enough
safe-scoped.rs: 98             s.spawn(|| {
safe-scoped.rs: 99                 let mut data = data.lock().unwrap();
safe-scoped.rs:100                 data[i] += 1;
safe-scoped.rs:101             });
note: in expansion of closure expansion
safe-scoped.rs:98:21: 101:14 note: expansion site
note: in expansion of for loop expansion
safe-scoped.rs:97:9: 102:10 note: expansion site
note: in expansion of closure expansion
safe-scoped.rs:96:11: 103:6 note: expansion site
safe-scoped.rs:96:5: 103:7 note: reference must be valid for the call at 96:4...
safe-scoped.rs: 96     scope(|s| {
safe-scoped.rs: 97         for i in 0..2 {
safe-scoped.rs: 98             s.spawn(|| {
safe-scoped.rs: 99                 let mut data = data.lock().unwrap();
safe-scoped.rs:100                 data[i] += 1;
safe-scoped.rs:101             });
                   ...
safe-scoped.rs:97:13: 97:14 note: ...but borrowed value is only valid for the for at 97:12
safe-scoped.rs:97         for i in 0..2 {
                              ^
note: in expansion of for loop expansion
safe-scoped.rs:97:9: 102:10 note: expansion site
note: in expansion of closure expansion
safe-scoped.rs:96:11: 103:6 note: expansion site
error: aborting due to previous error

bluss · 2015-08-22T21:50:01Z

text/0000-scoped-take-2.md

+```rust
+pub struct Scope<'a> {
+    dtors: RefCell<Option<DtorChain<'a>>>
+}


Let's point out that Scope<'a> must be invariant in 'a (like it is in this sketch, since it is nested inside a Cell).

RFC: Scoped threads, take 2

49085d3

huonw mentioned this pull request Apr 23, 2015

RFC: Alter mem::forget to be safe #1066

Merged

reem reviewed Apr 23, 2015
View reviewed changes

Stebalien reviewed Apr 23, 2015
View reviewed changes

sfackler mentioned this pull request Apr 23, 2015

Leak and Destructor Guarantees #1085

Closed

nrc assigned aturon Apr 23, 2015

Fixes to lifetimes for spawn

75db90d

aturon force-pushed the scoped-take-2 branch from 26f281a to 75db90d Compare April 25, 2015 06:21

nikomatsakis mentioned this pull request May 1, 2015

Guaranteed non-static destructors #1094

Closed

alexcrichton added the T-libs-api Relevant to the library API team, which will review and decide on the RFC. label May 18, 2015

vinipsmaker mentioned this pull request Jun 19, 2015

Build failure with rust nightly maidsafe-archive/crust#149

Closed

aturon closed this Jul 1, 2015

nikomatsakis mentioned this pull request Aug 11, 2015

RFC: Stabilize catch_panic #1236

Merged

bluss reviewed Aug 22, 2015
View reviewed changes

bluss mentioned this pull request Oct 28, 2015

1.4 release notes rust-lang/rust#29416

Merged


		# Unresolved questions

		Is there any reason that `Scope` should be `Send` or `Sync`?

RFC: Scoped threads, take 2 #1084

RFC: Scoped threads, take 2 #1084

Conversation

aturon commented Apr 23, 2015

reem Apr 23, 2015

Choose a reason for hiding this comment

bstrie commented Apr 23, 2015

Diggsey commented Apr 23, 2015

Stebalien Apr 23, 2015

Choose a reason for hiding this comment

lilyball Apr 23, 2015

Choose a reason for hiding this comment

pythonesque Apr 23, 2015

Choose a reason for hiding this comment

lilyball Apr 25, 2015

Choose a reason for hiding this comment

tomaka commented Apr 23, 2015

sfackler commented Apr 23, 2015

lilyball commented Apr 23, 2015

terpstra commented Apr 23, 2015

aidancully commented Apr 23, 2015

nagisa commented Apr 23, 2015

glaebhoerl commented Apr 23, 2015

krdln commented Apr 23, 2015

liigo commented Apr 24, 2015

arielb1 commented Apr 24, 2015

arielb1 commented Apr 24, 2015

nagisa commented Apr 24, 2015

theemathas commented Apr 25, 2015

reem commented Apr 25, 2015

stevenblenkinsop commented Apr 25, 2015

aturon commented May 1, 2015

stevenblenkinsop commented May 1, 2015

RalfJung commented May 9, 2015

theemathas commented May 10, 2015

RalfJung commented May 10, 2015

lilyball commented May 14, 2015

gkoz commented May 14, 2015

theemathas commented May 17, 2015

gkoz commented May 17, 2015

llogiq commented May 28, 2015

Kimundi commented May 28, 2015

llogiq commented May 28, 2015

bluss commented May 28, 2015

llogiq commented May 28, 2015

Kimundi commented May 28, 2015

pythonesque commented Jun 16, 2015

NawfelBgh commented Jun 27, 2015

llogiq commented Jun 27, 2015

NawfelBgh commented Jun 28, 2015

llogiq commented Jun 28, 2015

NawfelBgh commented Jun 29, 2015

aturon commented Jul 1, 2015

kevinmehall commented Aug 16, 2015

bluss Aug 22, 2015

Choose a reason for hiding this comment