
Does rayon::spawn block until a thread is available? #522

Open
vitiral opened this issue Feb 3, 2018 · 12 comments

@vitiral

vitiral commented Feb 3, 2018

This is not currently documented. IMO it should block until a thread is available, to prevent firing off a million threads.

Thanks!

@cuviper
Member

cuviper commented Feb 5, 2018

Yes and no. The caller of rayon::spawn doesn't block -- it just queues the job and returns. The spawned job itself does conceptually block, but really this means it just sits in the queue until one of the threads grabs it. rayon::spawn never creates threads itself, only uses what's already in the pool.

Also, while I expect your "million threads" was hyperbole, if you really have a lot of things to spawn then structuring it as a join will probably perform better. Using parallel iterators is the easiest way to do that. If you still want it async, you can use a single spawn that then starts the iterator.

@vitiral
Author

vitiral commented Feb 5, 2018

Okay, so this is a really bad idea:

for n in my_large_iterator {
    rayon::spawn(move || { /* some work with n */ });
}

Instead you should do something like:

use std::thread::spawn;
use crossbeam_channel::bounded; // std's mpsc has no clonable bounded receiver

let (send_raw, recv_raw) = bounded(128);
let (send_processed, recv_processed) = bounded(128);

spawn(move || {
    for n in my_large_iterator {
        send_raw.send(n).unwrap();
    }
    // send_raw is dropped here, closing the raw channel
});

for _ in 0..num_cpus::get() {
    let recv_raw = recv_raw.clone();
    let send_processed = send_processed.clone();
    spawn(move || { /* recv data, process it and send it */ });
}

// drop our remaining handle so the channel closes once the workers finish
drop(send_processed);

let results: Vec<_> = recv_processed.iter().collect();

Note: ergo_sync exists to improve the ergonomics/syntax of doing this kind of thing.

I think there needs to be some discussion at the doc level of what rayon is good for and what it isn't good for. It seems to me (and correct me if I'm wrong) that pretty much its only purpose is to process data structures in parallel. This is a very worthwhile goal!

I'm doing a review of whether to include rayon automatically in the ergo ecosystem and would appreciate feedback (rust-crates/ergo_sync#6). In summary: it seems that rayon should not be used as a "standard thread pool" or when a message-passing scheme is called for.

@vitiral
Author

vitiral commented Feb 5, 2018

Just to be clear, I mistakenly thought that rayon could be a good "thread pool" because of spawn and join, and my initial implementation of ergo_sync suggested using it for that purpose. However, I found those constructs very unergonomic and riddled with gotchas, like the fact that threads can deadlock if they depend on each other.

@cuviper
Member

cuviper commented Feb 5, 2018

The ideal for rayon is to turn that my_large_iterator into a parallel iterator. This isn't universally possible (#46), but many kinds of iterators can be transformed from the start.

So if your "really bad idea" was actually something like this:

for n in my_vec.iter().filter(...).map(...) {
    rayon::spawn(move || { /* some work with n */ });
}

Then you would do better to write it more like:

my_vec.par_iter().filter(...).map(...)
    .for_each(|n| { /* some work with n */ });

And since your latter example shows collecting results, note that we do have collect() for parallel iterators too. We're trying to support equivalent functionality for as much of Iterator as possible, and even a lot of Itertools. The constraint is usually whether the base iterator is something we can parallelize.

However, I found those constructs very unergonomic and riddled with gotchas, like the fact that threads can deadlock if they depend on each other.

Rayon works best with tree-like dependencies, which is what join creates naturally. One running job splits into two dependencies, which can be executed on any thread. This way waiting threads can also steal from anywhere else without worrying about inverted dependencies.

While spawn is powerful, it can break down if the spawned job carries dependencies unknown to rayon. Actually, that's probably true of join as well -- basically anywhere you make rayon block on a non-rayon thing is asking for trouble. See #396 for an example with rayon-futures.

There's some work in #489 to use "fibers", which may alleviate some of this problem, at least where job dependencies could get inverted. We haven't decided how to proceed with that yet.

@vitiral
Author

vitiral commented Feb 5, 2018

Cool, thanks for the clarifications here and in the ergo_sync issue!

As a casual observer, I'm wondering what the difficulty is in implementing a naive auto-threadpool for generic iterators. It is kind of annoying that I have to create my own thread pool to do "real" work when I could be using rayon's.

The main use case for ergo_sync is when I:

  • spawn a bunch of std::thread::spawn threads to do IO bound work and send them over a channel
  • want to "process" the data from recv.iter() in parallel

There is currently no good way to do this with rayon as far as I can tell. It doesn't have to be "ridiculously fast"; I would just like to be able to leverage the fact that I already have a thread pool to use!

@vitiral
Author

vitiral commented Feb 5, 2018

Hmm... looking at it more maybe this is a solution:

(adapting the original example)

for _ in 0..rayon::current_num_threads() {
    let recv_raw = recv_raw.clone();
    let send_processed = send_processed.clone();
    rayon::spawn(move || { /* recv data, process it and send it */ });
}

There would be a couple of qualifications:

  • Don't have two "groups of work", i.e. don't use recv_processed in a rayon::spawn node. All data processing should happen within the spawned threads.
  • However, using rayon parallel iterators/join inside of the rayon::spawn threads is supported.

@nikomatsakis
Member

Okay, so this is a really bad idea:

It's not that simple. This will create tasks, but that is a pretty cheap price (it allocates a closure on the heap and pushes it onto a deque) and you have to pay it eventually anyway. That said, the loop with scope-spawn is probably not the best choice; everything else that @cuviper said applies (and, if you use a parallel iterator, you get the added benefit that we will try to lump multiple items together into one task).

@jonhoo

jonhoo commented Apr 3, 2018

I think this is somewhat related to what I'm trying to work around in #544. Specifically, I want to use rayon::ThreadPool as a more general-purpose thread pool for compute or I/O (i.e., one where there aren't really data dependencies). Just like @vitiral, I'd like to avoid implementing a thread pool + work stealing myself if I can avoid it, but it seems like there are some methods missing that would be necessary for that use case. I think they should be pretty straightforward to add, but that brings us back to whether this is a use case that rayon wants to support?

For example, methods like a) seeing how many outstanding jobs there are; b) blocking until a spawned job has been taken by a worker; and c) dropping the pool without finishing all spawned jobs would be necessary to deal with use cases like that. Maybe even a spawn_timeout, though that may prove trickier to add. Similarly, the ability to more easily set up thread-local state (touched upon in #493) and access that state from within jobs would be super useful for this kind of use case. Or the ability to run a particular closure before each job a worker executes (#544 (comment)).

EDIT: In fact, maybe it would be a good idea to spin off just the thread pool into its own crate, where such features could be added and maintained? And then rayon could use that thread pool internally.

@vitiral
Author

vitiral commented Apr 3, 2018

I too would love to be able to access the "rayon thread pool" as its own independent entity, with a more full-featured thread pool API (but more lightweight/transparent than rayon itself).

@nikomatsakis
Member

My plan for that use case was to support spawning futures, but we haven't made much progress on that. I am trying to put some time into rayon on a regular basis now, though, so I hope to make progress there. I have to sort of bring it all back into my brain to remember what my plan was tho =)

@jonhoo / @vitiral — if either of you are interested in helping, let me know

@NickHu

NickHu commented Oct 19, 2021

I believe that rayon::spawn can block when it calls increment_terminate_count on its registry (as this increments an atomic usize). This means that you can't use it on the main thread in a wasm-bindgen-rayon application, since in that context the main thread is never allowed to block.

@cuviper
Member

cuviper commented Oct 19, 2021

If you worry about atomics, then almost no cross-thread communication is possible. But that's not usually considered "blocking", even in rudimentary cases where they are implemented as compare-exchange loops.

5 participants