Consistent API for embarrassingly parallel routines between levels of parallelism #17887
There is also a need for a higher-level API that works across both processes and threads. Just thinking out loud here:
User code will only ever use
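A hedged sketch of what such a higher-level entry point might look like (the name `parmap` and the `backend` keyword are purely illustrative, not anything proposed in this thread or present in Base):

```julia
using Distributed  # provides pmap; with no workers added it runs on the master

# Hypothetical sketch only: `parmap` and `backend` are made-up names.
# One user-facing function dispatches to threads or worker processes.
function parmap(f, xs; backend::Symbol = :threads)
    if backend === :threads
        out = Vector{Any}(undef, length(xs))
        Threads.@threads for i in 1:length(xs)
            out[i] = f(xs[i])          # shared-memory threads
        end
        return out
    elseif backend === :distributed
        return pmap(f, xs)             # worker processes
    else
        throw(ArgumentError("unknown backend: $backend"))
    end
end
```

For example, `parmap(x -> x^2, 1:4)` gives `[1, 4, 9, 16]` on either backend; the point is just that the choice of parallelism level becomes a parameter rather than a different API.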
That would be amazing: a simple abstraction over both threading and multiprocessing. Maybe this issue should be expanded to be about standardized tooling for embarrassingly parallel routines. For multiprocessing we have
For naming, I think that instead of
It would be an annoying deprecation, but I would propose this is a much better naming scheme:
But I lost an argument about calling distributed stuff "parallel" a long time ago, and now I'm not sure it would be worth going through the multi-version deprecation and renaming this would require.
Do I understand correctly that the distinction is between threads and processes rather than threads and nodes? If so, alongside 'thread', might some form of the word 'process' be more accurate than 'distribute', since processes are not necessarily distributed across nodes? Forgive my ignorance. Best!
Yes, it's more of a distinction between threads and processes. You can have multiple independent processes running on the same computer (or node), so it's not necessarily what is usually meant by "distributed" (although it can do distributed). But the word "process" wouldn't be smart if we want to extend the
In the far future (say, a year from now), threading will work out of the box and will be efficient. I assume people will then basically want to use threading all the time when they are using distributed computing, e.g. to handle latencies. Thus the case "distributed, but not threaded" doesn't seem terribly important -- it is important now, but probably won't be in the future. This would then lead to people using
I'd thus suggest to go for
Beyond bikeshedding (I personally like the suggestion above), things that are needed to make it easier:
The last point is needed because we probably want to support asymmetric clusters, at least in terms of the number of processors (if not in terms of speed). I know my usual cluster is 12-core + 12-core + 4-core.
I don't think that's true in all cases: sure, many user applications will just want stuff to be run in parallel using both multiple hosts and multiple threads. But more complex applications will sometimes need finer control over what is done via threads and what is done distributed. For example, data partitioning/placement may have to be taken into account. Or a complex algorithm may choose to distribute an outer loop (one that is not latency-sensitive) but run a latency-sensitive inner loop on threads, with several layers of code in between.
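The two-level pattern described above can be sketched roughly as follows (function names are made up for illustration): the outer loop is distributed across worker processes with `pmap`, while each process runs the inner loop on threads. With no workers added, everything simply runs on the master process.

```julia
using Distributed

# Define the inner kernel on all processes.
@everywhere function inner_kernel(chunk)
    out = similar(chunk, Float64)
    Threads.@threads for i in eachindex(chunk)
        out[i] = chunk[i]^2        # latency-sensitive work stays on threads
    end
    return sum(out)
end

# Outer loop: split the data and distribute the chunks across processes.
function outer_sum(xs, nchunks)
    chunks = [xs[i:nchunks:end] for i in 1:nchunks]   # strided split
    return sum(pmap(inner_kernel, chunks))
end
```

The point is that the two levels are chosen explicitly and independently, which is exactly the control a single fused "parallel" abstraction would hide.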
Now that Threads has matured a bit, has there been any more thought to supporting similar functionality as Distributed? For example, I (perhaps naively) am surprised not to see an equivalent Threads function for Distributed's

```julia
function tmap(f, xs::AbstractArray)
    g = Base.Generator(f, xs)
    et = Base.@default_eltype(g)
    a = Array{et}(undef, length(xs))
    Threads.@threads for i in 1:length(xs)
        a[i] = f(xs[i])
    end
    a
end
```
FYI, Transducers.jl supports "two-level" parallelism as of v0.4.11; i.e., each worker process uses multiple threads for executing
See also:
Seems like this issue is still relevant.
It seems like it would be natural for `@threads` loops to allow for a reduction parameter, matching what's done for `@parallel`. In fact, it seems natural enough that the documentation has to make a specific mention that there isn't one. I propose that it be pretty much the same as `@parallel`, except over threads.
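For illustration, a threaded reduction along the proposed lines might look like the sketch below (assuming nonempty input; `threaded_mapreduce` is a hypothetical helper, and an eventual `@threads` reduction parameter could of course look quite different):

```julia
# Sketch: each thread reduces its own chunk, then the per-chunk partial
# results are combined serially, mirroring what `@parallel (op) for`
# does across processes.
function threaded_mapreduce(f, op, xs)
    nt = Threads.nthreads()
    chunks = [xs[i:nt:end] for i in 1:nt]      # strided split; some may be empty
    partials = Any[nothing for _ in 1:nt]
    Threads.@threads for t in 1:nt
        if !isempty(chunks[t])
            partials[t] = mapreduce(f, op, chunks[t])
        end
    end
    return reduce(op, (p for p in partials if p !== nothing))
end
```

As with `@parallel`, this requires `op` to be associative, since the order in which chunks are combined is not the serial order.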