
Workers on same host: include code from filesystem, not node1 #6855

Closed
wants to merge 1 commit

Conversation

@timholy (Member) commented May 15, 2014

I've found that when using multiple processes, loading a large pile of code is particularly slow. This patch narrows the gap somewhat, for the case when all processes are on the same host. It works by avoiding interprocess communication and fetching the source files directly from the filesystem, presumably reducing the amount of time workers spend waiting for node 1 to get around to serving their requests.
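For illustration only, here is a minimal sketch of the idea (not the actual patch), written against the current Distributed API; `fetch_source` is a hypothetical helper:

```julia
using Distributed

# Hypothetical helper, for illustration only: if this worker shares a
# filesystem with node 1, read the source file directly; otherwise fall
# back to asking node 1 to serve it over the wire.
function fetch_source(path::AbstractString)
    if isfile(path)                        # same host / shared filesystem
        return read(path, String)
    else
        return remotecall_fetch(p -> read(p, String), 1, path)
    end
end
```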

Timing results:
For `using Distributions`:
Single-threaded (`julia`): ~15s
Multi-process, master (`julia -p 1`): ~24s
Multi-process, this patch (`julia -p 1`): ~20s

For `using Optim` (with its internal `using Distributions` commented out):
Single-threaded (`julia`): ~4.2s
Multi-process, master (`julia -p 1`): ~7.3s
Multi-process, this patch (`julia -p 1`): ~5.7s

If anyone has suggestions to reduce the overhead even further, I'm all ears.

@timholy (Member, Author) commented May 15, 2014

I noticed one place where this isn't "free": `push!(LOAD_PATH, newpath)` needs to become `@everywhere push!(LOAD_PATH, newpath)`.
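Concretely (a small sketch; the path is made up), the difference is roughly:

```julia
# Only node 1 sees the new entry:
push!(LOAD_PATH, "/path/to/mycode")

# With workers loading source files themselves, every process needs the
# entry on its own LOAD_PATH:
@everywhere push!(LOAD_PATH, "/path/to/mycode")
```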

@eschnett (Contributor) commented:

One approach would be to introduce a tree structure for these requests. That is, the nodes are grouped hierarchically, and instead of requesting things from node 1, they request them from the next layer up in the hierarchy. That layer may need to forward the request further up. This increases latency a bit, since requests need to be propagated, but with a bit of caching node 1 has much less work to do. My estimate is that somewhere between 10 and 100 nodes, such an approach becomes significantly faster.
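For concreteness, one way that idea might look (a rough sketch, not code from this PR; the binary-tree grouping and the `parent_of` helper are assumptions, and the calls follow the current Distributed API):

```julia
using Distributed   # remotecall_fetch, myid

# Per-node cache of files already fetched.
const file_cache = Dict{String,Vector{UInt8}}()

# Assumed grouping: processes form a binary tree rooted at node 1.
parent_of(id::Integer) = id <= 1 ? 1 : id ÷ 2

# Ask the parent node for a file instead of node 1 directly; intermediate
# nodes cache the answer, so node 1 serves each file only to its own
# children rather than to every worker.
# (In practice this would be defined on all processes with @everywhere.)
function request_file(path::AbstractString)
    get!(file_cache, path) do
        myid() == 1 ? read(path) :
            remotecall_fetch(request_file, parent_of(myid()), path)
    end
end
```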

@timholy (Member, Author) commented May 16, 2015

🎂 Happy 1 year old, PR! My, how you've grown!

I haven't been developing multiprocess stuff much recently, so comments from people who do are still desired. Is this still relevant? Any thoughts? I agree that @eschnett's idea might be better for a large cluster, but without such a cluster to test on I'd hesitate to develop a more complex approach. So if this seems like a step in the right direction, this is about as far as I want to take this now.

@tkelman added the "parallelism" (Parallel or distributed computation) label on May 16, 2015
@tkelman (Contributor) commented May 16, 2015

related, I think? #11093

@timholy (Member, Author) commented Apr 20, 2016

I suspect this is irrelevant now.

@timholy closed this on Apr 20, 2016
@timholy deleted the teh/multiprocloading branch on Apr 20, 2016