distributed and/or asynchronous algorithms for numeric methods #68
Most of what you say seems to make sense to me. I'll echo some things that I'm hearing back to you to make sure that I'm hearing them correctly.
If this is what you're saying, then my only concern is that we're assuming that the function evaluation is also dask-y, which is only sometimes the case (although maybe this is the case that you're trying to motivate).

I think it could be fun and informative to see how well approaches like this scale. It's a good problem if your goal is to learn how to use Dask well. It stresses both collections like dask array and possibly also some of the async interface. It'll force you to care about performance, which will get you into diagnostics and maybe thinking about how the scheduler works. However, I'm not personally qualified to say whether this is useful or not; I don't think I have enough experience here. I would recommend relying on some of the senior folks around you to determine the value of faster algorithms in this space.

Chunked linear algebra methods are in a similar situation. I think that this would be a super-fun project to work on. It would teach blocked algorithms and distributed performance. It's an open question how well a dynamic system like Dask can compete with more static/hardened systems based on MPI, predefined communication patterns, etc. If the answer is "almost as well", then this could grow into an interesting topic. You would want to check in with numerical linear algebra folks within a month or two, though, to ensure that you're testing the things that they would respect. I'll cc @poulson here to see if he'll bite.

In regards to repositories, this one is somewhat quiet. Defaulting to dask/dask when unsure is probably wise; I think you'll get more people than just me commenting there.
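For context, a minimal sketch of the chunked linear algebra case mentioned above, using dask array's built-in tall-and-skinny QR. The array size and chunking are made up for illustration; only the first dimension is chunked, which is what the blocked QR requires:

```python
import dask.array as da

# Hypothetical sizes/chunking, chosen so columns fit in one chunk.
A = da.random.random((100_000, 100), chunks=(10_000, 100))

# da.linalg.qr runs a blocked (TSQR-style) factorization:
# per-chunk QR factorizations combined in a reduction tree.
Q, R = da.linalg.qr(A)

# The residual check is itself a lazy blocked computation.
err = da.linalg.norm(A - Q @ R)
print(err.compute())
```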
Thanks for the response. I'll close this issue, re-post this in the dask/dask repository, and quote your post with my response.
Hi all,
I am a second-year PhD student in mechanical engineering at UC Davis studying numeric methods for optimal control. I just learned about dask and I'm excited about using it for the types of problems I work on in my research. I think it might already be, or be very close to, a really great way to increase performance for common numeric methods in scientific computing.
In general, it seems that an implementation that maximizes the number of "sequential" computations between control flow structures, and uses concurrent futures for control flow (like error checks), would get significant speed boosts, since the dask scheduler optimizes and distributes the computational graph. For example, if one were to daskify an algorithm like a Runge-Kutta ODE solver and all of its derivative callbacks, the scheduler should automatically perform the equivalent of common sub-expression collection and parallelize inter-step computations. Then dask-based higher-level logic (such as comparing the results of multiple simulations) or algorithms (like shooting methods for solving boundary value problems) could add the lower-level computation tasks to the scheduler, so computation could be processed in parallel without the user having to choose one level to parallelize at.
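A minimal sketch of this idea using dask.delayed. The derivative callback `f` and the scalar test problem are hypothetical stand-ins; the point is that chaining delayed RK4 steps builds a task graph in which independent trajectories become independent branches the scheduler can run in parallel:

```python
import dask

def f(t, y):
    # Hypothetical derivative callback: linear dynamics dy/dt = -y.
    return -y

@dask.delayed
def rk4_step(t, y, h):
    # One fixed-size RK4 step as a single task; at execution time the
    # arguments are concrete values, so this is plain arithmetic.
    k1 = f(t, y)
    k2 = f(t + h / 2, y + (h / 2) * k1)
    k3 = f(t + h / 2, y + (h / 2) * k2)
    k4 = f(t + h, y + h * k3)
    return y + (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)

# Higher-level logic: compare several simulations. Each trajectory is a
# chain of delayed tasks; the trajectories parallelize across workers
# without choosing a single level to parallelize at.
finals = []
for y0 in [1.0, 2.0, 5.0]:
    t, y = 0.0, y0
    for _ in range(100):
        y = rk4_step(t, y, 0.01)
        t += 0.01
    finals.append(y)

print(dask.compute(*finals))
```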
This leads to some questions:
Thanks,
Ben
Related issues:
dask/dask#2225
#3
#5
#57
P.S. I'm posting in this repo because most of the related issues I could find are here; I'd be happy to close this and re-post in another repository if that would be more appropriate. Just let me know.