distributed and/or asynchronous algorithms for numeric methods #68
Most of what you say seems to make sense to me. I'll echo some things that I'm hearing back to you to make sure that I'm hearing them correctly.
If this is what you're saying, then my only concern is that we're assuming that the function evaluation is also dask-y, which is only sometimes the case (although maybe this is the case that you're trying to motivate).

I think it could be fun and informative to see how well approaches like this scale. It's a good problem if your goal is to learn how to use Dask well. It stresses both collections like dask array and possibly also some of the async interface. It'll force you to care about performance, which will get you into diagnostics and maybe thinking about how the scheduler works. However, I'm not personally qualified to say whether this is useful or not; I don't think I have enough experience here. I would recommend relying on some of the senior folks around you to determine the value of faster algorithms in this space.

Chunked linear algebra methods are in a similar situation. I think that this would be a super-fun project to work on. It would teach blocked algorithms and distributed performance. It's an open question how well a dynamic system like Dask can compete with more static/hardened systems based on MPI, predefined communication patterns, etc. If the answer is "almost as well", then this could grow into an interesting topic. You would want to check in with numerical linear algebra folks within a month or two, though, to ensure that you're testing the things that they would respect. I'll cc @poulson here to see if he'll bite.

In regards to repositories, this one is somewhat quiet. Defaulting to dask/dask when unsure is probably wise; I think you'll get more people than just me commenting there.
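For context, a minimal sketch of the chunked linear algebra case mentioned above, using dask array's built-in tall-and-skinny QR. The array size and chunking are made up for illustration; only the first dimension is chunked, which is what the blocked QR requires:

```python
import dask.array as da

# Hypothetical sizes/chunking, chosen so columns fit in one chunk.
A = da.random.random((100_000, 100), chunks=(10_000, 100))

# da.linalg.qr runs a blocked (TSQR-style) factorization:
# per-chunk QR factorizations combined in a reduction tree.
Q, R = da.linalg.qr(A)

# The residual check is itself a lazy blocked computation.
err = da.linalg.norm(A - Q @ R)
print(err.compute())
```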
Thanks for the response. I'll close this issue, re-post this in the dask/dask repository, and quote your post with my response.
Hi all,
I am a second-year PhD student in mechanical engineering at UC Davis studying numeric methods for optimal control. I just learned about dask and I'm excited about using it for the types of problems I work on in my research. I think it might already be, or be very close to, a really great way to increase performance for common numeric methods in scientific computing.
In general, it seems that an implementation that maximizes the number of "sequential" computations between control flow structures, and uses concurrent futures for control flow (like error checks), would get significant speed boosts, since the dask scheduler optimizes and distributes the computational graph. For example, if one were to daskify an algorithm like a Runge-Kutta ODE solver and all of its derivative callbacks, the scheduler should automatically perform the equivalent of common sub-expression collection and parallelize inter-step computations. Then dask-based higher-level logic (such as comparing the results of multiple simulations) or algorithms (like shooting methods for solving boundary value problems) could add the lower-level computation tasks to the scheduler, so computation could be processed in parallel without the user having to choose one level to parallelize at.
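A minimal sketch of this idea using dask.delayed. The derivative callback `f` and the scalar test problem are hypothetical stand-ins; the point is that chaining delayed RK4 steps builds a task graph in which independent trajectories become independent branches the scheduler can run in parallel:

```python
import dask

def f(t, y):
    # Hypothetical derivative callback: linear dynamics dy/dt = -y.
    return -y

@dask.delayed
def rk4_step(t, y, h):
    # One fixed-size RK4 step as a single task; at execution time the
    # arguments are concrete values, so this is plain arithmetic.
    k1 = f(t, y)
    k2 = f(t + h / 2, y + (h / 2) * k1)
    k3 = f(t + h / 2, y + (h / 2) * k2)
    k4 = f(t + h, y + h * k3)
    return y + (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)

# Higher-level logic: compare several simulations. Each trajectory is a
# chain of delayed tasks; the trajectories parallelize across workers
# without choosing a single level to parallelize at.
finals = []
for y0 in [1.0, 2.0, 5.0]:
    t, y = 0.0, y0
    for _ in range(100):
        y = rk4_step(t, y, 0.01)
        t += 0.01
    finals.append(y)

print(dask.compute(*finals))
```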
This leads to some questions:
Thanks,
Ben
Related issues:
dask/dask#2225
#3
#5
#57
P.S. I'm posting in this repo because most of the related issues I could find are here; I'd be happy to close this and re-post in another repository if that would be more appropriate. Just let me know.