-
Notifications
You must be signed in to change notification settings - Fork 172
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Lwt_domain: an interface to multicore parallelism #860
Conversation
Hello, Thanks a lot for this PR! I'm looking forward to multicore-lwt! I've just had a quick look. (I'll do a longer review later.) I have a few questions/comments:
|
Lwt_offload should be in core/ indeed -- it's not part of the Unix-specific part of multicore OCaml. @Sudha247 it would be good to get more of your benchmarks into the test suite here. For example the fib run would be ideal to check that you get the same results and also provide an example. I'm just discussing with caml-devel the right way to be able to release this with Lwt_offload as an optional module -- I'll update when I find consensus. |
Thanks for the comments @raphael-proust and @avsm! Agree about moving this module to core. The notification call is just for convenience, I'll work on removing this part after which there won't be any dependency on lwt_unix, so then the module will be ready to be moved to core. Having more tests for the module would certainly be useful, I'll add the fib example to the testsuite. |
Some updates on the latest commit are:
With the new update, there's a slight digression from the BenchmarksSome benchmark results with the updates are below: TimeSequential version: 205.397s The results were obtained on an Intel Xeon Gold 5120 processor with 24 isolated cores.
Note: There's still a dependency on |
This module is very promising (pun intended). I'll work on packaging: how to ship this module without introducing a hard dependence on OCaml.5.0. I think it'll just need to be a separate opam package |
Thanks a lot for the comments @raphael-proust! I'm doing another pass to update documentation and remove obsolete parts in the PR. I'll think through some of these open-ended questions and respond to them. |
It looks like there is a real error here in the CI:
Is this expected @Sudha247? |
Needed to bump domainslib to the latest version and change the argument to |
We need to define whether the Lwt.async exception hook is supported or not when on a different domain. |
I toyed around with replacing the Lwt_unix.condition signaling mechanism with a lower-level, no-Unix version, but I wasn't satisfied with the result. I think that we can move the Lwt_domain module outside of unix/ and into core/ but we might need
In my opinion, this can wait: we should focus on getting something that works and we can later get it to work better. |
The documentation about the exception is explicit. But to make sure the situation re: the async exception hook is clear, we could add a paragraph that explains what happens if we do We can add this later because it can be inferred from the current documentation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this doesn't depend on any lwt internals and could be just a useful separate small library until ocaml 5 is released?
After merging #894, you can rebase this PR to the latest master to actually test the multicore one with CI. |
That's great! I've reabsed the PR, it now includes CI check for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few suggestions.
Thinking about this design a bit more; I think we also need a convenience function: val Lwt.domain : (* or, detach/blocking/… *)
?pool:Lwt_domain.pool ->
('a -> 'b) ->
'a ->
'b Lwt.t In a typical application, we might want to run a blocking computation at various points in the code. It would become a hassle to set up a domain pool at the application’s top layer and thread that through to all lower layers which needed it. Lwt should set up a domain pool by default, and allow users to override it if necessary. We can add some docs to the effect that ‘promises are for non-blocking I/O but we have a convenient way to represent blocking computations as promises’. |
It sure is a hassle, especially while modifying existing modules. This is why named pools were introduced in domainslib.0.3.1. Pools can be obtained at a later point with the help of an identifier string it is associated to, thereby eliminating the need to thread through the pool across layers.
We were doing this earlier. But it could turn out problematic inter-operating with multiple Multicore enabled libraries, in the sense that it gets harder tracking down all the live domain pools. Ideally in Multicore OCaml (and OCaml 5.0) programs; number of domains = number of cores, exceeding that will introduce overheads and could slow down the program. If at all we do end up with a default domain pool, I think the right place for that to reside would be at domainslib.
This sounds like a good idea. I think I'm not clear about how this convenience function would be useful, it looks like it's exposing |
Thanks to @raphael-proust |
Thanks for the explanation. I am now understanding a couple of new things. Btw let me preface this comment by saying that, if this
OK, that is convenient, and then my suggestion becomes: val Lwt.domain : (* or, detach/blocking/… *)
string ->
('a -> 'b) ->
'a ->
'b Lwt.t
That's a bummer.
Fair enough.
I am suggesting that in most cases users should not have to think about (* main.ml *)
let main () =
let _ = Lwt_domain.setup_pool ~name:"main" 3 in
...
let () = main ()
(* person.ml *)
type t = {
id : int;
name : string;
} [@@deriving of_yojson]
let of_yojson = Lwt.domain "main" of_yojson
(* val of_yojson : Yojson.Safe.t -> (t, string) Lwt_result.t *) Now just by wrapping an existing CPU-bound function with |
- run: opam install lwt_luv lwt_react lwt --deps-only --with-test | ||
if: ${{ matrix.ppx == false }} | ||
|
||
- run: opam exec -- make build | ||
- run: opam install lwt_ppx --deps-only --with-test | ||
if: ${{ matrix.ppx == true }} | ||
|
||
- run: opam install lwt_domain --deps-only --with-test | ||
if: ${{ matrix.domain == true }} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This pattern (default action for all workflows, ppx action if ppx=true, domain action if domain=true) is repeated below for building and then for testing. It is somewhat verbose and not very scalable (at some point we might want to add a unix flag).
It might be better to use the opam-local-packages
variable. Somewhat annoyingly, opam
and dune
expect the package list in different formats so it's not completely trivial. Still we could possibly simplify this file.
Opinions? Ideas?
ping @smorimoto who wrote this file.
…for Invalid_argument & rejection when detach raises an exception
Co-authored-by: ygrek <ygrek@autistici.org>
Co-authored-by: yawaramin <yawar.amin@gmail.com>
0b4af09
to
c7bc30a
Compare
rebased on master to resolve merge conflict |
Synopsis: The PR aims to add a module
Lwt_domain
that enables offloading computations to many CPU cores via Multicore's Domains.Lwt_domain
interfaceThis patch aims to add a new module
Lwt_domain
. The API is similar toLwt_preemptive
, the major difference beingLwt_preemptive
runs compuatations in a new systhread andLwt_domain
computations in a new domain.In effect, the
Lwt_domain
runs OCaml computations in parallel on multiple CPU cores, which is not the case withLwt_preemptive
. This is because system threads are run on a single CPU core despite being run concurrently. On the other hand, multicore domains run on the available cores.Implementation
A pool of domains is created by the
simple_init
function, with the default number of domains being4
. The created domains will be alive and waiting for tasks till the time main loop is alive. The number of live domains can be adjusted via theset_bounds
function.Unlike
Lwt_preemptive
which creates new systhreads frequently, all live domains are reused here because repeatedly creating and shutting down new domains is an expensive process, which could be avoided.When
detach
function is called it obtains a new worker from the pool of domains and the function is executed in that worker domain. After the execution is done, the worker domain will again be ready to take tasks.A separate queue of jobs is maintained that are supposed to be executed on the main domain, being called from
run_in_main
.Benchmark
A simple server that returns Fibonacci number of a given input was ported to run in parallel. The server and client code can be found here. The benchmark was run with 10 clients and 10 requests per client, it's time is shown below. Individual timings can be found in the blog post mentioned below.
Time graph of parallel Fibonacci server
Side note
Unix.fork
is not compatible with multiple domains. Hence, the test-suite is modified to run sequentially and all the tests that runUnix.fork
are run before theLwt_domain
test.This module is intended to work with OCaml 5.0 with parallelism support. It can be used now with Multicore OCaml obtained from multicore-opam. Any suggestions/feedback welcome, benchmarks from potential applications of this module will be appreciated too!
Further reading