Commit

Merge pull request ocaml-multicore#707 from talex5/executor-pool-docs
Executor pool docs
talex5 authored Mar 10, 2024
2 parents ed9c4a5 + feb8d11 commit ce30c9a
Showing 5 changed files with 1,610 additions and 13 deletions.
90 changes: 87 additions & 3 deletions README.md
@@ -35,6 +35,8 @@ Eio replaces existing concurrency libraries such as Lwt
* [Running processes](#running-processes)
* [Time](#time)
* [Multicore Support](#multicore-support)
* [Domain Manager](#domain-manager)
* [Executor Pool](#executor-pool)
* [Synchronisation Tools](#synchronisation-tools)
* [Promises](#promises)
* [Example: Concurrent Cache](#example-concurrent-cache)
@@ -936,7 +938,12 @@ The mock backend provides a mock clock that advances automatically where there i

OCaml allows a program to create multiple *domains* in which to run code, allowing multiple CPUs to be used at once.
Fibers are scheduled cooperatively within a single domain, but fibers in different domains run in parallel.
This is useful to perform CPU-intensive operations quickly.
This is useful to perform CPU-intensive operations quickly
(though extra care needs to be taken when using multiple cores; see the [Multicore Guide](./doc/multicore.md) for details).

### Domain Manager

[Eio.Domain_manager][] provides a basic API for spawning domains.
For example, let's say we have a CPU-intensive task:

```ocaml
@@ -950,7 +957,7 @@ let sum_to n =
!total
```

We can use [Eio.Domain_manager][] to run this in a separate domain:
We can use the domain manager to run this in a separate domain:

```ocaml
let main ~domain_mgr =
@@ -977,6 +984,10 @@ let main ~domain_mgr =
- : unit = ()
```

<p align='center'>
<img src="./doc/traces/multicore-posix.svg"/>
</p>

Notes:

- `traceln` can be used safely from multiple domains.
@@ -988,8 +999,78 @@ Notes:
- `Domain_manager.run` waits for the domain to finish, but it allows other fibers to run while waiting.
This is why we use `Fiber.both` to create multiple fibers.

For more information, see the [Multicore Guide](./doc/multicore.md).
### Executor Pool

An [Eio.Executor_pool][] distributes jobs among a pool of domain workers.
Domains are reused and can execute multiple jobs concurrently.

Each domain worker starts new jobs until the total `~weight` of its running jobs reaches `1.0`.
The `~weight` represents the expected proportion of a CPU core that the job will take up.
Jobs are queued if they cannot start immediately because every domain worker is busy (its total weight is `>= 1.0`).

This is the recommended way of leveraging OCaml 5's multicore capabilities.
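
The weight rule described above can be illustrated with a toy model. This is only a sketch based on the sentence "starts new jobs until the total `~weight` of its running jobs reaches `1.0`"; `admitted` is a hypothetical helper, not part of Eio, and the real scheduler may differ in detail.

```ocaml
(* Toy model, not Eio's implementation: given the weights of queued jobs in
   arrival order, count how many a single idle worker would start at once.
   A job is started only while the running total is still below 1.0. *)
let admitted weights =
  let rec go total started = function
    | w :: rest when total < 1.0 -> go (total +. w) (started + 1) rest
    | _ -> started
  in
  go 0.0 0 weights

let () =
  (* Two half-weight jobs fill the worker; the third has to wait. *)
  assert (admitted [0.5; 0.5; 0.5] = 2);
  (* A fully CPU-bound job occupies the worker on its own. *)
  assert (admitted [1.0; 0.02] = 1)
```

In this model a worker is either below `1.0` (and accepts one more job) or at or above it (and accepts none), which matches the `>= 1.0` notion of "busy" used above.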

Usually you will only want one pool for an entire application, so the pool is typically created when the application starts:

<!-- $MDX skip -->
```ocaml
let () =
  Eio_main.run @@ fun env ->
  Switch.run @@ fun sw ->
  let pool =
    Eio.Executor_pool.create
      ~sw (Eio.Stdenv.domain_mgr env)
      ~domain_count:4
  in
  main ~pool
```

The pool starts its domain workers immediately upon creation.

The pool will not block our switch `sw` from completing;
when the switch finishes, all domain workers and running jobs are cancelled.

`~domain_count` is the number of domain workers to create.
The total number of domains should not exceed `Domain.recommended_domain_count ()` or the number of cores on your system.
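
One way to pick a value for `~domain_count` is to derive it from the standard library's `Domain.recommended_domain_count`. The sketch below assumes you want to leave one domain for the main Eio loop; `worker_count` and its `reserved` parameter are hypothetical names, not part of Eio.

```ocaml
(* Hypothetical helper: leave [reserved] domains (by default one, for the
   domain that runs the main Eio loop) out of the worker count, and never
   return fewer than one worker. Requires OCaml 5. *)
let worker_count ?(reserved = 1) () =
  max 1 (Domain.recommended_domain_count () - reserved)

let () = Printf.printf "domain workers: %d\n" (worker_count ())
```

The result could then be passed as `~domain_count:(worker_count ())` when creating the pool.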

We can run the previous example using an Executor Pool like this:

```ocaml
let main ~domain_mgr =
  Switch.run @@ fun sw ->
  let pool =
    Eio.Executor_pool.create ~sw domain_mgr ~domain_count:4
  in
  let test n =
    traceln "sum 1..%d = %d" n
      (Eio.Executor_pool.submit_exn pool ~weight:1.0
        (fun () -> sum_to n))
  in
  Fiber.both
    (fun () -> test 100000)
    (fun () -> test 50000)
```

<!-- $MDX non-deterministic=output -->
```ocaml
# Eio_main.run @@ fun env ->
  main ~domain_mgr:(Eio.Stdenv.domain_mgr env);;
+Starting CPU-intensive task...
+Starting CPU-intensive task...
+Finished
+sum 1..50000 = 1250025000
+Finished
+sum 1..100000 = 5000050000
- : unit = ()
```
`~weight` is the anticipated proportion of a CPU core used by the job:
the fraction of its running time spent actively executing OCaml code, as opposed to waiting for I/O or system calls.
In the snippet above we use `~weight:1.0` because the job is entirely CPU-bound: it never waits for I/O or other system calls.
`~weight` must be between `0.0` and `1.0` inclusive.
For example, for an I/O-bound job that averages 2% of one CPU core, pass `~weight:0.02`.

Each domain worker starts new jobs until the total `~weight` of its running jobs reaches `1.0`.
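
If you have measured a job, one rough way to turn the measurement into a `~weight` is to divide CPU time by wall-clock time. The helper below is a sketch under that assumption; `estimate_weight` is a hypothetical name, not an Eio function, and how you obtain the two timings is left open.

```ocaml
(* Hypothetical helper: estimate a job's weight as CPU time divided by
   wall-clock time, clamped to the [0.0, 1.0] range that ~weight accepts. *)
let estimate_weight ~cpu_seconds ~wall_seconds =
  if wall_seconds <= 0.0 then 1.0 (* no usable measurement: assume CPU-bound *)
  else Float.min 1.0 (Float.max 0.0 (cpu_seconds /. wall_seconds))

let () =
  (* A job that used 20 ms of CPU during 1 s of wall-clock time. *)
  Printf.printf "%.2f\n" (estimate_weight ~cpu_seconds:0.02 ~wall_seconds:1.0)
```

Clamping keeps the estimate valid even when the measurement is noisy, for example when CPU time slightly exceeds wall-clock time.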

## Synchronisation Tools

Eio provides several sub-modules for communicating between fibers,
@@ -1241,6 +1322,8 @@ The `Fiber.check ()` checks whether the worker itself has been cancelled, and ex
It's not actually necessary in this case,
because if we continue instead then the following `Stream.take` will perform the check anyway.

Note: in a real system, you would probably use [Eio.Executor_pool][] for this rather than making your own pool.

### Mutexes and Semaphores

Eio also provides `Mutex` and `Semaphore` sub-modules.
@@ -1805,6 +1888,7 @@ Some background about the effects system can be found in:
[Eio.Path]: https://ocaml-multicore.github.io/eio/eio/Eio/Path/index.html
[Eio.Time]: https://ocaml-multicore.github.io/eio/eio/Eio/Time/index.html
[Eio.Domain_manager]: https://ocaml-multicore.github.io/eio/eio/Eio/Domain_manager/index.html
[Eio.Executor_pool]: https://ocaml-multicore.github.io/eio/eio/Eio/Executor_pool/index.html
[Eio.Promise]: https://ocaml-multicore.github.io/eio/eio/Eio/Promise/index.html
[Eio.Stream]: https://ocaml-multicore.github.io/eio/eio/Eio/Stream/index.html
[Eio_posix]: https://ocaml-multicore.github.io/eio/eio_posix/Eio_posix/index.html
41 changes: 32 additions & 9 deletions doc/multicore.md
@@ -4,21 +4,22 @@

* [Introduction](#introduction)
* [Problems with Multicore Programming](#problems-with-multicore-programming)
* [Optimisation 1: Caching](#optimisation-1-caching)
* [Optimisation 2: Out-of-Order Execution](#optimisation-2-out-of-order-execution)
* [Optimisation 3: Compiler Optimisations](#optimisation-3-compiler-optimisations)
* [Optimisation 4: Multiple Cores](#optimisation-4-multiple-cores)
* [Optimisation 1: Caching](#optimisation-1-caching)
* [Optimisation 2: Out-of-Order Execution](#optimisation-2-out-of-order-execution)
* [Optimisation 3: Compiler Optimisations](#optimisation-3-compiler-optimisations)
* [Optimisation 4: Multiple Cores](#optimisation-4-multiple-cores)
* [The OCaml Memory Model](#the-ocaml-memory-model)
* [Atomic Locations](#atomic-locations)
* [Initialisation](#initialisation)
* [Guidelines](#guidelines)
* [Atomic Locations](#atomic-locations)
* [Initialisation](#initialisation)
* [Safety Guidelines](#safety-guidelines)
* [Performance Guidelines](#performance-guidelines)
* [Further Reading](#further-reading)

<!-- vim-markdown-toc -->

## Introduction

OCaml 5.00 adds support for using multiple CPU cores in a single OCaml process.
OCaml 5.0 adds support for using multiple CPU cores in a single OCaml process.
An OCaml process is made up of one or more *domains*, and
the operating system can run each domain on a different core, so that they run in parallel.
This can make programs run much faster, but also introduces new ways for programs to go wrong.
@@ -446,7 +447,7 @@ So it will always see a correct list:
- : unit = ()
```

## Guidelines
## Safety Guidelines

It's important to understand the above to avoid writing incorrect code,
but there are several general principles that avoid most problems:
@@ -502,6 +503,28 @@ Finally, note that OCaml remains type-safe even with multiple domains.
For example, accessing a `Queue` in parallel from multiple domains may result in a corrupted queue,
but it won't cause a segfault.
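
For contrast with the unsynchronised `Queue` case, here is a minimal sketch of a counter shared between two domains using the standard library's `Atomic` and `Domain` modules (OCaml 5). `count_in_parallel` is a name chosen for illustration only.

```ocaml
(* Two domains increment the same atomic counter [n] times each.
   Because every update goes through [Atomic.incr], no increment is lost. *)
let count_in_parallel n =
  let counter = Atomic.make 0 in
  let work () =
    for _i = 1 to n do
      Atomic.incr counter
    done
  in
  let other = Domain.spawn work in
  work ();
  Domain.join other;
  Atomic.get counter

let () = assert (count_in_parallel 10_000 = 20_000)
```

Replacing the atomic with a plain `int ref` would still be type-safe, but the final count could then be lower than expected.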

## Performance Guidelines

The following recommendations will help you extract as much performance as possible from your hardware:

- There's a certain overhead associated with placing execution onto another domain,
but that overhead will be paid off quickly if your job takes at least a few milliseconds to complete.
Jobs that complete in under 2-5 ms may not be worth running on a separate domain.
- Similarly, jobs that are 100% I/O-bound may not be worth running on a separate domain.
The small initial overhead is simply never recouped.
- If your program never hits 100% CPU usage, it's unlikely that parallelizing it will improve performance.
- Try to avoid reading from or writing to memory that's modified by other domains after the start of your job.
Ideally, your jobs shouldn't need to interact with other domains' "working data".
Aim to make your jobs as independent as possible.
If sharing is unavoidable, the [Saturn](https://github.com/ocaml-multicore/saturn) library offers a collection of efficient thread-safe data structures.
- It's often easier to design code to be multithreading-friendly from the start
(by making longer, independent jobs) than to refactor existing code.
- There's a cost associated with creating a domain, so try to use the same domains for longer periods of time.
`Eio.Executor_pool` takes care of this automatically.
- Reuse the same executor pool whenever possible; don't recreate it over and over.
- Having a large number of domains active at the same time imposes additional overhead on
both the OS scheduler and the OCaml runtime, even if those domains are idle.
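
The advice about independent jobs can be sketched as follows. For self-containment this example uses the standard library's `Domain.spawn` directly (OCaml 5) rather than `Eio.Executor_pool`, so it pays the domain-creation cost on every call; in an Eio program you would more likely submit the two halves to a long-lived pool. `sum_range` and `parallel_sum` are illustrative names.

```ocaml
(* Sum the integers in [lo, hi] using only local state. *)
let sum_range lo hi =
  let total = ref 0 in
  for i = lo to hi do
    total := !total + i
  done;
  !total

(* Split the work into two independent halves. Each half touches only its
   own [total]; the results are combined once, after [Domain.join]. *)
let parallel_sum n =
  let mid = n / 2 in
  let left = Domain.spawn (fun () -> sum_range 1 mid) in
  let right = sum_range (mid + 1) n in
  Domain.join left + right

let () = assert (parallel_sum 100_000 = sum_range 1 100_000)
```

Neither half reads or writes memory that the other modifies, so no atomics or locks are needed; the only communication is the final result returned through `Domain.join`.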

## Further Reading

- [OCaml Memory Model][] describes the full details of the memory model.
2 changes: 1 addition & 1 deletion doc/traces/Makefile
@@ -1,4 +1,4 @@
all: both-posix.svg cancel-posix.svg switch-mock.svg net-posix.svg
all: both-posix.svg cancel-posix.svg switch-mock.svg net-posix.svg multicore-posix.svg

%.svg: %.fxt
eio-trace render "$<"
Binary file added doc/traces/multicore-posix.fxt
Binary file not shown.