This proposal tries to address two long-standing gaps which separate Dart from other, more low-level languages:
- Dart developers should be able to utilize multicore capabilities without hitting limitations of the isolate model.
- Interoperability between Dart and native platforms (C/C++, Objective-C/Swift, Java/Kotlin) requires aligning concurrency models between the Dart world and the native world. The misalignment is not an issue when invoking simple synchronous native APIs from Dart, but it becomes a blocker when:
  - working with APIs pinned to a specific thread (e.g. `@MainActor` APIs in Swift or UI thread APIs on Android);
  - dealing with APIs which want to call back from native into Dart on an arbitrary thread;
  - using Dart as a container for shared business logic which can be invoked on an arbitrary thread by the surrounding native application. Native code does not understand the concept of isolates.
See What issues are we trying to solve? for concrete code examples.
Note
Improving Dart's interoperability with native code and its multicore
capabilities not only benefits Dart developers but also unlocks improvements
in the Dart SDK. For example, we will be able to move the `dart:io` implementation from C++ into Dart and later split it into a package.
This proposal:
- allows developers to selectively break the isolation boundary between isolates by declaring some static fields to be shared between isolates within an isolate group.
- introduces the concept of a shared isolate, an isolate which only has access to state shared between all isolates of the group. This concept makes it possible to bridge the interoperability gap with native code. The shared isolate becomes an answer to the previously unresolved question "what if native code wants to call back into Dart from an arbitrary thread, which isolate does the Dart code run in?".
Note
An earlier version of this proposal also introduced the concept of shareable data based on a marker interface (`Shareable`) which required the developer to explicitly opt into shared memory multithreading for their classes. In this model only instances of classes which implement the marker interface could be shared between isolates.
Based on extensive discussions with the language team and implementors I have arrived at the conclusion that this separation does not have clear benefits which are worth the associated implementation complexity. Consequently I have removed this concept from the proposal and instead propose that we eventually allow unrestricted share everything multithreading within the isolate group.
Additionally the proposal suggests a number of API changes to various core libraries which are essential to making Dart a good multithreaded language. Some of these proposals, like adding atomics, are fairly straightforward and non-controversial; others, like coroutines, are included to show the extent of possible changes and to provoke thought.
The Implementation Roadmap section of the proposal tries to suggest a possible way forward with validating some of the possible benefits of the proposal without committing to specific major changes in the Dart language.
When using isolates it is relatively straightforward to parallelize two types of workloads:
- The output is a function of the input without any significant dependency on other state, and the input is cheap to send to another isolate. In this case the developer can use `Isolate.run` to offload the computation to another isolate without paying significant costs for the transfer of data.
- The computation is self-contained and runs in the background, producing outputs that are cheap to send to another isolate. In this case a persistent isolate can be spawned and stream data to the spawner.
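The first kind of workload can be sketched with today's `Isolate.run`. The function names `sumOfSquares` and `sumOfSquaresInBackground` are made up for illustration; only `Isolate.run` comes from `dart:isolate`.

```dart
import 'dart:isolate';

/// A pure function of its input: it depends on no other program state.
int sumOfSquares(int n) {
  var sum = 0;
  for (var i = 1; i <= n; i++) {
    sum += i * i;
  }
  return sum;
}

/// Offloads the computation to a short-lived isolate. Only the small
/// integer `n` and the integer result cross the isolate boundary.
Future<int> sumOfSquaresInBackground(int n) =>
    Isolate.run(() => sumOfSquares(n));

Future<void> main() async {
  final result = await sumOfSquaresInBackground(1000);
  print(result);
}
```

This works well precisely because both the input and the output are tiny; the approach stops scaling once either of them becomes a large object graph.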
Anything else hits the problem that transferring data between isolates is asynchronous and incurs copying costs which are linear in the size of transferred data.
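A minimal sketch of these copying semantics, using only the existing `Isolate.run` API. The helper name `appendInWorker` is hypothetical.

```dart
import 'dart:isolate';

/// Appends an element to the list inside a worker isolate and returns
/// the length the worker observed afterwards.
Future<int> appendInWorker(List<int> list) => Isolate.run(() {
      // `list` here is a copy made when the closure was sent to the worker.
      list.add(4);
      return list.length;
    });

Future<void> main() async {
  final list = [1, 2, 3];
  final lengthInWorker = await appendInWorker(list);
  print(lengthInWorker); // 4: the worker mutated its own copy.
  print(list.length); // 3: the original list is unchanged.
}
```

The copy is made eagerly when the closure is sent, so its cost grows with the size of the captured list even if the worker only reads a single element.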
Consider for example a front-end for the Dart language which tries to parse a large Dart program. It is possible to parallelize parsing of strongly connected components in the import graph, however you can't fully avoid serialization costs because the resulting ASTs can't be directly shared between isolates. A similar example is parallelizing the loading of ASTs from a large Kernel binary.
Note
Users can create data structures outside of the Dart heap using `dart:ffi` to allocate native memory and view it as typed arrays and structs. However,
adopting such data representation in an existing Dart program is costly and
comes with memory management challenges characteristic of low-level
programming languages like C. That's why we would like to enable users to
share data without requiring them to manually manage lifetime of complicated
object graphs.
It is worth highlighting that shared memory multithreading does not necessarily imply simultaneous access to mutable data. Developers can still structure their parallel code using isolates and message passing, but they can avoid the cost of copying the data by sending a message which can be directly shared with the receiver rather than copied.
Consider the following C code using the miniaudio library:
static void DataCallback(
ma_device* device, void* out, const void* in, ma_uint32 frame_count) {
// Synchronously process up to |frame_count| frames.
}
ma_device_config config = ma_device_config_init(ma_device_type_playback);
// This function will be called when miniaudio needs more data.
// The call will happen on a backend specific thread dedicated to
// audio playback.
config.dataCallback = &DataCallback;
// ...
Porting this code to Dart using `dart:ffi` is currently impossible, as FFI only supports two specific callback types:
- `NativeCallable.isolateLocal`: the native caller must have exclusive access to the isolate in which the callback was created. This type of callback works if Dart calls C and C calls back into Dart synchronously. It also works if the caller uses the VM C API for entering isolates (e.g. `Dart_EnterIsolate`/`Dart_ExitIsolate`).
- `NativeCallable.listener`: the native caller effectively sends a message to the isolate which created the callback and does not synchronously wait for the response.
Neither of these works for this use case, where the native caller wants to perform a synchronous invocation. There are obvious ways to address this:
- Create a variation of `NativeCallable.isolateLocal` which enters (and after the call leaves) the target isolate if the native caller is not already in the target isolate.
- Create `NativeCallable.onTemporaryIsolate` which spawns (and after the call destroys) a temporary isolate to handle the call.
Neither of these is truly satisfactory:
- Allowing `isolateLocal` to enter the target isolate means that it can block the caller if the target isolate is busy doing something (e.g. processing some events) in parallel. This is unacceptable in situations when the caller is latency sensitive, e.g. an audio thread or even just the main UI thread of an application.
- Using a temporary isolate comes with a number of ergonomic problems, as every invocation is handled using a freshly created environment and no static state is carried between invocations, which might surprise the developer. Sharing mutable data between invocations requires storing it outside of the Dart heap using `dart:ffi`.
Note
This particular example might not look entirely convincing because it can be reasonably well solved within the confines of the isolate model.
When you look at an isolate as a bag of state guarded by a mutex, you eventually realize that this bag is simply way too big - it encompasses the static state of the whole program - and this is what makes isolates unhandy to use. The rule of thumb is that coarse locks lead to scalability problems.
How do you solve it?
Spawn an isolate that does one specific small task. In the context of this example:
- One isolate (consumer) is spawned to do nothing but synchronously handle `DataCallback` calls (this requires an extension of `isolateLocal` which enters and leaves the isolate).
- Another isolate (producer) is responsible for generating the data which is fed to the audio library. The data is allocated in the native heap and directly shared with the consumer.
However, isolates don't facilitate this style of programming. They are too
coarse - so it is easy to make a mistake, touch a static state you are not
supposed to touch, call a dependency which schedules an asynchronous task,
etc. Furthermore, you do still need a low overhead communication channel
between isolates. The shared memory is still part of the solution here, even
though in this particular example we can manage with what `dart:ffi` allows us. And that is, in my opinion, a pretty strong signal in favor of more shared
memory support in the language.
Another variation of this problem occurs when trying to use Dart for sharing
business logic and creating shared libraries. Imagine that `dart:ffi` provided a way to export static functions as C symbols:
// foo.dart
import 'dart:ffi' as ffi;
// See https://dartbug.com/51383 for discussion of [ffi.Export] feature.
@ffi.Export()
void foo() {
}
Compiling this produces a shared library exporting a symbol with C signature:
// foo.h
extern "C" void foo();
The native code loads the shared library and calls this symbol to invoke Dart code. Would this not be great?
Unfortunately there is currently no satisfactory way to define what happens when native code calls this exported symbol, as the execution of `foo` is only meaningful within a specific isolate. Should `foo` create an isolate lazily? Should there be a single isolate or multiple isolates? What happens if `foo` is called concurrently from different threads? When should this isolate be destroyed?
These are all questions without satisfactory answers due to misalignment in execution modes between the native caller and Dart.
Finally, a variation of the interop problem exists in the opposite direction: invoking a native API from Dart on a specific thread. Consider the following code for displaying a file open dialog on Mac OS X:
NSOpenPanel* panel = [NSOpenPanel openPanel];
// Open the panel and return. When user selects a file
// the passed block will be invoked.
[panel beginWithCompletionHandler: ^(NSInteger result){
// Handle the result.
}];
Trying to port this code to Dart hits the following issue: you can only use this API on the UI thread, and Dart's main isolate is not running on the UI thread. Workarounds similar to those discussed before can be applied here as well. You wrap the piece of Dart code you want to call on a specific thread into a function and then:
- Send a Dart `isolateLocal` callback to be executed on the specific thread, but make it enter (and leave) the target isolate.
- Create an isolate specific to the target thread (e.g. a special platform isolate for running on the main platform thread) and have callbacks run in that isolate.
However the issues described above equally apply here: you either hit a problem with stalling the caller by waiting to acquire an exclusive access to an isolate or you hit a problem with ergonomics around the lack of shared state.
See go/dart-interop-native-threading and go/dart-platform-thread for more details around the challenge of crossing isolate-to-thread chasm and why all different solutions fall short.
Before we discuss our proposal for Dart it is worth looking at what other popular and niche languages do around shared memory multithreading. If you feel familiar with the space, feel free to skip to the Shared Isolate section.
C/C++, Java, Scala, Kotlin, C# all have what I would call an unrestricted shared memory multithreading:
- objects can be accessed (read and written to) from multiple threads at once
- static state is shared between all threads
- you can spawn new threads using core APIs, which are part of the language
- you can execute code on any thread, even when the thread was spawned externally: thread spawned by Java can execute C code and the other way around thread spawned by C can execute Java code (after a little dance of attaching Java VM to a thread).
Python and Ruby, despite being scripting languages, both provide similar capabilities around multithreading as well (see `Thread` in Ruby and the `threading` library in Python). The following Ruby program will spawn 5 threads which all update the same global variable:
count = 0
threads = 5.times.map do |i|
puts "T#{i}: starting"
Thread.new do
count += 1
puts "T#{i}: done"
end
end
threads.each { |t| t.join }
puts "Counted to #{count}"
$ ruby test.rb
T0: starting
T1: starting
T2: starting
T3: starting
T4: starting
T1: done
T4: done
T0: done
T2: done
T3: done
Counted to 5
Concurrency in both languages is severely limited by a global lock which protects the interpreter's integrity. This lock is known as the Global Interpreter Lock (GIL) in Python and the Global VM Lock (GVL) in Ruby. The GIL/GVL ensures that an interpreter is only running on one thread at a time. Scheduling mechanisms built into the interpreter allow it to switch between threads, giving each a chance to run concurrently. This means executions of Python/Ruby code on different threads are interleaved, but serialized. You can observe non-atomic behaviors and data races (the VM will not crash though), but you can't utilize multicore capabilities. CPython developers are actively exploring the possibility of removing the GIL (see PEP 703).
Erlang is a functional programming language for creating highly concurrent distributed systems which represents another extreme: no shared memory multithreading or low-level threading primitives at all. Design principles behind Erlang are summarized in Joe Armstrong's PhD thesis Making reliable distributed systems in the presence of software errors. Isolation between lightweight processes which form a running application is the idea at the very core of Erlang's design, to quote section 2.4.3 of the thesis:
The notion of isolation is central to understanding COP, and to the construction of fault-tolerant software. Two processes operating on the same machine must be as independent as if they ran on physically separated machines
...
Isolation has several consequences:
- Processes have “share nothing” semantics. This is obvious since they are imagined to run on physically separated machines.
- Message passing is the only way to pass data between processes. Again since nothing is shared this is the only means possible to exchange data.
- Isolation implies that message passing is asynchronous. If process communication is synchronous then a software error in the receiver of a message could indefinitely block the sender of the message destroying the property of isolation.
- Since nothing is shared, everything necessary to perform a distributed computation must be copied. Since nothing is shared, and the only way to communicate between processes is by message passing, then we will never know if our messages arrive (remember we said that message passing is inherently unreliable.) The only way to know if a message has been correctly sent is to send a confirmation message back.
JavaScript is a variation of communicating event-loops model and its
capabilities clearly both inspired and defined capabilities of Dart's own
isolate model. An isolated JavaScript environment allows only for a single
thread of execution, but multiple such environments
(workers) can be spawned in parallel. These workers share
no object state and communicate via message passing which copies the data sent
with the exception of transferable objects. Recent versions of JavaScript poked a hole in the isolation boundary by introducing `SharedArrayBuffer`, which allows developers to share unstructured blobs of memory between workers.
Go's concurrency model can be seen as an implementation of the communicating sequential processes formalism proposed by Hoare. Go applications are collections of communicating goroutines, lightweight threads managed by the Go runtime. These are somewhat similar to Erlang processes, but are not isolated from each other and instead execute inside a shared memory space. Goroutines communicate using message passing through channels. Go does not prevent the developer from employing shared memory and provides a number of classical synchronization primitives like `Mutex`, but heavily discourages this.
Effective Go contains the following slogan:
Do not communicate by sharing memory; instead, share memory by communicating.
Note
It is worth pointing out that managed languages which try to hide shared memory (isolated environments of JavaScript) and languages which try to hide threading (Go, JavaScript, Erlang) are bound to have difficulties communicating with languages which don't hide these things. These differences create an impedance mismatch between native caller and managed callee or the other way around. This is similar to what Dart is experiencing.
Rust gives developers access to shared memory multithreading, but leans on ownership expressed through its type system to avoid common programming pitfalls. See Fearless Concurrency for an overview. Rust provides developers with tools to structure their code using both shared-state concurrency and message passing concurrency. Rust's type system makes it possible to express the ownership transfer associated with message passing, which means the message does not need to be copied to avoid accidental sharing.
Rust is not alone in using its type system to eliminate data races. Another example is Pony and its reference capabilities.
Swift does not fully hide threads and shared memory multithreading, but it provides high-level concurrency abstractions (tasks and actors) on top of low-level mechanisms (see Swift concurrency for details). Swift actors provide a built-in mechanism to serialize access to mutable state: each actor comes with an executor which runs tasks accessing the actor's state. External code must use asynchronous calls to access the actor:
actor A {
  // Stored properties need an initial value because no initializer is declared.
  var data: [Int] = []
  func add(value: Int) -> Int {
    // Code inside an actor is fully synchronous because
    // it has exclusive access to the actor.
    data.append(value)
    return data.count
  }
}
let actor = A()
func updateActor() async {
  // Code outside of the actor is asynchronous. The actor
  // might be busy so we might need to suspend and wait
  // for the reply.
  let count = await actor.add(value: 10)
}
Swift is moving towards enforcing "isolation" between concurrency domains: a value can only be shared across the boundary (e.g. from one actor to another) if it conforms to the `Sendable` protocol. The compiler is capable of validating obvious cases: a deeply immutable type or a value type which will be copied across the boundary are both obviously `Sendable`. For more complicated cases, which can't be checked automatically, developers have an escape hatch of simply declaring their type as conformant by writing `@unchecked Sendable`, which disables compiler enforcement. Hence my choice of putting isolation in quotes.
OCaml is a multi-paradigm programming language from the ML family of programming languages. Prior to version 5.0 OCaml relied on a global runtime lock which serialized access to the runtime from different threads, meaning that only a single thread could run OCaml code at a time. This put OCaml into the same category as Python/Ruby with their GIL/GVL. However, in 2022, after 8 years of work, OCaml 5.0 brought multicore capabilities to OCaml. The OCaml Multicore project was seemingly focused on two things:
- Modernizing runtime system and GC in particular to support multiple threads of execution using runtime in parallel (see Retrofitting Parallelism onto OCaml).
- Incorporating effect handlers into the OCaml runtime system as a generic mechanism on top of which more concrete concurrency mechanisms (e.g. lightweight threads, coroutines, `async/await` etc) could be implemented (see Retrofitting Effect Handlers onto OCaml).
The unit of parallelism in OCaml is a domain: an OS thread plus some associated runtime structures (e.g. a thread-local allocation buffer). Domains are not isolated from each other: they allocate objects in a global heap which is shared between all domains, and they can access and mutate shared global state.
Normally each isolate gets its own fresh copy of all static fields. If one
isolate changes one of the fields no other isolate can observe this change.
I propose to punch a hole in this boundary by allowing the programmer to opt out of this isolation: a field marked as `shared` will be shared between all isolates. Changing a field in one isolate can be observed from another isolate.
Shared fields should guarantee atomic initialization: if multiple threads access the same uninitialized field then only one thread will invoke the initializer and initialize the field; all other threads will block until initialization is complete.
In shared everything multithreading, shared fields can be allowed to contain anything, including instances of mutable Dart classes. However, initially I propose to limit shared fields by allowing only trivially shareable types. These types are those which can already pass through `SendPort` without copying:
- strings;
- numbers;
- deeply immutable types;
- builtin implementations of `SendPort` and `TypedData`;
- tear-offs of static methods;
- closures which capture variables of trivially shareable types.
Sharing these types does not break isolate boundaries.
Note
It might seem strange to include mutable types like `TypedData` among the trivially shareable types, but in reality allowing these types to be shared does not introduce any fundamentally new capabilities. A `TypedData` instance can already be backed by native memory and as such shared between two isolates.
Note
Types like `SendPort` are not `final`, so strictly speaking we can't decide whether an instance of `SendPort` is trivially shareable or not based on the static type alone. Instead we must dynamically check if the `SendPort` is an internal implementation or not. A similar tweak should probably be applied to the specification of `@pragma('vm:deeply-immutable')`, allowing classes containing `SendPort` fields to be marked deeply immutable at the cost of introducing additional runtime checks when the object is created.
Caution
Shared field reads and writes are atomic for reference types, but other than that there is no implicit synchronization, locking or strong memory barriers associated with shared fields. Possible executions in terms of observed values will be specified by Dart's memory model, which I propose to model after JavaScript's and Go's: a program which is free of data races will execute in a sequentially consistent manner.
Furthermore, shared fields of `int` and `double` types are allowed to exhibit tearing on 32-bit platforms.
Let's take another look at the following example:
import 'dart:isolate';
int global = 0;
void main() async {
global = 42;
await Isolate.run(() {
print(global); // => 0
global = 24;
});
print(global); // => 42
}
Stripped to the bare minimum the example does not seem to behave in a confusing
way: it seems obvious that each isolate has its own version of the `global` variable and mutations of `global` are not visible across isolates. However, in real-world code such behavior might be hidden deep inside a third-party dependency
and thus much harder to detect and understand. This behavior also makes
interoperability with native code more awkward than it ought to be: calling Dart
requires an isolate, something that native code does not really know or care
about. Consider for example the following code:
int global = 0;
@pragma('vm:entry-point')
int foo() => global++;
The result of calling `foo` from the native side depends on which isolate the call occurs in.
`shared` global variables allow developers to tackle this problem, but a hidden dependency on global state might introduce bugs which are hard to diagnose and debug.
I propose to tackle this problem by introducing the concept of a shared isolate: code running in a shared isolate can only access `shared` state and not any isolated state; an attempt to access isolated state results in a dynamic `IsolationError`.
// dart:isolate
class Isolate {
/// Run the given function [f] in the _shared isolate_.
///
/// Shared isolate contains a copy of the
/// global `shared` state of the current isolate and does not have any
/// non-`shared` state of its own. An attempt to access non-`shared` static variable throws [IsolationError].
external static Future<S> runShared<S>(S Function() task);
}
int global = 0;
shared int sharedGlobal = 0;
void main() async {
global = 42;
sharedGlobal = 42;
await Isolate.runShared(() {
print(global); // IsolationError: Can't access 'global' when running in shared isolate
global = 24; // IsolationError: Can't access 'global' when running in shared isolate
print(sharedGlobal); // => 42
sharedGlobal = 24;
});
print(global); // => 42
print(sharedGlobal); // => 24
}
Note
It is tempting to try introducing a compile-time separation between functions which only access `shared` state and functions which can access isolated state. However, an attempt to fit such a separation into the existing language requires significant and complex language changes: the type system would need capabilities to express which functions touch isolated state and which only touch shared state. Things will get especially complicated around higher-order functions like those on `List`.
The introduction of the shared isolate finally allows us to address the problem of native code invoking Dart callbacks from arbitrary threads. `NativeCallable` can be extended with the corresponding constructor:
class NativeCallable<T extends Function> {
/// Constructs a [NativeCallable] that can be invoked from any thread.
///
/// When the native code invokes the function [nativeFunction], the
/// corresponding [callback] will be synchronously executed on the same
/// thread within a shared isolate corresponding to the current isolate group.
///
/// Throws [ArgumentError] if [callback] captures state which can't be
/// transferred to shared isolate without copying.
external factory NativeCallable.shared(
@DartRepresentationOf("T") Function callback,
{Object? exceptionalReturn});
}
The function pointer returned by `NativeCallable.shared(...).nativeFunction` will be bound to the isolate group which produced it, using the same trampoline mechanism FFI uses to create function pointers from closures. The returned function pointer can be called by native code from any thread. It does not require exclusive access to a specific isolate and thus avoids the interoperability pitfalls associated with that:
- No need to block the native caller and wait for the target isolate to become available.
- Clear semantics of globals:
  - `shared` global state is accessible and independent from the current thread;
  - accessing non-`shared` state will throw an `IsolationError`.
In a shared everything multithreading world `callback` can be allowed to capture arbitrary state, however in shared native memory multithreading this state has to be restricted to trivially shareable types:
// This code is okay because `int` is trivially shareable.
int counter = 0;
NativeCallable.shared(() {
counter++;
});
// This code is not okay because `List<T>` is not trivially shareable.
List<int> list = [];
NativeCallable.shared(() {
list.add(1);
});
The introduction of the shared isolate allows us to adjust our deployment story and make it simpler for native code to link, either statically or dynamically, to Dart code.
Note
Below when I say native library I mean a static library or shared object produced from Dart code using an AOT compiler. Such native library can be linked with the rest of the native code in the application either statically at build time or dynamically at runtime using appropriate native linkers provided by the native toolchain or the OS. The goal here is that using Dart from a native application becomes indistinguishable from using a simple C library.
Consider the example previously given in the Interoperability section:
// foo.dart
import 'dart:ffi' as ffi;
// See https://dartbug.com/51383 for discussion of [ffi.Export] feature.
@ffi.Export()
void foo() {
}
which produces a native library exporting a C symbol:
// foo.h
extern "C" void foo();
Shared isolates give us a tool to define what happens when `foo` is invoked by a native caller:
- There is a 1-1 correspondence between the native library and an isolate group corresponding to this native library (e.g. there is a static variable somewhere in the library containing a pointer to the corresponding isolate group).
- When an exported symbol is invoked the call happens in the shared isolate of that isolate group.
Note
Precise mechanism managing isolate group's lifetime does not matter for the purposes of the document and belongs to the separate discussion.
Consider the following code:
Future<void> foo() async {
await something();
print(1);
}
void main() async {
await Isolate.runShared(() async {
await foo();
});
}
What happens when a `Future` completes in the shared isolate? Who drives the event loop of that isolate? Which thread will callbacks run on?
I propose to introduce another concept similar to `Zone`: `Executor`. Executors encapsulate the notion of the event loop and control how tasks are executed.
abstract interface class Executor {
/// Current executor
external static Executor get current;
Isolate get owner;
/// Schedules the given task to run in the given executor.
void schedule(void Function() task);
}
How a particular executor runs scheduled tasks depends on the executor itself: e.g. an executor can have a pool of threads or notify an embedder that it has tasks to run and let embedder run these tasks.
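The embedder-driven variant can be sketched in plain Dart as it exists today. Everything below is an illustrative assumption rather than part of the proposal: `TaskExecutor` is a simplified stand-in for the proposed `Executor` (it models only `schedule` and omits `current` and `owner`), and `EmbedderDrivenExecutor`, `onTasksAvailable` and `drain` are hypothetical names.

```dart
import 'dart:collection';

/// Simplified stand-in for the proposed `Executor` interface: only
/// `schedule` is modeled here.
abstract interface class TaskExecutor {
  void schedule(void Function() task);
}

/// Hypothetical executor which queues tasks and lets the embedder decide
/// when (and on which thread) to run them.
final class EmbedderDrivenExecutor implements TaskExecutor {
  final Queue<void Function()> _pending = Queue();

  /// Invoked whenever the queue goes from empty to non-empty.
  final void Function() onTasksAvailable;

  EmbedderDrivenExecutor({required this.onTasksAvailable});

  @override
  void schedule(void Function() task) {
    final wasEmpty = _pending.isEmpty;
    _pending.add(task);
    if (wasEmpty) onTasksAvailable();
  }

  /// Runs all queued tasks on the caller's thread and returns how many ran.
  int drain() {
    var ran = 0;
    while (_pending.isNotEmpty) {
      _pending.removeFirst()();
      ran++;
    }
    return ran;
  }
}

void main() {
  var notifications = 0;
  final log = <String>[];
  final executor =
      EmbedderDrivenExecutor(onTasksAvailable: () => notifications++);
  executor.schedule(() => log.add('first'));
  executor.schedule(() => log.add('second'));
  print(notifications); // 1: only the empty -> non-empty transition notifies.
  print(executor.drain()); // 2
  print(log); // [first, second]
}
```

A thread-pool-backed executor would implement the same `schedule` contract but hand tasks to worker threads instead of waiting for an explicit `drain` call.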
All built-in asynchronous primitives will make the following API guarantee: a callback passed to `Future` or `Stream` APIs will be invoked using the executor which was running the code which registered the callback.
Note
There is a clear parallel between `Executor` and `Zone`: asynchronous callbacks attached to `Stream` and `Future` are bound to the current `Zone`.
The original design suggested treating `Zone` as an executor, but this obscured the core of the proposal, so the current version splits this out into a clearly separate concept of `Executor`.
Structured concurrency is a way of structuring concurrent code where lifecycle of concurrent tasks has clear relationship to the control-flow structure of the code which spawned those tasks. One of the most important properties of structured concurrency is an ability to cancel pending subtasks and propagate the cancellation recursively.
Consider for example the following code:
Future<Result> doSomething() async {
final (a, b) = await (requestA(), computeB()).wait;
return combineIntoResult(a, b);
}
If Dart supported structured concurrency, then the following would be guaranteed:
- If either `requestA` or `computeB` fails, then the other is canceled.
- The `doSomething` computation can be canceled by the holder of the `Future<Result>` and this cancellation will be propagated into `requestA` and `computeB`.
- If `computeB` throws an error before `requestA` is awaited then `requestA` still gets properly canceled.
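For contrast, here is a minimal sketch of how the same shape of code behaves in current Dart, where the record `wait` extension from `dart:async` offers no cancellation. The bodies of `requestA` and `computeB`, the `events` log and the `runDemo` helper are invented for illustration.

```dart
import 'dart:async';

/// Records what happened, in order.
final events = <String>[];

Future<int> requestA() async {
  await Future<void>.delayed(const Duration(milliseconds: 20));
  events.add('requestA completed');
  return 1;
}

Future<int> computeB() async {
  throw StateError('computeB failed');
}

Future<List<String>> runDemo() async {
  events.clear();
  try {
    // `wait` reports failures only after every future in the record
    // has completed, so `requestA` is never interrupted.
    await (requestA(), computeB()).wait;
  } on ParallelWaitError {
    events.add('wait failed');
  }
  return List.of(events);
}

Future<void> main() async {
  print(await runDemo());
}
```

Although `computeB` fails immediately, `requestA` still runs to completion before the error is reported. With structured concurrency the failure would instead cancel `requestA`.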
Upgrading `dart:async` capabilities in the wake of shared-memory multithreading is also a good time to introduce some of the structured concurrency concepts into the language. See the Cancellable Future proposal for the details of how this could work in Dart.
See also:
`dart:concurrent` will serve as a library hosting low-level concurrency primitives.
Isolates are not threads even though they are often confused with them. Code running within an isolate might be executing on a dedicated OS thread or it might be running on a dedicated pool of threads. When expanding Dart's multithreading capabilities it seems reasonable to introduce more explicit ways to control threads.
// dart:concurrent
abstract class Thread {
  /// Runs the given function in a new thread.
  ///
  /// The function is run in the shared isolate, meaning that
  /// it will not have access to the non-shared state.
  ///
  /// The function will be run in a `Zone` which uses the
  /// spawned thread as an executor for all callbacks: this
  /// means the thread will remain alive as long as there is
  /// a callback referencing it.
  external static Thread start<T>(FutureOr<T> Function() main);

  /// Current thread on which the execution occurs.
  ///
  /// Note: Dart code is only guaranteed to be pinned to a specific OS thread
  /// during a synchronous execution.
  external static Thread get current;

  external Future<void> join();

  external void interrupt();

  external set priority(ThreadPriority value);
  external ThreadPriority get priority;
}

/// An [Executor] backed by a fixed size thread
/// pool and owned by the shared isolate of the
/// current isolate (see [Isolate.runShared]).
abstract class ThreadPool implements Executor {
  external factory ThreadPool({required int concurrency});
}
Additionally I think providing a way to synchronously execute code in a specific isolate on the current thread might be useful:
class Isolate {
  external T runSync<T>(T Function() cb);
}
Consider the following example:
Thread.start(() async {
  var v = await foo();
  var u = await bar();
});
Connecting execution / scheduling behavior to Zone allows us to give clear semantics to this code: it will run on a specific (newly spawned) thread and will not change threads between suspensions and resumptions.
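The hook that makes this possible already exists in dart:async: a Zone decides how callbacks scheduled inside it are run. The sketch below does not use the proposed Thread API (which does not exist yet). It only shows, on today's Dart, that a ZoneSpecification sees the callbacks scheduled within its zone; the scheduled counter is a hypothetical name, and a thread-backed zone would presumably enqueue the callback on its own thread instead of delegating to the parent.

```dart
import 'dart:async';

/// Number of microtasks that were routed through the custom zone.
int scheduled = 0;

Future<void> main() async {
  final spec = ZoneSpecification(
    scheduleMicrotask: (self, parent, zone, f) {
      scheduled++;
      // Delegate to the parent zone. An executor-backed zone would
      // hand `f` to its own thread here instead (assumption).
      parent.scheduleMicrotask(zone, f);
    },
  );

  await runZoned(() async {
    final v = await Future.value(1);
    final u = await Future.value(2);
    // Explicitly scheduled work goes through the same hook.
    scheduleMicrotask(() => print('sum: ${v + u}'));
  }, zoneSpecification: spec);

  print('went through the zone: ${scheduled > 0}');
}
```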
AtomicRef<T> is a wrapper around a value of type T which can be updated atomically. It can only be used with true reference types - an attempt to create an AtomicRef<int>, AtomicRef<double> or AtomicRef<(T1, ..., Tn)> will throw. The reason for disallowing these types is to avoid implementation complexity in compareAndSwap, which is defined in terms of identity. We also impose the same restriction on the expected argument of compareAndSwap.
Note
AtomicRef uses method based load / store API instead of simple getter/setter API (i.e. abstract T value) for two reasons:
- We want to align this API with that of extensions like Int32ListAtomics, which use atomicLoad / atomicStore naming.
- We want to keep a possibility to later extend these methods, e.g. add a named parameter which specifies particular memory ordering.
// dart:concurrent
final class AtomicRef<T> {
  /// Creates an [AtomicRef] initialized with the given value.
  ///
  /// Throws `ArgumentError` if `T` is a subtype of [num] or [Record].
  external factory AtomicRef(T initialValue);

  /// Atomically updates the current value to [desired].
  ///
  /// The store has release memory order semantics.
  external void store(T desired);

  /// Atomically reads the current value.
  ///
  /// The load has acquire memory order semantics.
  external T load();

  /// Atomically compares whether the current value is identical to
  /// [expected] and if it is sets it to [desired] and returns
  /// `(true, expected)`.
  ///
  /// Otherwise the value is not changed and `(false, currentValue)` is
  /// returned.
  ///
  /// Throws `ArgumentError` if `expected` is an instance of [int], [double] or
  /// [Record].
  external (bool, T) compareAndSwap(T expected, T desired);
}

final class AtomicInt32 {
  external void store(int value);
  external int load();
  external (bool, int) compareAndSwap(int expected, int desired);
  external int fetchAdd(int v);
  external int fetchSub(int v);
  external int fetchAnd(int v);
  external int fetchOr(int v);
  external int fetchXor(int v);
}

final class AtomicInt64 {
  external void store(int value);
  external int load();
  external (bool, int) compareAndSwap(int expected, int desired);
  external int fetchAdd(int v);
  external int fetchSub(int v);
  external int fetchAnd(int v);
  external int fetchOr(int v);
  external int fetchXor(int v);
}

extension Int32ListAtomics on Int32List {
  external void atomicStore(int index, int value);
  external int atomicLoad(int index);
  external (bool, int) compareAndSwap(int index, int expected, int desired);
  external int fetchAdd(int index, int v);
  external int fetchSub(int index, int v);
  external int fetchAnd(int index, int v);
  external int fetchOr(int index, int v);
  external int fetchXor(int index, int v);
}

extension Int64ListAtomics on Int64List {
  external void atomicStore(int index, int value);
  external int atomicLoad(int index);
  external (bool, int) compareAndSwap(int index, int expected, int desired);
  external int fetchAdd(int index, int v);
  external int fetchSub(int index, int v);
  external int fetchAnd(int index, int v);
  external int fetchOr(int index, int v);
  external int fetchXor(int index, int v);
}

// These extension methods will only work on fixed-length builtin
// List<T> type and will throw an error otherwise.
extension RefListAtomics<T> on List<T> {
  external void atomicStore(int index, T value);
  external T atomicLoad(int index);
  // Takes an [index] like the other list based atomics.
  external (bool, T) compareAndSwap(int index, T expected, T desired);
}
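The compareAndSwap contract of AtomicRef is easiest to read from a retry loop. The sketch below is only a single-threaded model: ModelAtomicRef, Node and push are hypothetical names, and the class is not atomic at all. It merely mirrors the proposed signatures and return values so that the loop can run on today's Dart.

```dart
/// Single-threaded stand-in mirroring the proposed `AtomicRef` API.
/// NOT atomic: it only models the return-value contract.
final class ModelAtomicRef<T> {
  T _value;
  ModelAtomicRef(this._value);

  T load() => _value;

  void store(T desired) => _value = desired;

  /// Returns `(true, expected)` on success and
  /// `(false, currentValue)` otherwise, comparing by identity.
  (bool, T) compareAndSwap(T expected, T desired) {
    if (identical(_value, expected)) {
      _value = desired;
      return (true, expected);
    }
    return (false, _value);
  }
}

/// Immutable node of a singly linked stack.
final class Node {
  final String item;
  final Node? next;
  Node(this.item, this.next);
}

/// Pushes [item] using the usual compare-and-swap retry loop.
void push(ModelAtomicRef<Node?> head, String item) {
  var current = head.load();
  while (true) {
    final (ok, seen) = head.compareAndSwap(current, Node(item, current));
    if (ok) return;
    // Lost the race: retry against the value that was observed.
    current = seen;
  }
}

void main() {
  final head = ModelAtomicRef<Node?>(null);
  push(head, 'a');
  push(head, 'b');
  print(head.load()!.item); // b
  print(head.load()!.next!.item); // a
}
```

Returning the observed value together with the success flag is what lets the loop retry without issuing a separate load.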
At the bare minimum libraries should provide a non-reentrant Lock and a Condition. However we might want to provide more complicated synchronization primitives like re-entrant or reader-writer locks.
// dart:concurrent

// Non-reentrant Lock.
final class Lock {
  external void acquireSync();
  external bool tryAcquireSync({Duration? timeout});

  external void release();

  external Future<void> acquire();
  external Future<bool> tryAcquire({Duration? timeout});
}

final class Condition {
  external bool waitSync(Lock lock, {Duration? timeout});
  external Future<bool> wait(Lock lock, {Duration? timeout});

  external void notify();
  external void notifyAll();
}
Note
Java has a number of features around synchronization:
- It allows any object to be used for synchronization purposes.
- It has convenient syntax for grabbing a monitor associated with an object: synchronized (obj) { /* block */ }.
- It allows marking methods with the synchronized keyword - which is more or less equivalent to wrapping the method's body into a synchronized block.
I don't think we want these features in Dart:
- Supporting synchronization on any object comes with severe implementation complexity.
- A closure based API R withLock<R>(lock, R Function() body) should provide a good enough alternative to special syntactic forms like synchronized.
- An explicit locking in the body of the method is clearer than implicit locking introduced by an attribute.
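As a rough illustration of the closure based alternative, withLock can be written as an ordinary function using try/finally. ModelLock below is a hypothetical single-threaded stand-in for the proposed Lock (it only tracks whether the lock is held), so that the sketch runs on today's Dart:

```dart
/// Hypothetical stand-in for the proposed non-reentrant `Lock`.
/// Only `acquireSync` and `release` are modelled, and nothing blocks.
final class ModelLock {
  bool _held = false;

  bool get isHeld => _held;

  void acquireSync() {
    if (_held) throw StateError('non-reentrant lock is already held');
    _held = true;
  }

  void release() {
    if (!_held) throw StateError('lock is not held');
    _held = false;
  }
}

/// Runs [body] while holding [lock] and releases the lock
/// afterwards, even if [body] throws.
R withLock<R>(ModelLock lock, R Function() body) {
  lock.acquireSync();
  try {
    return body();
  } finally {
    lock.release();
  }
}

void main() {
  final lock = ModelLock();
  var counter = 0;

  print(withLock(lock, () => ++counter)); // 1

  try {
    withLock<void>(lock, () => throw const FormatException('boom'));
  } on FormatException {
    // The error propagates to the caller...
  }
  print(lock.isHeld); // false: ...and the lock was still released.
}
```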
Given that we are adding support for OS threads we should consider if we want to add support for coroutines (also known as fibers, or lightweight threads) as well.
abstract interface class Coroutine {
  /// Returns the currently running coroutine, if any.
  external static Coroutine? get current;

  /// Creates a suspended coroutine which will execute the given
  /// [body] when resumed.
  external static Coroutine create(void Function() body);

  /// Suspends the currently running coroutine.
  ///
  /// This makes `resume` return.
  external static void suspend();

  /// Resumes a previously suspended coroutine.
  ///
  /// If there is a coroutine currently running then suspends it
  /// first.
  external void resume();

  /// Resumes a previously suspended coroutine with an exception.
  external void resumeWithException(Object error, [StackTrace? st]);
}
Coroutines are a very powerful abstraction which allows writing straight-line code that depends on asynchronous values.
Future<String> request(String uri);

extension FutureSuspend<T> on Future<T> {
  T get value {
    // `throw` has to be parenthesized when used as an operand of `??`.
    final cor = Coroutine.current ?? (throw 'Not on a coroutine');
    late final T value;
    this.then((v) {
      value = v;
      cor.resume();
    }, onError: cor.resumeWithException);
    // `suspend` is static: it suspends the currently running coroutine.
    Coroutine.suspend();
    return value;
  }
}

List<String> requestAll(List<String> uris) =>
    Future.wait(uris.map(request)).value;

SomeResult processUris(List<String> uris) {
  final data = requestAll(uris);
  // some processing of [data]
  // ...
}

void main() {
  final uris = [...];
  Coroutine.create(() {
    final result = processUris(uris);
    print(result);
  }).resume();
}
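For comparison, here is a sketch of how the same flow has to be written in today's Dart, where there are no coroutines: every function on the call path becomes async and returns a Future. The body of request and the int result are hypothetical stand-ins for the elided parts of the example above.

```dart
import 'dart:async';

/// Hypothetical stand-in for the `request` function.
Future<String> request(String uri) async => 'response for $uri';

/// Has to return a Future: there is no way to block on the result.
Future<List<String>> requestAll(List<String> uris) =>
    Future.wait(uris.map(request));

/// The asynchrony propagates to every caller.
Future<int> processUris(List<String> uris) async {
  final data = await requestAll(uris);
  // Stand-in for "some processing of [data]".
  return data.length;
}

Future<void> main() async {
  print(await processUris(['a', 'b'])); // 2
}
```

With coroutines, requestAll and processUris keep their synchronous signatures and only the code which creates the coroutine needs to know about the suspension.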
It might be useful to augment existing types like Future, Stream and ReceivePort with blocking APIs which would only be usable in a shared isolate, and only under the condition that blocking is not going to block the executor's event loop.
dart:ffi should expose atomic reads and writes for native memory.
extension Int32PointerAtomics on Pointer<Int32> {
  external void atomicStore(int value);
  external int atomicLoad();
  external (bool, int) compareAndSwap(int expected, int desired);
  external int fetchAdd(int v);
  external int fetchSub(int v);
  external int fetchAnd(int v);
  external int fetchOr(int v);
  external int fetchXor(int v);
}

extension IntPtrPointerAtomics on Pointer<IntPtr> {
  external void atomicStore(int value);
  external int atomicLoad();
  external (bool, int) compareAndSwap(int expected, int desired);
  external int fetchAdd(int v);
  external int fetchSub(int v);
  external int fetchAnd(int v);
  external int fetchOr(int v);
  external int fetchXor(int v);
}

extension PointerPointerAtomics<T> on Pointer<Pointer<T>> {
  external void atomicStore(Pointer<T> value);
  external Pointer<T> atomicLoad();
  external (bool, Pointer<T>) compareAndSwap(
      Pointer<T> expected, Pointer<T> desired);
}
For convenience reasons we might also consider making the following work:
final class MyStruct extends Struct {
  @Int32()
  external final AtomicInt32 value;
}
The user is expected to use a.value.store(...) and a.value.load(...) to access the value.
Caution
Support for AtomicInt<N> in FFI structs is meant to enable atomic access to fields without requiring developers to go through Pointer based atomic APIs. It is not meant as a way to interoperate with structs that contain std::atomic<int32_t> (C++) or _Atomic int32_t (C11) because these types don't have a defined ABI.
We start by implementing shared isolates and allowing shared global fields (designated via @pragma('vm:shared') rather than a keyword) of trivially shareable types. We then expose shared isolates to FFI by introducing NativeCallable.shared and allowing calls into an isolate group from an arbitrary thread.
These changes do not significantly change the shape of the Dart programming language. They streamline interoperability with native code but do not introduce any new fundamental capabilities: developers can already share native memory between isolates, and these changes simply make such sharing more convenient to use. There is no sharing of mutable Dart objects at this stage yet.
Consequently I feel that this set of features (shared native memory multithreading) can be shipped to Dart developers and that it will significantly streamline our interoperability story.
Separately from this we will work on allowing arbitrary Dart objects to be shared under an experimental flag, and use these capabilities to prototype multicore based optimizations in either CFE or analyzer and to assess the usability and the impact of the change.
A memory model describes the range of possible behaviors of multi-threaded programs which read and write shared memory. A programmer looks at the memory model to understand how their program will behave. A compiler engineer looks at the memory model to figure out which code transformations and optimizations are valid. The table below provides an overview of memory models for some widely used languages.
Language | Memory Model |
---|---|
C# | Language specification itself (ECMA-334) does not describe any memory model. Instead the memory model is given in Common Language Infrastructure (ECMA-335, section I.12.6 Memory model and optimizations). ECMA-335 memory model is relatively weak and CLR provides stronger guarantees documented here. See dotnet/runtime#63474 and dotnet/runtime#75790 for some additional context. |
JavaScript | Memory model is documented in the Memory Model section of ECMA-262. This memory model is fairly straightforward: it guarantees sequential consistency for atomic operations, while leaving other operations unordered. |
Java | Given in Java Language Specification (JLS) section 17.4 |
Kotlin | Not documented. Kotlin/JVM effectively inherits Java's memory model. Kotlin/Native - does not have a specified memory model, but likely follows JVM's one as well. |
C++ | Given in the section Multi-threaded executions and data races of the standard (since C++11). Notably very fine grained |
Rust | No official memory model according to reference, however it is documented to "blatantly inherit C++20 memory model" |
Swift | Defined to be consistent with C/C++. See SE-0282 |
Go | Given here: race free programs have sequential consistency, programs with races still have some non-deterministic but well-defined behavior. |
When expanding Dart's capabilities we need to consider whether these semantics can be implemented across the platforms that Dart runs on.
On native platforms there are no blockers to implementing any multithreading behavior. The VM already has a concept of isolate groups: multiple isolates concurrently sharing the same heap and runtime infrastructure (GC, program structure, debugger, etc).
JavaScript has no shared memory multithreading currently (beyond unstructured binary data shared via SharedArrayBuffer). However there is a Stage 1 TC-39 proposal JavaScript Structs: Fixed Layout Objects and Some Synchronization Primitives which introduces the concept of struct - a fixed shape mutable object which can be shared between different workers. Structs can't have any methods associated with them. This makes structs unsuitable for representing arbitrary Dart classes - which usually have methods associated with them.
Wasm GC does not have a well defined concurrency story, but a shared-everything-threads proposal is under design. This proposal seems expressive enough for us to be able to implement proposed semantics on top of it.
Note
That said, the shared memory Wasm proposal has issues which make it challenging for Dart to adopt it:
- It prohibits shareable and non-shareable structs from being subtypes of each other.
- It prohibits externref inside shareable structs.
If Dart introduces shared memory multithreading it would need to mark the struct type representing Object as shared, but this means Dart objects can no longer directly contain externrefs inside them.
Assuming that Wasm is going to move forward with type based partitioning, we would need to resolve this conundrum by employing some sort of thread local wrapper, which can be implemented on top of TLS storage and WeakMap.
When implementing dart:* libraries we should keep racy access in mind. Consider for example the following code:
class _GrowableList<T> implements List<T> {
  int length;
  final _Array storage;

  T operator [](int index) {
    RangeError.checkValidIndex(index, this, "index", this.length);
    return unsafeCast(storage[index]); // (*)
  }

  T removeLast() {
    // The last element lives at index `length - 1`.
    final result = this[length - 1];
    storage[--length] = null;
    return result;
  }
}
This code is absolutely correct in a single-threaded environment, but can cause a heap-safety violation in a racy program: if operator[] races with removeLast then storage[index] might return null even though checkValidIndex succeeded.
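The problematic interleaving can be replayed deterministically on a single thread by splitting operator[] into its two steps. ModelList below is a hypothetical simplification of _GrowableList (a plain List<Object?> stands in for _Array and there is no unsafeCast), so this is only a sketch of the race, not of the real implementation:

```dart
/// Hypothetical simplification of `_GrowableList`.
final class ModelList {
  int length;
  final List<Object?> storage;

  ModelList(List<Object> items)
      : length = items.length,
        storage = [...items];

  /// First half of `operator[]`: the bounds check.
  void checkIndex(int index) =>
      RangeError.checkValidIndex(index, storage, 'index', length);

  /// Second half of `operator[]`: the raw read, marked (*) above.
  Object? rawRead(int index) => storage[index];

  Object? removeLast() {
    final result = storage[length - 1];
    storage[--length] = null;
    return result;
  }
}

void main() {
  final list = ModelList(['a', 'b']);

  list.checkIndex(1); // "Thread 1": the bounds check succeeds.
  list.removeLast(); // "Thread 2": removes the element in between.
  print(list.rawRead(1)); // null, although the check succeeded.
}
```

In the real list the null would then be cast to a potentially non-nullable T, which is exactly the heap-safety violation described above.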
The author thanks @aam @brianquinlan @dcharkes @kevmoo @liamappelbe @loic-sharma @lrhn @yjbanov for providing feedback on the proposal.