Idle callback duplication in the runtime #8340

toddaaro · 2013-08-06T16:43:42Z

The way the runtime is currently structured every time "something happens" a callback is registered with a libuv event loop that wakes up that loop to deal with it. This could be "work available to steal" or "message sent to scheduler" for example. The problem with these is that we create massive numbers of the things when we really only need one per scheduler. If the scheduler needs to have a callback registered, register the singleton callback for that scheduler. Doing this would eliminate huge amounts of allocation and many many calls out to libuv functions.

toddaaro · 2013-08-12T17:59:10Z

This is the design I have been thinking about over the weekend:

A SchedHandle stops holding a "new" callback to ping. Instead it includes a reference to a singleton idle callback in the scheduler struct.
The Scheduler struct gains a new field, an atomic boolean that represents the "pending callback" state of the scheduler.
The new path to ping the singleton callback first checks this atomic boolean. If there is already an idle callback pending the value will be "true" and the callback ping is skipped. If the value is false, it means that a callback ping needs to be sent. The flag is CASed to true, and the pinger that successfully does CAS does the callback registration.
The first step in run_sched_once is to set the flag to false, as there is no callback pending as the scheduler is active. If the scheduler decides to send itself a callback it uses the same path as (3).

This should be safe, as the CAS on the atomic boolean guarantees that only one thread uses the callback at a time, and the way a scheduler is destructed is by waiting until the handles are done, so the handle can't outlive the callback it points to.

One tricky situation is where a thread can decide to send a callback right as the run_sched_once function starts, observe the flag as 'true', and skip the callback. The scheduler could then swap the flag to 'false' "missing" that attempted callback registration. As long as the scheduler can send itself a callback when there is more work to do this should be a non-issue, but we need to make sure we can answer that question correctly.

I don't think the ABA problem will be an issue here. If it is, we can use an atomic uint instead and do n%2 to find a boolean value. The extra bits just provide a "count" of where we are so ABA issues go away.

This should be very performant, as the absolute minimum amount of work is done. No extra callbacks are registered, and no allocation is required to make a "new" callback. There is one atomic memory operation per callback attempt, one callback object, and callbacks are only pinged when there isn't already one pending.

The major design question is whether or not the flag is required. Libuv might be doing something equivalent internally, making this atomic op redundant. It does hold two roles though, both protecting the single callback object from concurrent use, and also eliminating "duplicate" callbacks. Libuv likely doesn't do both of these things, so it is very likely we should use the flag.

brson · 2013-08-12T18:07:54Z

My comments from IRC:

11:04 @brson toddaaro: this description seems to mix the idle and async callbacks, sugesting that the idle callback is registered from another thread, which can't happen.
SchedHandles use async callbacks to wake up the scheduler remotely, for which you might use atomics to short-circuit, and idle callbacks are for running the
scheduler locally
11:04 -!- ecr [Thunderbir@moz-BBE3ABD.mv.mozilla.com] has joined #rust
11:06 @brson toddaaro: there are two distinct optimizations to make here: 1) stop running the idle callback so much and 2) stop firing async callbacks when they aren't needed

toddaaro · 2013-08-12T18:51:46Z

Paraphrasing some IRC:
toddaaro: while distinct, they have roughly the same purpose so it would be nice to optimize them together somehow
brson: adding atomic synchronization cost to idle callbacks would be pretty bad
toddaaro: I'll ponder and figure out a dance that doesn't

toddaaro · 2013-08-17T03:46:38Z

Closed by PR #8566

toddaaro mentioned this issue Aug 6, 2013

Eliminate blocking on the event loop when possible #8341

Closed

toddaaro closed this as completed Aug 17, 2013

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Idle callback duplication in the runtime #8340

Idle callback duplication in the runtime #8340

toddaaro commented Aug 6, 2013

toddaaro commented Aug 12, 2013

brson commented Aug 12, 2013

toddaaro commented Aug 12, 2013

toddaaro commented Aug 17, 2013

Idle callback duplication in the runtime #8340

Idle callback duplication in the runtime #8340

Comments

toddaaro commented Aug 6, 2013

toddaaro commented Aug 12, 2013

brson commented Aug 12, 2013

toddaaro commented Aug 12, 2013

toddaaro commented Aug 17, 2013