Skip to content

Idle callback duplication in the runtime #8340

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
toddaaro opened this issue Aug 6, 2013 · 4 comments
Closed

Idle callback duplication in the runtime #8340

toddaaro opened this issue Aug 6, 2013 · 4 comments
Labels
A-runtime Area: std's runtime and "pre-main" init for handling backtraces, unwinds, stack overflows

Comments

@toddaaro
Copy link
Contributor

toddaaro commented Aug 6, 2013

The way the runtime is currently structured every time "something happens" a callback is registered with a libuv event loop that wakes up that loop to deal with it. This could be "work available to steal" or "message sent to scheduler" for example. The problem with these is that we create massive numbers of the things when we really only need one per scheduler. If the scheduler needs to have a callback registered, register the singleton callback for that scheduler. Doing this would eliminate huge amounts of allocation and many many calls out to libuv functions.

@toddaaro
Copy link
Contributor Author

This is the design I have been thinking about over the weekend:

  1. A SchedHandle stops holding a "new" callback to ping. Instead it includes a reference to a singleton idle callback in the scheduler struct.

  2. The Scheduler struct gains a new field, an atomic boolean that represents the "pending callback" state of the scheduler.

  3. The new path to ping the singleton callback first checks this atomic boolean. If there is already an idle callback pending the value will be "true" and the callback ping is skipped. If the value is false, it means that a callback ping needs to be sent. The flag is CASed to true, and the pinger that successfully does CAS does the callback registration.

  4. The first step in run_sched_once is to set the flag to false, as there is no callback pending as the scheduler is active. If the scheduler decides to send itself a callback it uses the same path as (3).

This should be safe, as the CAS on the atomic boolean guarantees that only one thread uses the callback at a time, and the way a scheduler is destructed is by waiting until the handles are done, so the handle can't outlive the callback it points to.

One tricky situation is where a thread can decide to send a callback right as the run_sched_once function starts, observe the flag as 'true', and skip the callback. The scheduler could then swap the flag to 'false' "missing" that attempted callback registration. As long as the scheduler can send itself a callback when there is more work to do this should be a non-issue, but we need to make sure we can answer that question correctly.

I don't think the ABA problem will be an issue here. If it is, we can use an atomic uint instead and do n%2 to find a boolean value. The extra bits just provide a "count" of where we are so ABA issues go away.

This should be very performant, as the absolute minimum amount of work is done. No extra callbacks are registered, and no allocation is required to make a "new" callback. There is one atomic memory operation per callback attempt, one callback object, and callbacks are only pinged when there isn't already one pending.

The major design question is whether or not the flag is required. Libuv might be doing something equivalent internally, making this atomic op redundant. It does hold two roles though, both protecting the single callback object from concurrent use, and also eliminating "duplicate" callbacks. Libuv likely doesn't do both of these things, so it is very likely we should use the flag.

@brson
Copy link
Contributor

brson commented Aug 12, 2013

My comments from IRC:

11:04 @brson toddaaro: this description seems to mix the idle and async callbacks, sugesting that the idle callback is registered from another thread, which can't happen.
SchedHandles use async callbacks to wake up the scheduler remotely, for which you might use atomics to short-circuit, and idle callbacks are for running the
scheduler locally
11:04 -!- ecr [Thunderbir@moz-BBE3ABD.mv.mozilla.com] has joined #rust
11:06 @brson toddaaro: there are two distinct optimizations to make here: 1) stop running the idle callback so much and 2) stop firing async callbacks when they aren't needed

@toddaaro
Copy link
Contributor Author

Paraphrasing some IRC:
toddaaro: while distinct, they have roughly the same purpose so it would be nice to optimize them together somehow
brson: adding atomic synchronization cost to idle callbacks would be pretty bad
toddaaro: I'll ponder and figure out a dance that doesn't

@toddaaro
Copy link
Contributor Author

Closed by PR #8566

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-runtime Area: std's runtime and "pre-main" init for handling backtraces, unwinds, stack overflows
Projects
None yet
Development

No branches or pull requests

2 participants