Tracking Issue for LocalWaker #118959

Open
1 of 4 tasks
tvallotton opened this issue Dec 15, 2023 · 24 comments
Labels

  • C-tracking-issue (Category: An issue tracking the progress of sth. like the implementation of an RFC)
  • T-libs-api (Relevant to the library API team, which will review and decide on the PR/issue.)
  • WG-async (Working group: Async & await)

Comments

@tvallotton
Contributor

tvallotton commented Dec 15, 2023

Feature gate: #![feature(local_waker)]

This is a tracking issue for support for local wakers on Context. This allows libraries to hold non-thread-safe data in their wakers, guaranteeing at compile time that the wakers will not be sent across threads. It includes a ContextBuilder type for building contexts.

Public API

impl Context {
    fn local_waker(&self) -> &LocalWaker;
}

impl<'a> ContextBuilder<'a> {
    fn from_waker(waker: &'a Waker) -> ContextBuilder<'a>;
    fn waker(self, waker: &'a Waker) -> ContextBuilder<'a>;
    fn local_waker(self, local_waker: &'a LocalWaker) -> ContextBuilder<'a>;
    fn build(self) -> Context;
}

impl From<&mut Context> for ContextBuilder;

pub trait LocalWake {
    fn wake(self: Rc<Self>);
}

Steps / History

Unresolved Questions

  • Should runtimes be allowed to not define a waker?

Relevant links

Footnotes

  1. https://std-dev-guide.rust-lang.org/feature-lifecycle/stabilization.html

@tvallotton tvallotton added C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Dec 15, 2023
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Feb 5, 2024
…ulacrum

Add LocalWaker and ContextBuilder types to core, and LocalWake trait to alloc.

Implementation for  rust-lang#118959.
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Feb 5, 2024
Rollup merge of rust-lang#118960 - tvallotton:local_waker, r=Mark-Simulacrum

Add LocalWaker and ContextBuilder types to core, and LocalWake trait to alloc.

Implementation for  rust-lang#118959.
@tvallotton
Contributor Author

tvallotton commented Feb 12, 2024

@zzwxh

> If I don't intend to support cross-thread wake-ups, it's best not to force me to provide a dummy Waker that pretends to support cross-thread wake-ups.

I agree, it would be best if users could panic on the call to cx.waker() rather than in the call to waker.wake().

> ContextBuilder should allow the asynchronous runtime to explicitly specify which type of Waker it supports (or both), and the Future should have the ability to retrieve this information.

The feature used to work exactly like this, offering a ContextBuilder::from_local_waker method, and a Context::try_waker method. However, too many concerns were raised about compatibility, and I decided that this was not a hill I was willing to die on. Note that it is still possible to reintroduce these methods in the future without breaking changes.

@Thomasdezeeuw
Contributor

> Context::try_waker does not introduce breaking changes, and I don't understand what concerns anyone has. We just need to deprecate Context::from_raw and Context::waker and use the new API instead.

I'm not sure you know what you're saying here. "Just" deprecating a crucial API for all Futures is a bad idea. It will break all existing Futures and create massive churn.

@tvallotton
Contributor Author

@zzwxh

> Context::try_waker does not introduce breaking changes

Yes, this is what I said.

> We just need to deprecate Context::from_raw and Context::waker and use the new API instead.

No, Context::waker and Context::from_waker are still useful and should not be deprecated, even if we decide to support optional wakers.

@Thomasdezeeuw
Contributor

@zzwxh it's not ok that you deleted your comment. It's hard to read back a discussion that way.

bors pushed a commit to rust-lang-ci/rust that referenced this issue Feb 18, 2024
…ulacrum

Add LocalWaker and ContextBuilder types to core, and LocalWake trait to alloc.

Implementation for  rust-lang#118959.
jhpratt added a commit to jhpratt/rust that referenced this issue Apr 3, 2024
Add `Context::ext`

This change enables `Context` to carry arbitrary extension data via a single `&mut dyn Any` field.

```rust
#![feature(context_ext)]

impl Context<'_> {
    fn ext(&mut self) -> &mut dyn Any;
}

impl<'a> ContextBuilder<'a> {
    fn ext(self, data: &'a mut dyn Any) -> Self;

    fn from(cx: &'a mut Context<'_>) -> Self;
    fn waker(self, waker: &'a Waker) -> Self;
}
```

Basic usage:

```rust
struct MyExtensionData {
    executor_name: String,
}

let mut ext = MyExtensionData {
    executor_name: "foo".to_string(),
};

let mut cx = ContextBuilder::from_waker(&waker).ext(&mut ext).build();

if let Some(ext) = cx.ext().downcast_mut::<MyExtensionData>() {
    println!("{}", ext.executor_name);
}
```

Currently, `Context` only carries a `Waker`, but there is interest in having it carry other kinds of data. Examples include [LocalWaker](rust-lang#118959), [a reactor interface](rust-lang/libs-team#347), and [multiple arbitrary values by type](https://docs.rs/context-rs/latest/context_rs/). There is also a general practice in the ecosystem of sharing data between executors and futures via thread-locals or globals that would arguably be better shared via `Context`, if it were possible.

The `ext` field would provide a low friction (to stabilization) solution to enable experimentation. It would enable experimenting with what kinds of data we want to carry as well as with what data structures we may want to use to carry such data.

Dedicated fields for specific kinds of data could still be added directly on `Context` when we have sufficient experience or understanding about the problem they are solving, such as with `LocalWaker`. The `ext` field would be for data for which we don't have such experience or understanding, and that could be graduated to dedicated fields once proven.

Both the provider and consumer of the extension data must be aware of the concrete type behind the `Any`. This means it is not possible for the field to carry an abstract interface. However, the field can carry a concrete type which in turn carries an interface. There are different ways one can imagine an interface-carrying concrete type to work, hence the benefit of being able to experiment with such data structures.
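To illustrate the point about concrete types, here is a minimal sketch on stable Rust that does not use `Context` at all. It only shows the `Any` mechanics: the `Reactor` trait, `EpollReactor`, `Extensions`, and `reactor_name` are hypothetical names invented for this example, and the interface is boxed purely for simplicity.

```rust
use std::any::Any;

// Hypothetical interface an executor might want to expose to futures.
trait Reactor {
    fn name(&self) -> String;
}

struct EpollReactor;

impl Reactor for EpollReactor {
    fn name(&self) -> String {
        "epoll".to_string()
    }
}

// Concrete, 'static carrier type: this is what sits behind the `dyn Any`.
// The abstract interface lives inside it.
struct Extensions {
    reactor: Box<dyn Reactor>,
}

// Consumer side: the downcast only succeeds if the consumer names the
// same concrete type that the provider stored.
fn reactor_name(ext: &mut dyn Any) -> Option<String> {
    ext.downcast_mut::<Extensions>().map(|e| e.reactor.name())
}

fn main() {
    let mut ext = Extensions { reactor: Box::new(EpollReactor) };
    assert_eq!(reactor_name(&mut ext), Some("epoll".to_string()));

    // A consumer that is handed some other type simply gets `None`.
    let mut other = 42u32;
    assert_eq!(reactor_name(&mut other), None);
}
```

A mismatch between provider and consumer shows up as `None` at runtime rather than as a compile error, which is the trade-off of carrying the data as `Any`.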

## Passing interfaces

Interfaces can be placed in a concrete type, such as a struct, and then that type can be cast to `Any`. However, one gotcha is that `Any` cannot contain non-static references. This means one cannot simply do:

```rust
struct Extensions<'a> {
    interface1: &'a mut dyn Trait1,
    interface2: &'a mut dyn Trait2,
}

let mut ext = Extensions {
    interface1: &mut impl1,
    interface2: &mut impl2,
};

// Does not compile: `Any` requires `'static`, but `Extensions<'a>` borrows `impl1` and `impl2`.
let ext: &mut dyn Any = &mut ext;
```

To work around this without boxing, unsafe code can be used to create a safe projection using accessors. For example:

```rust
use std::{any::Any, mem::MaybeUninit};

pub struct Extensions {
    interface1: *mut dyn Trait1,
    interface2: *mut dyn Trait2,
}

impl Extensions {
    pub fn new<'a>(
        interface1: &'a mut (dyn Trait1 + 'static),
        interface2: &'a mut (dyn Trait2 + 'static),
        scratch: &'a mut MaybeUninit<Self>,
    ) -> &'a mut Self {
        scratch.write(Self {
            interface1,
            interface2,
        })
    }

    pub fn interface1(&mut self) -> &mut dyn Trait1 {
        unsafe { self.interface1.as_mut().unwrap() }
    }

    pub fn interface2(&mut self) -> &mut dyn Trait2 {
        unsafe { self.interface2.as_mut().unwrap() }
    }
}

let mut scratch = MaybeUninit::uninit();
let ext: &mut Extensions = Extensions::new(&mut impl1, &mut impl2, &mut scratch);

// ext can now be cast to `&mut dyn Any` and back, and used safely
let ext: &mut dyn Any = ext;
```

## Context inheritance

Sometimes when futures poll other futures they want to provide their own `Waker` which requires creating their own `Context`. Unfortunately, polling sub-futures with a fresh `Context` means any properties on the original `Context` won't get propagated along to the sub-futures. To help with this, some additional methods are added to `ContextBuilder`.

Here's how to derive a new `Context` from another, overriding only the `Waker`:

```rust
let mut cx = ContextBuilder::from(parent_cx).waker(&new_waker).build();
```
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Apr 3, 2024
Rollup merge of rust-lang#123203 - jkarneges:context-ext, r=Amanieu

Add `Context::ext`

github-actions bot pushed a commit to rust-lang/miri that referenced this issue Apr 3, 2024
Add `Context::ext`

@traviscross traviscross added the WG-async Working group: Async & await label Apr 8, 2024
@raftario

raftario commented Sep 9, 2024

It would be nice to start work towards stabilisation for this feature as it currently exists (an opt-in performance optimisation) and move the discussion about optional wakers to its own issue.

@tvallotton
Contributor Author

tvallotton commented Oct 16, 2024

@raftario Do you know if there are any users of this feature at the moment, or if there are crate authors waiting for this feature to be stabilized? I don't think stabilization can be justified without any users.

@raskyld

raskyld commented Oct 19, 2024

While it's not a hard blocker for me at this time, I am really interested in this feature!

I am working on a runtime for wasm32-wasip2 which is, in essence, a single-threaded environment.
I have no use for Sync primitives (anyway, the implementation of the standard library for this target replaces them with their single-threaded counterparts, if I am right), but I am sometimes forced to use them because of APIs that assume everyone runs in a multi-threaded environment.

I am wrapping a LocalPool in my implementation, but to respect the safety contract of RawWaker, I need to make the data point to thread-safe values even though I know wasm32-wasip2 is single-threaded, so, by definition, there isn't "another" thread. Having LocalWaker solves the issue.

However, I am not sure I understand how you would build a local-only Context since, in the builder, you would need to provide a fallback Waker in from_waker and then set the LocalWaker. You would probably use a noop for the thread-safe Waker but use an actual RawWaker with single-threaded data for the LocalWaker. That's really weird but I guess it's fine since you expect the implementor of the Executor to know whether to use Context::waker or Context::local_waker right?

@raftario

I'm also personally interested in this as I've been working on a thread-per-core runtime where tasks never leave the thread that spawned them, and it would be much easier if I didn't have to worry about task wakers being sent to other threads, even when none of the futures the runtime itself provides need it. I can also imagine some of the existing thread-per-core runtimes could use this feature quite extensively for similar reasons.

@kpreid
Contributor

kpreid commented Oct 20, 2024

> However, I am not sure I understand how you would build a local-only Context since, in the builder, you would need to provide a fallback Waker in from_waker and then set the LocalWaker. You would probably use a noop for the thread-safe Waker but use an actual RawWaker with single-threaded data for the LocalWaker. That's really weird but I guess it's fine since you expect the implementor of the Executor to know whether to use Context::waker or Context::local_waker right?

It’s not fine, because the implementor of the executor is not necessarily the implementor of all futures that need to obtain a waker; in particular, these types of futures will be broken:

  • future combinators which create their own wakers to more precisely handle polling (e.g. FuturesUnordered/FuturesOrdered/JoinAll) that wrap the provided waker
  • channel receiver futures
  • and, in general, any other leaf futures that are not specialized for the executor in question

In order for these situations to be detected instead of turning into lost wakeup bugs, you would have to write a Waker that actually panics when used, not a noop waker. And even then, you're locked out of using channels etc.
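As a rough illustration of the panicking `Waker` described above, here is a minimal sketch using only the stable `std::task::Wake` trait. The names `PanickingWake` and `panicking_waker` are made up for this example; a real executor would pick its own message and might prefer to abort or log instead.

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};
use std::sync::Arc;
use std::task::{Wake, Waker};

// Hypothetical waker for an executor that only supports local wake-ups.
struct PanickingWake;

impl Wake for PanickingWake {
    fn wake(self: Arc<Self>) {
        // Fail loudly instead of silently losing the wake-up.
        panic!("this executor does not support waking through `Waker`");
    }
}

fn panicking_waker() -> Waker {
    Waker::from(Arc::new(PanickingWake))
}

fn main() {
    let waker = panicking_waker();

    // Cloning and dropping are harmless; only an actual wake-up panics.
    let clone = waker.clone();
    drop(waker);

    let result = catch_unwind(AssertUnwindSafe(move || clone.wake()));
    assert!(result.is_err());
}
```

Note that the panic happens at the `wake()` call, possibly on another thread and long after the future stored the waker, which is why this only detects the incompatibility rather than preventing it.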

Personally, I think that LocalWaker isn't going to be useful (except as an optimization) until such time as there’s a way to create a Context that tells its users it has only a LocalWaker, or that the LocalWaker should be preferred (which, I suppose, could be done using a common helper library and the Context::ext() feature). Without one of those, there’s no way to tell all of the above types of futures that they need to switch to LocalWaker. I don't think it makes sense to stabilize LocalWaker until a solution to this problem is chosen, because the solution might affect what the Context/ContextBuilder methods related to LocalWaker look like.

@raskyld
Copy link

raskyld commented Oct 20, 2024

That's what I thought... thanks for clarifying!

I share your opinion: we shouldn't stabilise the feature as it stands today. That also means we would require current users of Context (and Future, which is kind of pervasive) to change their control flow to account for the possibility of a LocalWaker.

I think putting Send and Sync constraints on Waker was premature optimisation. Now we have an abstraction that costs us more overhead than it should. But I guess we are at a kind of dead end...

@raftario

I'm personally of the opposite opinion; I don't think this feature will ever have a path to stabilization as anything more than an opt-in optimisation.

Let's start with the proposition to make Context::waker return an Option<Waker> and assume it was realistic to update every single implementation of Future in the ecosystem to take it into account. Even then this approach would have very little advantage over just providing a panicking Waker.

Any future which can, at runtime, use either a Waker or a LocalWaker is going to be !Send + !Sync. So any such future has no use matching on the Option<Waker> and should go straight for the LocalWaker.

On the other hand, a future that does need the Send + Sync bounds - which in practice is the vast majority of futures in the ecosystem, since most people use Tokio and by virtue of having a work-stealing executor Tokio requires Send + Sync - also has little use for an Option<Waker>, because it will always fail on None. The only advantage over a panicking waker here is that the future implementation can propagate the error itself. I'd argue this is not worth it at all, because it introduces a branch in every future implementation and in practice most implementations would probably also just panic.

The other option is to add a generic parameter to Context. Adding a generic parameter to the Future trait has already been discussed in this RFC. Such a change would actually fully remove the need for this feature in the first place.

So unless I'm missing a secret third option this leaves us with the current API. Existing executors can either continue to provide only a Waker or add an opt-in LocalWaker as an optimisation. Existing futures which are Send + Sync continue using a Waker. Existing futures which are !Send + !Sync can migrate to LocalWaker.

New executors can choose to only provide a working LocalWaker and have their Waker panic. As @kpreid mentioned, this would make the executor incompatible with pretty much every Send + Sync future. I would personally argue that this is perfectly fine. There's a lot of precedent for this in the async ecosystem, with each runtime providing its own I/O futures which are incompatible with every other runtime. I'd argue this incompatibility would actually be a lot less annoying than the I/O situation, since LocalWaker-only executors would most likely be targeted at WASM and embedded, and both ecosystems are already used to dealing with incompatibilities and have their own crates for a lot of stuff. Having specialised !Send + !Sync futures also makes a whole lot of sense, considering that the implementation of things like channels becomes a whole lot simpler when they don't have to be thread-safe.

@tvallotton
Contributor Author

I'm with @raftario here. The feature as it stands can be used as a mere optimization, or as a replacement for Waker. It is up to the crate author to decide how to use the feature, and both uses are valid.

LocalWaker-only executors will always be incompatible with the rest of the ecosystem, regardless of what kind of API we offer. So I don't think we should worry too much about that use case.

@yaroslavros

This feature would be very useful for io_uring based runtimes that are strongly biased towards thread-per-core and benefit from !Send + !Sync futures.

It would be great to stabilise it sooner rather than later as an optimisation of, or replacement for, a Waker, and let runtimes decide whether they want to focus just on the thread-per-core model or support both kinds of wakers/futures and maintain the necessary compatibility layers depending on the use cases.

@drewcrawford
Contributor

Driving by as the mythical user of this API, writing an executor which would use the API, writing a code comment about why I'm doing something else, so that I have a link to link to.

I mostly agree with @raftario's view. IMO at this point it makes sense for "Rust generally" to pursue multithreaded async for wide compatibility (futures that are Send, Wakers that are Send/Sync, etc.) and to choose designs that basically don't break existing code. But it also makes sense for "minority Rust" to pursue a local-only flavor, and I think the lack of that as an option in the ecosystem has a real impact.

By analogy, "Rust generally" is std, while "minority Rust" is written to no_std (or restricts its use of std to only certain topics). And the analogy goes pretty deep: no_std exists in a context where spawning another thread may not even be possible, meanwhile at the same time you are targeting low-power embedded hardware...

> Personally, I think that LocalWaker isn't going to be useful (except as an optimization)

Personally, I come from the opposite presumption. If we consider a platform without threading, from this point of view it is the threadsafe abstractions which are not useful. Of course they are useful for "general Rust". However considering our (non-general) platform they solve a "problem" which is imaginary because we have no threads and no race conditions can ever occur. Meanwhile the cost of those abstractions, both in complexity for the programmer and at runtime, is not imaginary. Meanwhile we are also trying to code to low-performance hardware, and so optimization is less a parenthetical remark and more of a hard requirement of a systems language.

> until such time as there’s a way to create a Context that tells its users it has only a LocalWaker, or that the LocalWaker should be preferred

So, my main opinion is this: most of the real-world code I write has these properties:

  • there's a fast way to do it by leveraging closely-held assumptions
  • there's also a compatible way to do it by not assuming those assumptions

Application A: when I implement a Future, there may be a fast implementation that assumes I'm executing on the right thread (Future: !Send). And a compatible implementation that makes no such assumption (Future: Send). Depending on my requirements I may write both implementations, offer both APIs and return both distinct Future types. And so then the fast Future:!Send type would leverage LocalWaker (maybe exclusively, or maybe falling back to Waker if LocalWaker isn't available). And the compatible Future:Send type would leverage Waker (maybe exclusively, or maybe trying LocalWaker first, on the theory the executor has some fast-vs-compatible tradeoff that is outside my view when I am implementing the Future).

Application B: Meanwhile when I write an executor, there may be a fast way that assumes I'm on the right thread (LocalWaker). And a compatible way that makes no such assumption (Waker). Unlike some cases discussed where it is thought an executor would ship a panicking Waker, in practice I am likely to ship two implementations, where one is fast and one is compatible. Today in stable Rust, I can only ship the compatible one.

> (which, I suppose, could be done using a common helper library and the Context::ext() feature). Without one of those, there’s no way to tell all of the above types of futures that they need to switch to LocalWaker.

Something I have not seen discussed is that we have the same problem about which waker to use inside the stable Waker API, just with instances instead of types. That is: it might be fine to re-use an old Waker instance (if it will_wake the same value) or the Waker might be stale and we have to use a different one. Similar to the question of whether to use LocalWaker vs Waker, the question of whether to use a particular Waker instance vs Waker is annoying to deal with for people "just" writing a future, so in practice their use of wakers is intermediated by a crate such as atomic-waker, which encodes runtime checks for which Waker to use. These checks are used by 38 downstream dependencies, and many more indirect ones.
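For readers unfamiliar with the stale-waker check mentioned here, the following is a minimal sketch of the idea using the stable `Waker::will_wake` method. It is not how atomic-waker is implemented; `WakerSlot` and `counting_waker` are hypothetical helpers written for this example, and the hand-rolled `RawWaker` exists only so that the result of `will_wake` is predictable.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::task::{RawWaker, RawWakerVTable, Waker};

// One shared vtable, so that wakers for the same counter compare equal.
static VTABLE: RawWakerVTable = RawWakerVTable::new(clone_raw, wake_raw, wake_raw, drop_raw);

unsafe fn clone_raw(data: *const ()) -> RawWaker {
    RawWaker::new(data, &VTABLE)
}

unsafe fn wake_raw(data: *const ()) {
    // SAFETY: `data` always originates from a `&'static AtomicUsize`.
    let counter = unsafe { &*(data as *const AtomicUsize) };
    counter.fetch_add(1, Ordering::SeqCst);
}

unsafe fn drop_raw(_data: *const ()) {}

// Test waker: each wake-up increments the given counter.
fn counting_waker(counter: &'static AtomicUsize) -> Waker {
    let raw = RawWaker::new(counter as *const AtomicUsize as *const (), &VTABLE);
    // SAFETY: the data is 'static and atomic, so the vtable functions are
    // thread-safe and never touch freed memory.
    unsafe { Waker::from_raw(raw) }
}

// Hypothetical helper that remembers the waker of the latest poll.
#[derive(Default)]
struct WakerSlot {
    stored: Option<Waker>,
}

impl WakerSlot {
    // Returns true if a new waker had to be cloned and stored.
    fn register(&mut self, current: &Waker) -> bool {
        let reusable = self.stored.as_ref().is_some_and(|old| old.will_wake(current));
        if !reusable {
            self.stored = Some(current.clone());
        }
        !reusable
    }

    fn wake(&mut self) {
        if let Some(waker) = self.stored.take() {
            waker.wake();
        }
    }
}

fn main() {
    static TASK_A: AtomicUsize = AtomicUsize::new(0);
    static TASK_B: AtomicUsize = AtomicUsize::new(0);
    let (waker_a, waker_b) = (counting_waker(&TASK_A), counting_waker(&TASK_B));

    let mut slot = WakerSlot::default();
    assert!(slot.register(&waker_a)); // nothing stored yet: clone and store
    assert!(!slot.register(&waker_a)); // same task: the stored waker is reused
    assert!(slot.register(&waker_b)); // different task: the stale waker is replaced

    slot.wake();
    assert_eq!(TASK_A.load(Ordering::SeqCst), 0);
    assert_eq!(TASK_B.load(Ordering::SeqCst), 1);
}
```

`will_wake` is documented as best-effort, so a `false` result only costs an extra clone; a helper like this must never rely on it returning `true`.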

So to my mind, the obvious short-term consequence of shipping this API is that it will be adopted as part of a runtime check in exactly two places: 1) atomic-waker, where we encode the canonical runtime checks for Send futures; 2) some other crate, let's call it unatomic-waker, where we encode the canonical runtime checks for !Send futures. And so the rubric for this is not whether the long tail of Future implementations will adopt this as an optimization for a minor performance improvement, which seems hard. Instead the rubric is: will these two particular crates adopt the API, which by itself improves Rust performance generally.

And then in the long term, the two crates are effectively some kind of consensus as to how people actually write futures and the API they actually want, and then that consensus can be merged into the stdlib, raising the tide for all boats, etc. Whereas at the moment, the only consensus that can be formed in stable Rust is along the lines of Waker and multithreaded executors.

@Dirbaio
Contributor

Dirbaio commented Nov 15, 2024

> By analogy, "Rust generally" is std, while "minority Rust" is written to no_std (or restricts its use of std to only certain topics). And the analogy goes pretty deep: no_std exists in a context where spawning another thread may not even be possible, meanwhile at the same time you are targeting low-power embedded hardware...

Maintainer of Embassy here.

In embedded devices, you wake wakers from interrupts. It's true that in bare-metal embedded you don't have threads, but interrupts behave sort of like threads: you need Send to send data between main and interrupts. (The reason is that when an interrupt fires, the hardware will "hijack" the only thread of execution and make it jump to the interrupt handler, and this "hijacking" can happen at any time, at any instruction. Interrupts are very similar to Unix signals.)

So, any Future implementation that does IO with the hardware (most of them) can't use LocalWaker, it'll still have to use Waker.

Also, with today's embassy-executor design, waking a LocalWaker would have the same perf as Waker, because it'd have to take a critical section to update the task queue anyway to avoid races because the Send Waker still exists. It wouldn't make a difference to cloning and dropping wakers either, because these do nothing (there's no refcounting, tasks are statically allocated).

So I don't see any benefit of LocalWaker for Embassy.

I'm also a bit concerned that it's possible to create a Context with only LocalWaker, not Waker. This breaks today's Future contract, and will split the ecosystem in two (executors that support cross-thread waking, executors that don't) in an especially bad way, because you get runtime panics if you mix incompatible executors and futures; it's not checked at compile time.

@kpreid
Contributor

kpreid commented Nov 15, 2024

> I'm also a bit concerned that it's possible to create a Context with only LocalWaker, not Waker.

Currently, it is not possible to construct a Context without a Waker; the LocalWaker is strictly optional, not an alternative.

(I agree that this is good in that it avoids splitting the ecosystem — but it is also bad in that pure platform-independent no_std code still cannot put together an executor and futures for its own non-interrupt-IO purposes, without creating a nonfunctional Waker.)

@Dirbaio
Contributor

Dirbaio commented Nov 15, 2024

> Currently, it is not possible to construct a Context without a Waker

Ah that's good to know! I was going from OP, which still shows fn from_local_waker(self) -> Self; in the proposed API...

@Darksonn
Contributor

Every time I hear "just use X when you need Send and Y when you need !Send", alarm bells go off in my head. I think it's a red flag for bad API design. You should usually not create two separate tools to handle <T: Trait> and <T: !Trait> because they usually do not compose well, and usually do not give a solution to <T>. I've seen it in several places:

  • This is why it's so important to support RTN for async traits. "Just create two traits" is not workable.
  • The MinPin proposal was criticized for similar reasons, but with the Unpin trait.
  • There are other cases I can't think of right now.

This proposal sets off the same alarm bells for me.

@dilr

dilr commented Dec 20, 2024

Admittedly, I haven't tried using the API, but from what I understand from looking at its source code, Waker and LocalWaker should compose pretty well.

If a Context is only built with a &Waker, then it will return that same &Waker cast to a &LocalWaker if context.local_waker() is called. However, if Context is built with an additional &LocalWaker reference, then local_waker() will return that reference instead. Since constructing a Context always requires a &Waker, it is always possible to call both waker() and local_waker() on any Context.

This means that once this API is stabilized, pretty much all futures which directly call or store a waker can use local_waker() if they don't intend to send the waker to another thread, or waker() if they do intend to send to another thread. This will always work regardless of how the Context was constructed, as explained in the previous paragraph.

Thus there is an optimization if both the executor provides a &LocalWaker and the future calls local_waker(). However, if either of these things is not true, then the system as a whole just falls back to using a regular Waker. To me it seems like the system doesn't have any significant downsides in terms of code flexibility.
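The fallback behaviour described above can be modelled on stable Rust. This is only a simplified model and not the std implementation: `ModelContext`, `SendWaker`, `LocalOnlyWaker` and `LocalRef` are hypothetical stand-ins, and plain counters stand in for real wake-ups.

```rust
use std::cell::Cell;
use std::rc::Rc;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Stand-in for a thread-safe waker: wake-ups are counted atomically.
struct SendWaker(Arc<AtomicUsize>);

/// Stand-in for a local waker: `Rc<Cell<_>>` makes it `!Send`.
struct LocalOnlyWaker(Rc<Cell<usize>>);

/// What `local_waker()` hands out in this model.
enum LocalRef<'a> {
    /// The executor supplied a dedicated local waker.
    Local(&'a LocalOnlyWaker),
    /// No local waker was supplied, so the thread-safe one is used instead.
    Fallback(&'a SendWaker),
}

impl LocalRef<'_> {
    fn wake_by_ref(&self) {
        match self {
            LocalRef::Local(w) => w.0.set(w.0.get() + 1),
            LocalRef::Fallback(w) => {
                w.0.fetch_add(1, Ordering::SeqCst);
            }
        }
    }
}

/// Model context: the thread-safe waker is mandatory, the local one optional.
struct ModelContext<'a> {
    waker: &'a SendWaker,
    local: Option<&'a LocalOnlyWaker>,
}

impl<'a> ModelContext<'a> {
    fn from_waker(waker: &'a SendWaker) -> Self {
        Self { waker, local: None }
    }

    fn with_local_waker(mut self, local: &'a LocalOnlyWaker) -> Self {
        self.local = Some(local);
        self
    }

    /// Always available, however the context was built.
    fn waker(&self) -> &'a SendWaker {
        self.waker
    }

    /// Prefers the dedicated local waker and falls back to the shared one.
    fn local_waker(&self) -> LocalRef<'a> {
        match self.local {
            Some(local) => LocalRef::Local(local),
            None => LocalRef::Fallback(self.waker),
        }
    }
}

/// A "future" that never sends its waker elsewhere asks for the local one.
fn poll_like(cx: &ModelContext<'_>) {
    cx.local_waker().wake_by_ref();
}
```

In this model `poll_like` works with either kind of context. Only the cost of the wake-up differs, which is the composition property argued for above.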

EDIT: That being said, the only time I can see this optimization kicking in is if one was using an asynchronous API provided by the OS (such as io-uring or IOCP). As @Dirbaio pointed out, LocalWaker isn't all that useful for embedded devices even though they may be single threaded.

@Thomasdezeeuw
Contributor

@dilr the problem, in my opinion, isn't in using a LocalWaker vs Waker directly; as you say, that's solved pretty well. The problem is in storing it.

For example, consider a generic Future wrapping type that needs to store a Waker or LocalWaker. Here the generic type needs to make a decision: does the type need to be Send? Since it's a generic type it's hard to know; that's really for the caller/user to decide. Which means you have to make a decision, which in practice will most likely be Waker, because it has been stable longer and allows the type to be Send. Alternatively, you can have type X and SyncX or LocalX to provide sync and non-sync versions of your generic type. Personally I haven't found a satisfactory solution to this yet.
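A small stable-Rust sketch of the storage problem, under stated assumptions: `Wrapper` and `LocalWrapper` are hypothetical types, and `Rc<()>` stands in for a stored LocalWaker, since both are `!Send`.

```rust
use std::rc::Rc;
use std::task::Waker;

/// Compiles only when `T: Send`; returns `true` so it can also be asserted on.
fn is_send<T: Send>() -> bool {
    true
}

/// Hypothetical wrapper that stores the thread-safe `Waker`.
#[allow(dead_code)]
struct Wrapper<F> {
    inner: F,
    waker: Option<Waker>,
}

/// Hypothetical local variant. `Rc<()>` is a stand-in for a stored
/// `LocalWaker`: like `LocalWaker`, it is `!Send`.
#[allow(dead_code)]
struct LocalWrapper<F> {
    inner: F,
    waker: Option<Rc<()>>,
}

/// `Wrapper<F>` is `Send` whenever `F` is.
fn wrapper_is_send() -> bool {
    is_send::<Wrapper<u32>>()
}

// The local variant is never `Send`, whatever `F` is. Uncommenting this
// fails to compile because `Rc<()>` cannot be sent between threads safely:
//
// fn local_wrapper_is_send() -> bool {
//     is_send::<LocalWrapper<u32>>()
// }
```

The choice is baked into the struct definition, so a generic wrapper cannot defer it to the caller without either duplicating the type or adding a type parameter for the stored waker.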

@dilr

dilr commented Dec 21, 2024

This argument generally makes sense, but in practice I don't think it will be an issue. I'm glad to be proven wrong though. Here's why I don't think there is a problem:

As I see it, there are roughly 3 types of futures people write. The first is normal async fn application code. These don't store Wakers directly and simply pass them to their children. The second is future combining functions/macros such as join! or select!. These also do not store their own Wakers, and just pass the context they receive to their children. The third type of future is the raw IO or spawning futures such as fs::read(..) or runtime::spawn(..) (both from tokio). These third types do need to store a Waker to tell the executor when they are done with their task. However, this third type is usually attached to a particular implementation, and so knows if the Waker will need to travel across threads in order to be useful. I'm sure there are exceptions, but I think most of this third type of future will be able to make a concrete decision about how it is used. The exceptions can continue to use the normal context and be slightly slower as a result.

Looking at the two examples I gave for tokio: tokio's runtime for file IO uses a threadpool, so fs::read(..) would need a Waker. Meanwhile, tokio's executor is a multithreaded work stealing queue, so runtime::spawn(..) would also need a Waker. Technically tokio can run as a single threaded executor as well (and provides a spawn_local function if within that executor), but tokio just isn't optimized for that case. They only seem to provide it to allow you to run !Send futures.

EDIT: I think I see what you mean. The future produced by join! for instance would only implement Send if its children do, and storing a LocalWaker inside a child would make that child !Send. This propagates back up to the executor or spawn function. This isn't a problem for the join! future, as it shouldn't care about its own Sendness. However, it would make it hard for IO futures to store LocalWakers if they don't know whether their executor is local.

@kpreid
Contributor

kpreid commented Dec 21, 2024

> The second is future combining functions/macros such as join! or select!. These also do not store their own Wakers, and just pass the context they receive to their children.

Some future combinators like FuturesUnordered create their own wakers, which then delegate to stored wakers taken from the Context. Those would need to know when to store and delegate to a LocalWaker instead. Also, there is a fourth important class of futures: channels, and things like them. Like IO futures, the sending side of a channel must store a waker, but unlike IO futures, they do not need to be aware of anything about the executor implementation — in fact, an important use case for async channels is to bridge between multiple executors or to non-async code with no executor. (I already listed these cases previously in the discussion.)
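As an illustration of the channel case, here is a minimal one-shot hand-off on stable Rust. All names (`Shared`, `Recv`, `send`, `ThreadWaker`, `block_on`, `demo`) are hypothetical and this is not any particular crate's design. The waker is stored in shared state and woken from a different thread, which is why it has to be a thread-safe Waker here.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};
use std::thread;

/// State shared between the sending and receiving sides.
struct Shared<T> {
    value: Option<T>,
    waker: Option<Waker>,
}

/// Receiving future: resolves once a value has been sent.
struct Recv<T>(Arc<Mutex<Shared<T>>>);

impl<T> Future for Recv<T> {
    type Output = T;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<T> {
        let mut shared = self.0.lock().unwrap();
        match shared.value.take() {
            Some(value) => Poll::Ready(value),
            None => {
                // The sender may run on another thread, so the stored waker
                // has to be the thread-safe `Waker`.
                shared.waker = Some(cx.waker().clone());
                Poll::Pending
            }
        }
    }
}

/// Sending side: publishes the value, then wakes the receiver if it is waiting.
fn send<T>(shared: &Mutex<Shared<T>>, value: T) {
    let waker = {
        let mut guard = shared.lock().unwrap();
        guard.value = Some(value);
        guard.waker.take()
    };
    if let Some(waker) = waker {
        waker.wake();
    }
}

/// Waker that unparks the thread running `block_on`.
struct ThreadWaker(thread::Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Tiny single-future executor: polls, then parks until woken.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = std::pin::pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            Poll::Pending => thread::park(),
        }
    }
}

/// Sends a value from a second thread and receives it on the calling thread.
fn demo() -> u32 {
    let shared = Arc::new(Mutex::new(Shared { value: None, waker: None }));
    let tx = Arc::clone(&shared);
    let sender = thread::spawn(move || send(&tx, 7));
    let got = block_on(Recv(shared));
    sender.join().unwrap();
    got
}
```

Nothing in `Recv` or `send` knows which executor is polling. A channel that wanted to use a LocalWaker when both ends share a thread would need some way to learn that fact.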

@Thomasdezeeuw
Contributor

Another concrete example of storing Wakers is my A10 crate. It uses Futures to drive async I/O, which store Wakers to wake them up once the I/O operation is complete. LocalWaker would work just as well, but would make all the types !Send.

@tvallotton
Contributor Author

> For example consider a generic Future wrapping type that needs to store a Waker or LocalWaker. Here the generic type needs to make a decision: does the type need to be Send?

Yes, if you want it to be compatible with the entirety of the ecosystem, then it needs to be Send. This is fine, we don't always need to use a local_waker when one is available. Local wakers are an optimization, and like most optimizations, they aren't always applicable.
