Custom Future contexts #2900

Open
Diggsey opened this issue Apr 5, 2020 · 17 comments

@Diggsey
Contributor

Diggsey commented Apr 5, 2020

Currently, the Context type passed into a future is fixed. However, I think there would be a lot of value in either allowing custom context types, or allowing some way to access custom data on a Context.

Actix

Actix has a model where each actor in the system acts like its own mini executor. You can spawn futures onto an actor's executor, and those futures have mutable access to the actor's state. This is essential to the way the actor model works. However, this is not achievable with standard futures, and so actix defines its own ActorFuture trait.

The only difference is that the poll method takes additional mutable references to both the actor and its executor context. If these pieces of information were accessible somehow from the normal future context then actix would not need a custom future trait. Furthermore, it could even benefit from async/await syntax with a minor extension.
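For illustration, the shape described above can be sketched roughly as follows. This is a simplified stand-in and not actix's actual definitions: Actor, ActorFuture, Counter, CounterContext, Increment and run_increment are all hypothetical names, and the no-op waker exists only so the example can be polled by hand.

```rust
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical, simplified actor trait (not actix's real definition).
trait Actor: Sized {
    /// The per-actor executor context.
    type Context;
}

/// Same shape as `Future::poll`, plus mutable access to the actor and its
/// executor context.
trait ActorFuture<A: Actor> {
    type Output;

    fn poll(
        self: Pin<&mut Self>,
        actor: &mut A,
        actor_cx: &mut A::Context,
        task_cx: &mut Context<'_>,
    ) -> Poll<Self::Output>;
}

struct Counter {
    hits: u32,
}

struct CounterContext {
    polls: u32,
}

impl Actor for Counter {
    type Context = CounterContext;
}

/// A future that can only do its work because it is handed `&mut Counter`.
struct Increment;

impl ActorFuture<Counter> for Increment {
    type Output = u32;

    fn poll(
        self: Pin<&mut Self>,
        actor: &mut Counter,
        actor_cx: &mut CounterContext,
        _task_cx: &mut Context<'_>,
    ) -> Poll<u32> {
        actor_cx.polls += 1;
        actor.hits += 1;
        Poll::Ready(actor.hits)
    }
}

/// Waker that does nothing; only needed to build a `Context` by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Plays the role of the actor's mini executor for a single poll.
fn run_increment() -> (u32, u32) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut task_cx = Context::from_waker(&waker);
    let mut actor = Counter { hits: 41 };
    let mut actor_cx = CounterContext { polls: 0 };
    let mut fut = Increment;
    match Pin::new(&mut fut).poll(&mut actor, &mut actor_cx, &mut task_cx) {
        Poll::Ready(n) => (n, actor_cx.polls),
        Poll::Pending => unreachable!("Increment completes on its first poll"),
    }
}
```

Because the two extra arguments are not part of the standard poll signature, a type like Increment cannot be produced by an async block today, which is the gap this issue is about.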

Tokio

Tokio (and I believe async-std?) implements several futures which are "privileged" - ie. they must be polled on a tokio executor or they will fail. However, there's currently no way to check this at compile time. If it was possible to use custom contexts, then these futures could require a tokio-specific context to be polled.

Implementation

The blocker for this is that the context has a lifetime parameter, and so we need some form of HKT to make this work. Luckily, the one form of HKTs that we do have (GATs) should be enough to allow it, assuming they are ever stabilized.

use std::pin::Pin;
use std::task::Poll;

pub trait ContextProvider {
    type Context<'a>;
}

pub trait Future<P: ContextProvider=TaskContextProvider> {
    type Output;

    fn poll(self: Pin<&mut Self>, cx: P::Context<'_>) -> Poll<Self::Output>;
}

pub struct TaskContextProvider;

impl ContextProvider for TaskContextProvider {
    type Context<'a> = &'a mut std::task::Context<'a>;
}

This would allow actix and tokio to define their own context providers, and in general, executors would be able to avoid TLS entirely.
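As a rough sketch of what a runtime-defined provider might look like under this proposal: the trait below is named ContextFuture only to avoid clashing with the standard Future, and RuntimeContext, RuntimeProvider, Sleep and drive_sleep are hypothetical names invented for the example. It assumes a toolchain with stable GATs (Rust 1.65 or later) and passes the context by value rather than as a reference to std::task::Context.

```rust
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Poll, Wake, Waker};

/// The proposal's provider trait, restated so the sketch stands alone.
pub trait ContextProvider {
    type Context<'a>;
}

/// Stand-in for the proposed generic `Future` trait.
pub trait ContextFuture<P: ContextProvider> {
    type Output;

    fn poll(self: Pin<&mut Self>, cx: P::Context<'_>) -> Poll<Self::Output>;
}

/// Hypothetical runtime-specific context: a waker plus a timer queue.
pub struct RuntimeContext<'a> {
    pub waker: &'a Waker,
    pub timers: &'a mut Vec<u64>,
}

pub struct RuntimeProvider;

impl ContextProvider for RuntimeProvider {
    type Context<'a> = RuntimeContext<'a>;
}

/// A "privileged" future: it only implements `ContextFuture<RuntimeProvider>`,
/// so polling it with any other context type is a compile error.
pub struct Sleep {
    deadline: u64,
    registered: bool,
}

impl ContextFuture<RuntimeProvider> for Sleep {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: RuntimeContext<'_>) -> Poll<()> {
        if self.registered {
            // A real timer would check the clock; this sketch just completes.
            return Poll::Ready(());
        }
        let RuntimeContext { waker, timers } = cx;
        // Register with the runtime through the context instead of TLS.
        timers.push(self.deadline);
        self.registered = true;
        waker.wake_by_ref();
        Poll::Pending
    }
}

/// Waker that does nothing; only needed to drive the example by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls `Sleep` twice the way a runtime owning the timer queue might.
pub fn drive_sleep() -> (bool, bool, Vec<u64>) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut timers = Vec::new();
    let mut sleep = Sleep { deadline: 100, registered: false };
    let first = <Sleep as ContextFuture<RuntimeProvider>>::poll(
        Pin::new(&mut sleep),
        RuntimeContext { waker: &waker, timers: &mut timers },
    );
    let second = <Sleep as ContextFuture<RuntimeProvider>>::poll(
        Pin::new(&mut sleep),
        RuntimeContext { waker: &waker, timers: &mut timers },
    );
    (first.is_pending(), second.is_ready(), timers)
}
```

The timer registration reaches the runtime through the cx argument alone, which is the TLS-free property the paragraph above refers to.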

Trait objects and interoperability

One of the problems with making futures generic is that it's both more difficult to use runtime polymorphism, and it can also divide the ecosystem. I think this can be avoided by creating a trait to encapsulate the current behaviour of Context, and requiring that new contexts implement a superset of that functionality. This means that all futures would easily be convertible up from a "normal future" into one that accepts a custom context.

There will also be ways to convert in the other direction: if you want to convert an actor future into a normal future, you just spawn it onto an actor. If you want to convert a tokio future to a normal future you spawn it onto the tokio executor.

Async/await

In order for async/await to work cleanly with custom contexts, there needs to be a way for code to access the context. There has previously been talk of fully generalizing the async/generator transformation to allow passing in arbitrary arguments, but I think we can avoid that complexity: simply provide an intrinsic which can be called from inside async functions and code blocks which gives the current context:

use std::task;

async fn foo() {
    task::current::<Context>().waker().wake_by_ref();
}

Also, this solves the problem of how to tell the compiler what contexts an async function supports: if the task::current intrinsic is not used, then the anonymous type will generically implement Future for all context types. If task::current is used, the type of the context is known.
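For comparison, the standard library already offers one way to reach the existing Context from inside an async fn: std::future::poll_fn. The sketch below uses only stable std APIs (assuming Rust 1.68 or later for std::pin::pin!). It does not implement the proposed task::current intrinsic and cannot carry a custom context type; CountingWake and run_foo are hypothetical helper names.

```rust
use std::future::{poll_fn, Future};
use std::pin::pin;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Reaches the current `Context` from inside an `async fn` via `poll_fn`.
async fn foo() -> u32 {
    let mut yielded = false;
    poll_fn(|cx: &mut Context<'_>| {
        if yielded {
            Poll::Ready(7)
        } else {
            // Yield once, asking to be polled again.
            yielded = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    })
    .await
}

/// Waker that counts how many times it was woken.
struct CountingWake(AtomicUsize);

impl Wake for CountingWake {
    fn wake(self: Arc<Self>) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}

/// Polls `foo` to completion and reports its output and the wake count.
fn run_foo() -> (u32, usize) {
    let counter = Arc::new(CountingWake(AtomicUsize::new(0)));
    let waker = Waker::from(counter.clone());
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(foo());
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return (value, counter.0.load(Ordering::SeqCst));
        }
    }
}
```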

@Ekleog

Ekleog commented Apr 6, 2020

For the record, similar ideas have already been discussed before stabilization of the Future trait -- not saying this against this proposal, though… quite the contrary, actually, I'm all in favor of it, hoping that we can retrofit this and make async/await generate the proper bounds on the Context.

In no particular order, hoping I don't include irrelevant stuff, and pointing to the first comment in the series of comments that are relevant:
#2900
rustasync/team#7 (comment)
rust-lang/futures-rs#1196
#2592 (comment)
#2592 (comment)
#2592 (comment)
rustasync/team#56 (comment)

@skade
Contributor

skade commented Oct 12, 2020

Tokio (and I believe async-std?) implements several futures which are "privileged" - ie. they must be polled on a tokio executor or they will fail. However, there's currently no way to check this at compile time. If it was possible to use custom contexts, then these futures could require a tokio-specific context to be polled.

async-std futures do not need to be polled in an async-std context. tokio futures also kinda don't need to. The main difference is how timers are implemented: tokio puts timers on the runtime threads, async-std has a separate thread. That means tokio futures relying on timers cannot be polled from plain threads.

There are compat libraries that "upgrade" a thread to having tokio timers allocated.

The problem here is essentially that futures (as all other Rust data structures!) rely on environmental setup. This may be the runtime, the kernel or an initialised background logger. I do not think that encoding all this complexity in the type system is worth the work, especially as those are "hard panic" scenarios that fail immediately and reliably at initialisation.

@Alexei-Kornienko

I think that adding:

user_data: Option<Box<dyn Any>>

to a Context would be useful in many scenarios.

@shepmaster
Member

Box<dyn Any>

I may be misunderstanding, but that would restrict futures to only the platforms that have liballoc. Futures already work on platforms where that’s not present, so that suggestion seems to be a non starter.

@burdges

burdges commented Mar 27, 2021

It's true Box only exists if you have liballoc but..

It's possible to build a TinyBox<T> type with an alloc feature that allocates only if T has size or alignment larger than usize, but works without its alloc feature and liballoc by simply panicking any time an allocation would otherwise occur. rust-lang/project-error-handling#20

I've no idea if or why this matters here, but it's independently useful like pub type Error = TinyBox<dyn Error>;

@Alexei-Kornienko

I may be misunderstanding, but that would restrict futures to only the platforms that have liballoc. Futures already work on platforms where that’s not present, so that suggestion seems to be a non starter.

In theory it should not be required to have liballoc to use it. You can construct Box from pointer to stack/statically allocated memory (which is unsafe but still usable). Basically my opinion is that you should be able to pass some custom data with context and it needs to store pointer to such data. So for me an Option<Box> is just a safe way of defining this pointer.

@Kixiron
Member

Kixiron commented Mar 27, 2021

In theory it should not be required to have liballoc to use it. You can construct Box from pointer to stack/statically allocated memory (which is unsafe but still usable). Basically my opinion is that you should be able to pass some custom data with context and it needs to store pointer to such data. So for me an Option<Box> is just a safe way of defining this pointer.

As you can see from this playground snippet, you cannot construct boxes that point to static memory soundly (the same applies for stack-based, except it arguably has even more footguns). Limiting the usage of futures to targets with alloc is incredibly short-sighted and completely disenfranchises an entire segment of the rust community as futures are used within embedded code that doesn't necessarily have access to alloc as demonstrated in Phil Opp's OS series. This is also incredibly not zero-overhead as it requires both the usage of heap allocation and dynamic casting, rendering Box<dyn Any> (along with any similar proposal) a non-starter for Rust.

@Alexei-Kornienko

Alexei-Kornienko commented Mar 27, 2021

Well the simple solution would be to store just a raw pointer. This would make it much simpler in terms of compatibility and slightly more complicated in terms of usage of such data. Something like:
user_data: *mut u8

@jamesordner

What would it take to progress this issue? It seems that the current point is agreeing on how to allow access to "user data" within Context. What are the issues with @Alexei-Kornienko's suggestion above, to just add a pointer to Context?

It would be incredibly useful to allow an executor to pass context to futures, not only to avoid TLS but also to allow cleaner design and thread resource management.

@Catherlock

Catherlock commented Jan 30, 2022

Restricting Context to providing only a waker makes it difficult to implement other designs while keeping the syntactic sugar of async blocks. I have the same problem, which is how I found this issue. Since a future can't get the reactor from the context, it is forced to get it from a global variable created with the lazy_static or once_cell crates. This is unnatural and causes problems such as tokio's compatibility issues. A custom context would force a leaf future to be polled by its intended executor, which would spare programmers these compatibility issues.

@nikomatsakis
Contributor

I think this is related to the context/capabilities proposal that @tmandry, @yoshuawuyts, and myself have been looking into.

I'm definitely interested in the idea of custom contexts of some kind, but I'm not totally sure what that looks like yet.

@khoover

khoover commented Mar 15, 2022

I think this is related to the context/capabilities proposal that @tmandry, @yoshuawuyts, and myself have been looking into.

That seems a lot more general than what's being asked for here: some sort of intrinsic for accessing the Context inside an async scope (compiles into using the existing context parameter of the generated .poll), and the ability to retrieve caller-specified data from the Context (possibly like Extensions from tracing-subscriber, or introducing a generic Context, or even a plain *const () to be casted into the correct type).

My use case for this is writing a custom executor in a WASM context, where the executor is created by the host and lives on the heap with some associated data attached to it. The host side calls into the executor, the executor updates the associated data, it runs for a bounded amount of CPU time (modulo how cooperative the spawned tasks are), and then returns back to the host. I could hoist this associated data and the Spawner into thread-local RefCells, but that's brittle due to needing to remember to drop the borrows before every .await, and adds extra complexity to the executor (e.g. a task is spawned, and then the handle is immediately .awaited, before the executor has a chance to see that the task was ever spawned).

Having Context types with user-data would simplify a lot of this; I would pass in an exclusive reference to the associated data, and a Spawner that is a thin wrapper around an exclusive reference to the internal task list. The references can't be held on to over .awaits, due to lifetime bounds, and tasks are known to the executor the moment .spawn is called (vs. needing to wait until the spawner returns back to the executor).

@khoover

khoover commented Mar 15, 2022

I've written up a proposed interface (names are placeholders) at this playground, along with a couple examples of using it (in particular, the "passing-references-to-Futures" use-case, and what happens when a Future tries to hold on to a reference without the lifetimes matching).

@nikomatsakis
Contributor

It is definitely more general than what is being asked for here, no doubt about that. But it has the advantage of being statically analyzed and verifiable. I'm concerned because we already have problems where libraries are implicitly dependent on one executor or another, and it seems like these problems could be exacerbated. (That said, I certainly see the "perfect enemy of good" argument here.)

Just to be sure, am I missing something? I think that if we were to extend Context with some kind of "type-based query mechanism", you'd have futures that expect to be invoked with a context that includes that info, and sometimes it would be absent.
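To make the concern concrete, here is a minimal sketch of what a type-based query mechanism could look like and how a lookup can come back empty at run time. Extensions, TimerHandle, Spawner and lookup_demo are hypothetical names; nothing like this exists on std::task::Context, and this version relies on Box and HashMap, so it also assumes an allocator is available.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;

/// Hypothetical type-keyed bag that an executor could attach to a context.
#[derive(Default)]
struct Extensions {
    map: HashMap<TypeId, Box<dyn Any>>,
}

impl Extensions {
    /// Stores `value`, replacing any previous value of the same type.
    fn insert<T: Any>(&mut self, value: T) {
        self.map.insert(TypeId::of::<T>(), Box::new(value));
    }

    /// Looks up a value by type; `None` if the executor never provided one.
    fn get<T: Any>(&self) -> Option<&T> {
        self.map
            .get(&TypeId::of::<T>())
            .and_then(|boxed| boxed.downcast_ref::<T>())
    }
}

struct TimerHandle(u64);
struct Spawner;

/// An executor that provides a timer handle but no spawner.
fn lookup_demo() -> (Option<u64>, bool) {
    let mut ext = Extensions::default();
    ext.insert(TimerHandle(5));
    let timer = ext.get::<TimerHandle>().map(|t| t.0);
    // A future expecting a `Spawner` only discovers its absence at run time.
    let spawner_missing = ext.get::<Spawner>().is_none();
    (timer, spawner_missing)
}
```

The missing Spawner is discovered only when the lookup runs, whereas a statically checked mechanism would reject the mismatch at compile time.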

@khoover

khoover commented Mar 16, 2022

Hmm. Thinking about this more, I agree that with makes more sense than trying to augment Context or introduce a different Future trait. There are still some questions around it (syntax- and semantics-wise; the proposed implementation with implicit arguments seems relatively clear), e.g.: how to handle an async block .awaiting inside a with block; how to convey that something like (using the basic_arena example from the post) let bar = basic_arena; fut.await; bar.alloc(...); is invalid (though this RFC [1] lays out a template for that); and how it interacts with dyn (would something like dyn (Future with AuxData) + ... make sense?). But those seem like tractable problems.

On the augmenting side, you have hard issues regarding composability. E.g. I want to await a Future that takes Foo as aux and another that takes Bar as aux, how do I combine that in the top-level Future? How do I distinguish, e.g., pulling it from the caller-passed aux data vs. being something the top-level Future generates on its own? With the with proposal, the top-level would just have all of the descendant Future with constraints combined, minus any that are internally satisfied.

Footnotes

  1. Incidentally, this RFC would resolve my main issue; I could have a small wrapper type around Rc<RefCell<Data>> that gives a wrapper around RefMut annotated with #[must_not_await], assuming RefMut itself isn't given that annotation, and pass clones of that to any async that needs the auxiliary data.

@douglas-raillard-arm

Maybe this is irrelevant (I'm new to both Rust and async Rust) but the waker in the context now allows accessing a raw untyped custom pointer: https://doc.rust-lang.org/nightly/core/task/struct.RawWaker.html#method.data:

let data: &T = unsafe { &*(ctx.waker().as_raw().data() as *const _) };

My (niche) use case is to use async Rust for streaming event processing. The async processing function lives in a Linux kernel module and is attached to tracepoint probes (called every time an event is emitted). This means I need a way to "push" the new event into the async code. Using that pointer avoids a global variable, which would be problematic as I'm targeting no_std (so I cannot easily use mutexes or anything like that), and it will also allow batch processing of event traces using multiple threads in userspace.
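A self-contained sketch of this pointer trick might look like the following. It assumes a toolchain where Waker::data is available (stable since Rust 1.83, to the best of my knowledge) and uses it in place of the as_raw().data() chain in the snippet above. ExecutorData, ReadEvent and poll_with_data are hypothetical names, and the cast is only sound because the same function that builds the waker also does the polling.

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

/// Hypothetical per-executor data passed through the waker's data pointer.
struct ExecutorData {
    event: u32,
}

fn clone(ptr: *const ()) -> RawWaker {
    RawWaker::new(ptr, &VTABLE)
}

fn noop(_ptr: *const ()) {}

// The vtable never dereferences or frees the pointer; it only copies it.
static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);

/// Leaf future that reads the pushed event out of the waker's data pointer.
struct ReadEvent;

impl Future for ReadEvent {
    type Output = u32;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        // SAFETY (assumption): this future is only ever polled by
        // `poll_with_data`, whose waker points at a live `ExecutorData`.
        let data = unsafe { &*(cx.waker().data() as *const ExecutorData) };
        Poll::Ready(data.event)
    }
}

/// Minimal "executor": wraps `data` in a waker and polls the future once.
fn poll_with_data(data: &ExecutorData) -> u32 {
    let raw = RawWaker::new(data as *const ExecutorData as *const (), &VTABLE);
    // SAFETY: the vtable functions uphold the `RawWaker` contract trivially,
    // and `data` outlives the waker because both live in this call.
    let waker = unsafe { Waker::from_raw(raw) };
    let mut cx = Context::from_waker(&waker);
    let mut fut = ReadEvent;
    match Pin::new(&mut fut).poll(&mut cx) {
        Poll::Ready(event) => event,
        Poll::Pending => unreachable!("ReadEvent completes on its first poll"),
    }
}
```

Polling ReadEvent under any other executor would be undefined behaviour, which is the kind of implicit executor dependency discussed earlier in the thread.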

Coroutines would probably be better suited to my use case, but they looked quite unstable, so I fell back on async.

@HanabishiRecca

The fact that Context is mandatory for poll() seems strange.
I definitely see use cases where you want to poll in a tight loop without any wakers, or where you have some custom operation in which wakers do not exist.

Of course, you can work around it by providing a dummy empty context, as in this example I found:

use std::task::{Context, RawWaker, RawWakerVTable, Waker};

fn do_nothing(_ptr: *const ()) {}

fn clone(ptr: *const ()) -> RawWaker {
    RawWaker::new(ptr, &VTABLE)
}

static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, do_nothing, do_nothing, do_nothing);

fn main() {
    let raw = RawWaker::new(std::ptr::null(), &VTABLE);
    let waker = unsafe { Waker::from_raw(raw) };
    let context = Context::from_waker(&waker);
    // use the context
}

Obviously this is ugly, inconvenient and contains error-prone unsafe code just to literally do nothing.
If it can't be a straight Option, then maybe something like the proposed type Context<'a> = () can do the trick.

As a last resort, at least have something like Context::default() providing an empty context similar to the example above.
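Where an allocator is available, one way to build such a do-nothing context without any unsafe code is the std::task::Wake trait. The sketch below busy-polls a future in a tight loop; NoopWake and spin_on are hypothetical names, and because it relies on Arc it does not help on targets without alloc.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that ignores every wake-up.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Busy-polls `fut` until it completes, never parking the thread.
fn spin_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(output) = fut.as_mut().poll(&mut cx) {
            return output;
        }
        // Hint to the CPU that we are spinning.
        std::hint::spin_loop();
    }
}
```

For example, spin_on(async { 1 + 2 }) evaluates to 3.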
