
Add support for async functions which run to completion #1

Open - wants to merge 13 commits into base: completion_fn

Conversation

Matthias247
Owner

This RFC introduces a new type of asynchronous function, which has
run-to-completion semantics as opposed to cancel-anytime semantics.


@LucioFranco LucioFranco left a comment


This looks fantastic! No real complaints just left a few nits/comments.


The proposal adds 2 new items to Rust:
- A new `Future` type, which will reside in `core::task`. In the remains of this
document we will refer to this type as `RunToCompletionFuture`.


Suggested change
document we will refer to this type as `RunToCompletionFuture`.
document we will refer to this type as `CompletionFuture`.

What about that? :)

Owner Author


I thought about it too. Definitely more concise. I just wasn't sure whether it conveys as well what it is. I'm personally fine with either.


Imo it says the same thing, but we can always leave the bikeshedding to the real post :)


I agree that CompletionFuture is more concise.


Users will continue to be able to write simple and safe `Future` implementations
with cancel-anytime semantics manually, and be able to use them in combination
with powerful flow-control macros like `select!` and `join!`. A lot of types


The big issue I see is that this affects leaf futures and would therefore require the entire future chain to be completion futures, which is quite unfortunate.


I guess if we provided some tokio::spawn_completion type then you could use its join handle to "forget" about that. That should be a somewhat minimal way to avoid having to funnel completion up the chain.

Owner Author


You mean and leave it running in the background?
I think in order to avoid having those runaway tasks we will implement the scope proposal to work based on this.


Basically, what I have now for uring does this: even if the future is canceled, it will still run in the background. But I think that is useful to mix with other control flow.

Co-Authored-By: Lucio Franco <luciofranco14@gmail.com>

@DemiMarie DemiMarie left a comment


Minor suggestions

the main exception here are future types which need to run to completion, like
wrappers around IO completion operations. These are however unsafe by nature
and require careful implementation and review. Therefore the `unsafe` attribute
added here is justified.


We also need to consider combinator authors. Combinators can be much more efficient in this design, but they will require unsafe code internally.


+1. Combinators and any wrapping of futures. I'd like to see a section on combinators specifically. I fear that two disjunct traits would lead to a lot of duplication in combinators.


In fact, I think that unfortunately combinators are so deeply affected that the proposal becomes infeasible in its current state:

Say a combinator like select: once we return from it, what happens to the unfinished future? Is it returned to the caller just like today? Who ensures the future is polled again before it is dropped?

Same for FuturesUnordered: Poll a couple of items out, but leave some inside, then drop it - compiler is none the wiser -- and the invariant is broken.

In general: A combinator that accepts completion futures must poll all of its inner futures to completion before returning from any of its async functions. It can neither store nor return any incomplete futures, because then they can be arbitrarily dropped.

Unfortunately, this gives us three bad options:

  • Combinators disallow this - severely degrading their utility.
  • Combinators allow this, but lose most of the guarantees of this proposal
  • Drop-bombs - panic at runtime if futures and combinators are not polled to completion at drop-time.

It all boils down to the drop semantics we have today - whether we're in async code or not. Hence, some rudimentary linear-type features are necessary, as far as I can tell.
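The "drop bomb" option above can be sketched as a guard that panics if it is dropped between being started and being driven to completion. This is a hypothetical illustration of the pattern, not part of the proposal; a real version would likely call `abort` rather than `panic!`:

```rust
/// Hypothetical "drop bomb": panics if dropped after being started
/// but before being driven to completion.
struct DropBomb {
    started: bool,
    finished: bool,
}

impl DropBomb {
    fn new() -> Self {
        DropBomb { started: false, finished: false }
    }
    fn start(&mut self) {
        self.started = true;
    }
    fn finish(&mut self) {
        self.finished = true;
    }
}

impl Drop for DropBomb {
    fn drop(&mut self) {
        // Skip the check while already unwinding to avoid a double panic.
        if self.started && !self.finished && !std::thread::panicking() {
            panic!("dropped before running to completion");
        }
    }
}

fn main() {
    // Dropping a started-but-unfinished bomb triggers the runtime check.
    let caught = std::panic::catch_unwind(|| {
        let mut bomb = DropBomb::new();
        bomb.start();
        // `bomb` is dropped here without finish() -> panic
    });
    assert!(caught.is_err());

    // A bomb that runs to completion drops silently.
    let mut ok = DropBomb::new();
    ok.start();
    ok.finish();
    drop(ok);
    println!("ok");
}
```

As the comment notes, this only detects the violation at runtime; it cannot make the compiler reject the premature drop.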


@betamos a combinator that accepts a completion future would itself be a completion future, so this problem is pushed up to the caller and ultimately to the executor.

Same for FuturesUnordered: Poll a couple of items out, but leave some inside, then drop it - compiler is none the wiser -- and the invariant is broken.

FuturesUnordered is a Stream, and a completion variant would be needed for this as well. One approach would be to instead buffer all of the futures into a single Vec before returning them.

That said, FuturesUnordered’s approach is an anti-pattern in many cases. A better API is

let (result1, result2, result3) = select!(completion_future_1, completion_future_2, completion_future_3);


I hear you, but I think if we invalidate people's existing use of combinators, we're gonna have a harder time getting this through review. There's a lot more convincing to do.

FuturesUnordered is used under the hood in many other combinators and stream transforms too..

How would you solve the standard issue of:

  • A TCP Listener produces a stream of Connections
  • For each connection concurrently:
    • Run a handler which returns a result. The handler is a completion future.
  • Collect the results out-of-order (in a buffer that never exceeds O(concurrent connections))


We could use a completion version of FuturesUnordered that itself must be polled to completion. Of course, there would need to be a method for cancellation as well.

Owner Author


That is basically the structured concurrency scope/TaskGroup/Nursery thing.

tokio-rs/tokio#2579 had an implementation for Rust/tokio without support for run-to-completion tasks. https://github.com/DougGregor/swift-evolution/blob/structured-concurrency/proposals/nnnn-structured-concurrency.md proposes a TaskGroup for Swift which is basically the same: a FuturesUnordered for run-to-completion coroutines with built-in support for graceful cancellation.

And as said on zulip, I don't think the current select!/join! combinators should go away. They will still have their role in e.g. waiting for a channel and a timer in parallel.

The new combinators will focus more on the orchestration of bigger tasks. E.g. in the server example that @betamos gave, the TCP listener would open a task group and start the processing of each connection as a subtask in this group. If the listener gets cancelled (e.g. via SIGINT), the subtasks also receive a graceful cancellation signal which still allows them to e.g. finish processing the current request and then shut down.

Owner Author


Btw: I think the original comment from @DemiMarie was about something different. I interpret it more as "combinators which can assume run-to-completion guarantees can be more efficient and offer different features" than as "old combinators are no longer useful".

I fully agree with that, but it's to be seen what combinators we really need. The number 1 on my list is really the TaskGroup thing, which however could also enable borrowing from the parent tasks as a new feature (like crossbeam scopes).
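The "borrowing from the parent task" idea has a synchronous analogue in scoped threads (crossbeam's scopes, later std::thread::scope): borrowing from the parent is only sound because the scope guarantees every child runs to completion before the borrow ends, which is exactly the guarantee completion futures would give an async TaskGroup. A minimal sketch of the synchronous version:

```rust
use std::thread;

fn main() {
    let data = vec![1, 2, 3, 4];

    // The scope guarantees every spawned thread has finished before it
    // returns, so borrowing `data` from the parent is sound. A TaskGroup
    // over completion futures could offer the same guarantee for async
    // subtasks.
    let (sum, max) = thread::scope(|s| {
        let sum = s.spawn(|| data.iter().sum::<i32>());
        let max = s.spawn(|| *data.iter().max().unwrap());
        (sum.join().unwrap(), max.join().unwrap())
    });

    assert_eq!(sum, 10);
    assert_eq!(max, 4);
    println!("sum={sum} max={max}");
}
```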

more similar to normal functions than current `async fn`s do. It might therefore
have been interesting to see `#[completion] async fn` as the new "default" async
function type - which might end up as the function type most used by application
writers. However this is no longer possible, since `async fn` is


It is possible in Rust 2021 😄


A future Edition of Rust could adopt a different keyword for async completion
functions in order to reduce the verbosity. This should be done
after carefully studying the usage of the various `async fn` types.


Suggested change
after carefully studying the usage of the various `async fn` types.
after carefully studying the usage of the various `async fn` types.
It would also be possible to make `RunToCompletionFuture` be the default. `#[completion] async` would become just `async`, and `async` would become `#[completion(false)] async` (or similar). This transformation can be performed automatically.

Owner Author


It's totally possible. But I would rather not recommend this, since those kinds of semantic changes mean a lot of churn for people updating their projects, and might confuse people who work on projects in both editions. I therefore don't expect this to gain a lot of acceptance.


@betamos betamos left a comment


Sorry I'm late on the ball. Lots of stuff in here is gold - and I have really been bothered by the same issue for a long time. +1000 we need to solve these issues.

Unfortunately, I think it's not feasible in its current state, due to the drop semantics of Rust programs today.

async fn transmit_data() {
    let buffer: String = "Data to transfer".into();
    let data: Bytes = buffer.into();
    let (bytes_transferred, data) = engine.send(data).await;

I'm not sure I get how this workaround solves the problem. If engine.send is still a multi-step operation, you could end up in an inconsistent state by using abortable, carelessly using select, and so on. To me there is no reasonable workaround today (except through docs and manually checking each call site - impossible in e.g. a library). As such, it may give the wrong impression to state that this problem can be worked around.

Owner Author


It mainly solves the problem of corrupting memory if the operation is aborted. With this workaround the memory is temporarily leaked while the operation continues running in the background, but the overall operation is safe.
As you mentioned, it might not solve other logic issues.


For certain functions it is absolutely necessary for correctness that they
actually run to completion.
The reason is typically that the function forms an atomic transaction. If the

I understand the buffer use-case is more "real-world", but there's an even simpler example:

async fn commit(&mut self) {
  self.a = true;
  something_else().await; // possible exit point
  self.b = true;
}

This method can abort at the await point and leave inconsistent data - very different from sync code. The next user of the value gets mutable access, and can't even detect the inconsistent state. Working around this "every await point is a possible termination" property is infeasible in practice.
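The abort-at-an-await-point behavior can be demonstrated with today's Future trait by hand-polling such a commit once and then dropping it. The no-op waker and the single-suspension future below are scaffolding invented for this sketch:

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

struct State { a: bool, b: bool }

impl State {
    // Hypothetical multi-step "transaction" with an await point in the middle.
    async fn commit(&mut self) {
        self.a = true;
        PendingOnce(false).await; // suspension point: the future can be dropped here
        self.b = true;
    }
}

// A future that is Pending on its first poll and Ready on the second.
struct PendingOnce(bool);
impl Future for PendingOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.0 { Poll::Ready(()) } else { self.0 = true; Poll::Pending }
    }
}

// A waker that does nothing; enough to poll futures by hand.
fn noop_waker() -> Waker {
    fn clone(_: *const ()) -> RawWaker {
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
}

// Poll `commit` once, then drop it mid-transaction.
fn demo() -> (bool, bool) {
    let mut state = State { a: false, b: false };
    {
        let mut fut = Box::pin(state.commit());
        let waker = noop_waker();
        let mut cx = Context::from_waker(&waker);
        assert!(fut.as_mut().poll(&mut cx).is_pending());
        // `fut` is dropped here without being polled to completion.
    }
    (state.a, state.b)
}

fn main() {
    let (a, b) = demo();
    assert!(a && !b); // inconsistent: the transaction half-completed
    println!("a={a} b={b}");
}
```

Nothing here is unsafe or even unusual; any `select!` arm or dropped task can produce the same half-finished state.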

Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Or just the pattern which is kind of common in other languages:

async fn doSth(&mut self) -> Result<(), ()> {
  if self.errored {
    return Err(());
  }
  self.errored = true;
  something_else().await?;
  self.errored = false;
  Ok(())
}

There are countless variations of it. The question is which ones are easiest for the audience to understand and follow.

and still deliver a return value. It **could** however also continue to run for
a certain time. E.g. in order to finalize the important transaction.
3. The issuer of the cancellation request waits for the cancelled async function
to return. This can e.g. be achieved through a `WaitGroup` or `Semaphore` type.

Not sure I follow why this is necessary? Normally a cancelled function would just return a cancellation error, and the caller just handles the error.

Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When does the cancellation error get to the caller? The answer here is: When the caller waits for the operation to complete, and notices the error.

So this step is necessary. Note that in Rust we also don't have any standardized exception base class, and therefore can't have a standardized derived CancellationError. This means even if new primitives (e.g. a new file IO API using completion-based IO) return a CancellationError, the methods around it might need to translate it to whatever error type the next layer is expecting.
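That per-layer translation can be as simple as a `From` impl, which the `?` operator applies automatically. The error types below are hypothetical placeholders, not anything proposed by the RFC:

```rust
// Hypothetical low-level cancellation error from a completion-based IO API.
#[derive(Debug, PartialEq)]
struct CancellationError;

// Hypothetical error type of the next layer up.
#[derive(Debug, PartialEq)]
enum FileError {
    Cancelled,
    Other(String),
}

impl From<CancellationError> for FileError {
    fn from(_: CancellationError) -> Self {
        FileError::Cancelled
    }
}

// Stand-in for a cancelled low-level operation.
fn low_level_read() -> Result<Vec<u8>, CancellationError> {
    Err(CancellationError)
}

// The `?` operator converts CancellationError into FileError via `From`.
fn read_file() -> Result<Vec<u8>, FileError> {
    let data = low_level_read()?;
    Ok(data)
}

fn main() {
    assert_eq!(read_file().unwrap_err(), FileError::Cancelled);
    println!("translated to FileError::Cancelled");
}
```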

in a backwards compatible fashion. It will not break code already running
on those runtimes.

## Cancellation support for run to completion functions

Perhaps put "graceful cancellation" in the title. The way graceful cancellation is implemented with your proposal remains the same as it is today. I've already implemented something similar, for instance. The community would publish and converge on a pattern, perhaps even as part of Context in the future, as you say.

I'd shorten this section and state that graceful cancellation is itself not affected by the proposal, but that graceful cancellation becomes more important to get right since futures will increasingly become completion based. This is to avoid confusion that graceful cancellation must be fully resolved in order to make progress - which is not true. We already have the tools to build it!

Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are fully right. We can cover graceful cancellation entirely separately from this proposal. And in fact I thought about proposing a stop_token similar to the C++ one, which can be used in sync as well as async contexts.

The main reason this was brought up here is to avoid questions from readers along the lines of:

Futures were great because they were so easy to cancel - how do I do the same in this new world?

If you and others feel we should move it out of this doc to focus purely on run to completion, I'm also good with it.
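A minimal version of such a stop_token, modeled loosely on C++'s std::stop_source/std::stop_token (the Rust names here are hypothetical), needs nothing beyond a shared atomic flag and works in sync and async code alike:

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

// Cancellation is requested through the source...
struct StopSource(Arc<AtomicBool>);

// ...and observed cooperatively through cheap, cloneable tokens.
#[derive(Clone)]
struct StopToken(Arc<AtomicBool>);

impl StopSource {
    fn new() -> Self {
        StopSource(Arc::new(AtomicBool::new(false)))
    }
    fn token(&self) -> StopToken {
        StopToken(self.0.clone())
    }
    fn request_stop(&self) {
        self.0.store(true, Ordering::Relaxed);
    }
}

impl StopToken {
    fn stop_requested(&self) -> bool {
        self.0.load(Ordering::Relaxed)
    }
}

fn main() {
    let source = StopSource::new();
    let token = source.token();
    assert!(!token.stop_requested());

    // A worker - sync or async - polls the token at its own safe points
    // and shuts down gracefully, rather than being dropped mid-operation.
    source.request_stop();
    assert!(token.stop_requested());
    println!("stop requested");
}
```

A production version would additionally store wakers (or callbacks, like C++'s std::stop_callback) so that pending tasks are woken when a stop is requested, rather than noticing it only on their next poll.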

[guide-level-explanation]: #guide-level-explanation

The proposal adds 3 new items to Rust:
- A new `Future` type, which will reside in `core::task`. In the remains of this

nit: Trait. This had me confused until the end of this doc, where the trait was explained.

actually be equal to the one `poll` itself, since `poll` is also expected to
perform any final cleanup.

A different alternative might be to pair run to completion methods later with

Nice to see, we came to the same conclusion. I think defer is very interesting for multi-exit control flow that we have so much of in Rust with ?.

cooperatively cancel methods with run to completion semantics also had been
successfully deployed in these environments.

# Unresolved questions

Curious about crash consistency, and how this works in the case of panics.


If we implement poll_drop, then panics could run poll_drop to completion as the stack is being unwound. C++ takes a similar approach via unhandled_exception.



/// Callers are not allowed to `drop()` a future which returned `Pending` as
/// its last poll result. Futures are only allowed to be dropped if they
/// either had never been polled, or if the last `poll()` call returned `Ready`.
unsafe fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output>;


A supertrait of Future would be fine; any Future is trivially a RunToCompletionFuture.

Panics are a bigger problem. There are a few solutions I can think of:

  • Poll drop: whenever a panic happens in async contexts, any objects on the stack are dropped asynchronously. The actual panic doesn’t propagate until all the poll_drop methods have returned Poll::Ready. I believe this is how try-finally works in async ECMAScript functions. Since the only way to call RunToCompletionFuture::poll without using unsafe is via .await, this should work.
  • Premature drop calls abort: leaking a polled RunToCompletionFuture would still be undefined behavior, but dropping it would call abort instead.


Alternative proposals, like the one in
https://internals.rust-lang.org/t/pre-pre-rfc-unsafe-futures, also proposed to
directly add a `.cancel()` method on the new `Future` type which needs to initiate


As mentioned above, I believe that robust handling of panic in #[completion] async fns requires a .poll_drop() that is invoked by the compiler in such cases. This is effectively a .cancel() method.


@betamos betamos Jan 9, 2021


Not saying this is a bad idea at all - but cancel and drop are completely different in my mind. Since drop is on the panic path, it simply attempts a best effort cleanup (dropping file descriptors), and protects against memory unsafety.

Cancel OTOH is highly cooperative - and on the expected path. Say you have a search problem where three tasks race to find a solution. Once one is found, the other two are cancelled - gracefully, and awaited as normal (their return values would be Err(Canceled) and discarded).


Good point. I suspect the implementations will often coincide, but that is by no means guaranteed.

trait to use it: The `unsafe` contract in the trait could require callers to
either `.poll()` the future to completion, or to switch over to calling `poll_drop`.

However the benefit seems low. By doing this the Future would again move away


Panic handling seems to require poll_drop or similar, unless one wants to force panic=abort for async code.


@betamos betamos Jan 9, 2021


I think I agree. The issue I find hardest is during parallelism, i.e. real threads. This occurs in multithreaded async runtimes, but also without async in regular threads. Crossbeam's scoped threads catch the panic and wait for the threads to terminate, which can take forever. It's somewhat terrifying that panic propagation is unbounded in time, but similarly terrifying to not be able to statically borrow between threads. I have no good answers here - just sharing my thoughts.


Panic propagation is already unbounded in time, since we have Turing-complete destructors. What this does mean is that we need std::panic::catch_unwind to be extremely fast, ideally free.


Co-authored-by: Demi Marie Obenour <demiobenour@gmail.com>




/// Callers are not allowed to `drop()` a future which returned `Pending` as
/// its last poll result. Futures are only allowed to be dropped if they
/// either had never been polled, or if the last `poll()` call returned `Ready`.
unsafe fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output>;
Owner Author


I'm not sure whether there is a big difference between requiring the callers to call poll_drop after a panic, or simply continuing to call poll, since everything that the compiler puts into poll_drop could equally go into poll.

Maybe one of the arguments for a dedicated poll_drop is that it doesn't have the output value that poll has.
I think I would be ok with adding a poll_drop to the new Future - but more for pure compiler-generated drop glue than for humans to implement.

Premature drop calls abort: leaking a polled RunToCompletionFuture would still be undefined behavior, but dropping it would call abort instead.

Yeah, that's the "simple way out". For my use-cases I would be rather fine with it, since I think panics + coroutines/futures is a super indeterministic thing anyway. But if others rely on the ability to handle panics in async run to completion contexts, it should indeed be taken into account.

Is this implication always true? I buy that the consistency and result of the function is inconsistent and "poisoned", but memory safety sounds like it would only apply in certain cases, such as the subtask-example.

It doesn't matter whether it's 1 example or thousands. If it's more than 0 examples, it has to be labeled as unsafe in Rust. People rely on safe functions really being safe - even under heavy API misuse. That's why there have also been very unpleasant discussions for authors of crates which provided support for completion-based IO. Those had APIs which were basically always safe, as long as one didn't do really weird things with them. But they still got push-back from the community for not marking them unsafe, because theoretically those could lead to memory issues.

the main exception here are future types which need to run to completion, like
wrappers around IO completion operations. These are however unsafe by nature
and require careful implementation and review. Therefore the `unsafe` attribute
added here is justified.
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is basically the structured concurrency scope/TaskGroup/Nursery thing.

tokio-rs/tokio#2579 had an implementation for Rust/tokio without support for run to completion tasks. https://github.com/DougGregor/swift-evolution/blob/structured-concurrency/proposals/nnnn-structured-concurrency.md proposes a TaskGroup for Swift which is basically the same: A FuturesUnordered for run-to-completion coroutines with builtin support graceful cancellation.

And as said on zulip I don't think the current select!/join! combinators should go away. They will still have their roles in waiting e.g. for a channel and a timer in parallel.

The new combinators will focus more on the orchestration of bigger tasks. E.g. in the server example that @betamos gave, the TCP listener would open a task group and start the processing of each connection as a subtask in this group. If the listener gets cancelled (e.g. via SIGINT), the subtasks also receive a graceful cancellation signal which still allows them to e.g. finish processing the current request, and then shut down.

Owner Author

Btw: I think the original comment from @DemiMarie was about something different. I interpret it more as "combinators which can assume run-to-completion guarantees can be more efficient and offer different features" than as "the old combinators are no longer useful".

I fully agree with that, but it remains to be seen which combinators we really need. Number 1 on my list is really the TaskGroup thing, which however could also enable borrowing from the parent task as a new feature (like crossbeam scopes).

@SabrinaJewson
Copy link

I have published a crate based off the ideas in this RFC - completion. A couple things to note:

  • I did not make CompletionFuture a supertrait of Future. Doing so would prevent implementing CompletionFuture for Box<S: CompletionFuture> as it would overlap with the Future for Box<S: Future> implementation. It would also prevent implementing Future for wrappers whose inner type also implements Future, which could be useful.
  • If the completion future panics, I allowed it to be dropped. I chose this because it simplifies implementation and it should be up to the I/O futures to not panic as part of their soundness. Types that poll multiple futures at once (like Zip) are implemented with catch_unwind.
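For illustration, the blanket-impl approach from the first bullet can be sketched roughly like this (a minimal sketch following the discussion here, not the actual completion crate code; the trait shape and the unsafe placement are assumptions):

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

/// Sketch of the trait under discussion (not a stabilized API).
/// `poll` is unsafe: callers promise to keep polling until `Ready`.
trait CompletionFuture {
    type Output;
    unsafe fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output>;
}

/// Blanket impl instead of a supertrait: every ordinary `Future` is
/// trivially a `CompletionFuture`, without a `Future: CompletionFuture`
/// or `CompletionFuture: Future` bound that would cause impl overlap
/// for wrappers like `Box<F>`.
impl<F: Future> CompletionFuture for F {
    type Output = F::Output;
    unsafe fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        Future::poll(self, cx)
    }
}

/// Minimal no-op waker so the sketch can be driven without an executor.
fn noop_waker() -> Waker {
    fn clone(_: *const ()) -> RawWaker { RawWaker::new(std::ptr::null(), &VTABLE) }
    fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
}

fn main() {
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(async { 21 * 2 });
    // Drive the async block through the CompletionFuture blanket impl.
    // SAFETY: we poll it until it returns Ready.
    let out = unsafe { CompletionFuture::poll(fut.as_mut(), &mut cx) };
    assert_eq!(out, Poll::Ready(42));
}
```

With this layout, `Box<F: CompletionFuture>` can get its own `CompletionFuture` impl without colliding with the standard library's `Future for Box<F: Future>`.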

@Matthias247
Owner Author

I have published a crate based off the ideas in this RFC - completion. A couple things to note:

Wow, that's pretty cool! Thanks for investing all the efforts to dive into this - it looks pretty immense! I did not think about using proc-macros to polyfill it, and wasn't even sure it would be possible.

How do you allow to await CompletionFutures only in completion functions? Is it using some::path::__FutureOrCompletionFuture(fut1).__into_future_unsafe().await;? So it wraps every future in a completion block in this kind of wrapper future, and polls it that way? I guess that might work, but you might need to bubble the unsafe attribute a bit higher up to be fully correct.

One thing I noticed is that you also already implemented the whole ecosystem around it (combinators, completion streams, completion async trait, etc). I think that's great in order to figure out whether there is a big oversight or blocker. Did you experience anything like this?
I however don't think we should look into stabilizing all those things at once. As we learned from the normal work on futures, it's a long process to get consensus on all of those features.

I did not make CompletionFuture a supertrait of Future. Doing so would prevent implementing CompletionFuture for Box<S: CompletionFuture> as it would overlap with the Future for Box<S: Future> implementation. It would also prevent implementing Future for wrappers whose inner type also implements Future, which could be useful.

Thanks. Those are good arguments for using blanket impls instead of using a supertrait.

If the completion future panics, I allowed it to be dropped. I chose this because it simplifies implementation and it should be up to the I/O futures to not panic as part of their soundness. Types that poll multiple futures at once (like Zip) are implemented with catch_unwind.

The panic behavior (as also pointed out by @DemiMarie and @betamos) is definitely something that needs a bit more thought. I think the overall pitch needs to be "just works like a synchronous function", and this should also apply to panics. If a panic happens inside the function, it can directly return from there. The main tricky part is "what happens if a panic happens more towards the root of a task/await stack" - e.g. in the stack of a TaskGroup which manages multiple completion tasks. Those still need to run to completion, and therefore the panic can't immediately tear everything down; it needs to continue to poll those inner tasks. In order to do this, the CompletionFuture which represents the TaskGroup's scope might need to have a poll_drop method which is called in that path - or a flag is simply set and the next layer continues to call the normal poll function. Maybe this part also needs to be prototyped to get a better understanding of what is required.

Another reason for a requirement to add poll_drop could be the addition of a defer function for both sync and async functions, which could guarantee that code is executed even after panics - which seems related to what @betamos is looking for. That code path could then be executed inside the poll_drop function. However, I am speculating; the compiler might also be able to move all that code into the poll method.

@SabrinaJewson

How do you allow to await CompletionFutures only in completion functions? Is it using some::path::__FutureOrCompletionFuture(fut1).__into_future_unsafe().await;? So it wraps every future in a completion block in this kind of wrapper future, and polls it that way? I guess that might work, but you might need to bubble the unsafe attribute a bit higher up to be fully correct.

This is explained more in src/macros/future.rs. It uses inherent method-based specialization: __FutureOrCompletionFuture<F: Future> has a no-op method __into_future_unsafe, and additionally there is a trait that provides a method __into_future_unsafe for __FutureOrCompletionFuture<F> which wraps it in an AssertCompletes. I'm not sure what you mean by bubbling the unsafe, but it's sound because that transformation is only applied to top-level .awaits inside the macro, and so the containing async block is guaranteed to complete by the macro.

I think that's great in order to figure out whether there is a big oversight or blocker. Did you experience anything like this?

There was nothing major I experienced (assuming my code is sound 😆). Most of the implementations are identical to ones in futures, but with CompletionFuture instead of Future.

I however don't think we should look into stabilizing all those things at once. As we learned from the normal work on futures, it's a long process to get consensus on all of those features.

The majority of the features I implemented were either already implemented in the standard library, futures, futures-lite or Tokio - I haven't been particularly adventurous with it. My only original creations are AssertCompletes and MustComplete, the I/O traits, ReadBufMut, and io::TakeUntil (just because it made implementing read_until and read_line easier). But yes, I'm not looking to stabilize any of this any time soon.

With regards to the panics, I have a suggestion. First, when poll panics the future is allowed to be dropped. Second, we add a new method to CompletionFuture:

fn cancel(self: Pin<&mut Self>);

All it will do is suggest to the future that it might want to cancel its operation. There are a couple reasons I think this is a good idea:

  1. It plays excellently with APIs like io_uring. The cancel method will simply send an async cancel request with the user data that the future already knows about.
  2. It avoids the complexity of multiple methods and several phases of the future. The future will only have one phase - running - and it can only complete in one way.
  3. It provides a way for dealing with cancellation in general.

I'm unsure about the details of this however - should it have an empty default implementation or not? And how would it work with async blocks?

With regards to defer, I consider that somewhat orthogonal. If async destructors get added to the language they'll be run on panic via the panicking mechanism, and there's no difference there when compared to regular futures.

@Matthias247
Owner Author

With regards to the panics, I have a suggestion. First, when poll panics the future is allowed to be dropped. Second, we add a new method to CompletionFuture:

This is discussed in the RFC section Omission of cancellation in the new RunToCompletionFuture type, where the rationale for leaving it out is pointed out (it seems like it can be done at the application level - and a suitable StopToken/CancellationToken can be standardized orthogonally to this effort, or just left in a foundational library).

If panic needs to call cancel this obviously won't work, since the compiler isn't aware of the token. But I'm not sure yet whether it needs to, or whether the panic handler in a higher-level future (like a TaskGroup) would simply need to be aware of its application-defined cancel token, cancel that, and then poll the futures inside it to completion, no matter how long that takes.

@SabrinaJewson

SabrinaJewson commented Jan 18, 2021

My main problem with CancellationToken-based cancellation is that I would really like these two lines of code to just work:

select(a.write(data), b.write(data)).await;
select(#[completion] async { a.write(data).await }, #[completion] async { b.write(data).await }).await;

Having to use tokens just feels too verbose:

select(|token| a.write(data, token), |token| b.write(data, token)).await;
select(
    |token| #[completion] async { a.write(data, token).await },
    |token| #[completion] async { b.write(data, token).await },
).await;

And especially in a function like join/zip where a cancellation token has to be passed in case of panics, the user is forced to write out the whole closure just to support the really minor edge case of panicking. So while you're right that it can be done at the application level, it would be a lot more ergonomic to do it at this level.

This is why I now think that cancellation should be built-in to the CompletionFuture trait, because otherwise functions like join just become a pain to use. For compiler-generated #[completion] async block futures, this would be easy to implement - cancel would call cancel on the currently .awaited future and set a flag in the generated future, then when the .awaited future completes the outer future will exit. This method of cancellation could also be implemented in my macro.

Edit: Actually, on further thought my method would require having fn poll_cancel(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()>; because otherwise cancelled futures would be forced to return a value. I didn't really want to do that because of the complexity of having multiple states and having to call them in a specific order; I wonder if there's a better way.
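A rough Rust sketch of that poll_cancel shape (a hypothetical API, not a real std or crate trait; the two-phase ordering - poll for a value, then poll_cancel until Ready - is exactly the state complexity the comment worries about):

```rust
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

/// Hypothetical trait shape from this comment; not a real std or crate API.
trait CompletionFuture {
    type Output;
    /// Once started, must be polled until it returns `Ready`.
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output>;
    /// Request cancellation; the caller then polls *this* method until
    /// `Ready(())` instead of polling for a value.
    fn poll_cancel(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()>;
}

/// Toy operation: completes after a fixed number of polls, and can
/// acknowledge cancellation immediately.
struct CountDown { remaining: u32 }

impl CompletionFuture for CountDown {
    type Output = u32;
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        let this = self.get_mut(); // CountDown is Unpin
        if this.remaining == 0 {
            Poll::Ready(0)
        } else {
            this.remaining -= 1;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
    fn poll_cancel(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        // An io_uring-backed future would submit an async cancel request here
        // and return `Pending` until the kernel confirms the cancellation.
        Poll::Ready(())
    }
}

/// Minimal no-op waker so the sketch runs without an executor.
fn noop_waker() -> Waker {
    fn clone(_: *const ()) -> RawWaker { RawWaker::new(std::ptr::null(), &VTABLE) }
    fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
}

fn main() {
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    let mut op = CountDown { remaining: 3 };
    let mut op = Pin::new(&mut op);
    assert!(op.as_mut().poll(&mut cx).is_pending());
    // Caller gives up waiting: switch from `poll` to `poll_cancel`.
    assert_eq!(op.as_mut().poll_cancel(&mut cx), Poll::Ready(()));
}
```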

@SabrinaJewson

I would like to decide on a solution for cancellation so I can implement it in my library - @Matthias247, what do you think of my poll_cancel approach?

@Matthias247
Owner Author

@KaiJewson sorry, I'm currently also on a few other projects, therefore I didn’t have more time to dig into this yet.

I think for normal cancellation - initiated by another code path in the application - this approach won’t work. The cancellation can happen concurrently while the future is executing, which means a mutable reference to it won’t be available.

I think a cancel method which takes a non-mutable reference and initiates the cancellation, plus another method that polls to completion, can work.

But since that is equivalent to just passing a cancellation/stop token and waiting with the normal poll method, I don’t think there is any benefit in this. The only use case I can currently see is handling panics of the super task, and I haven’t yet thought about it in more detail.

@SabrinaJewson

I think for normal cancellation - initiated by other code path in the application - this approach won’t work.

Is this a common use case? In the current async ecosystem, I think that most premature dropping of futures occurs in select or when one of the futures in join panics. Personally I've never remotely cancelled a future.

I think a cancel method which takes a non mutable reference that inits the cancellation, plus another method that polls to completion can work.

If the poller has a mutable reference to it the canceller won't be able to get a shared reference either, so I don't think that this would have any advantages over poll_cancel for remote cancellation.

@SabrinaJewson

SabrinaJewson commented Feb 4, 2021

An idea just occured to me. Currently, I/O futures typically register their provided Wakers in the reactor. However, since they can be forgotten at any time, it is necessary to make sure the wakers they use have the ability to outlive the future, so that use-after-frees do not occur. But now that we have completion futures, I/O futures have the ability to utilize a waker that only lives for as long as the future itself - so in theory we don't even need an Arc to store wakers.

The way this could be supported is by (1) adding a new unsafe method raw_clone to Waker that simply copies over the pointer and vtable pointer to a new Waker and (2) enforcing that when raw_clone is called on the waker passed to CompletionFuture::poll, that waker must be valid as long as the future exists. Callers can then use lazily-allocated wakers, which will probably look like this:

struct LazyWaker {
    heap: OnceCell<Arc<WakerState>>,
    local: WakerState,
}

So callers simply need to store a LazyWaker that lives for as long as the future. This system would mean that in the common case, no allocations are needed at all.

Edit: After more thinking, I realized that a raw_clone-based method wouldn't work because destructors would be called twice. Instead, the API could look something like this:

#[derive(Debug, Clone, Copy)]
struct WakerRef<'a> { ... }
impl WakerRef<'_> {
    pub fn wake(self) { ... }
}
impl Waker {
    pub fn as_ref(&self) -> WakerRef<'_> { ... }
}

Completion futures would then be given the guarantee that if they transmute WakerRef to a longer lifetime it is guaranteed to be valid as long as the future is running.

Edit 2: I have now experimentally implemented this in completion, and with it I wrote a zero-allocation implementation of block_on.

Edit 3: Turns out, at least for block_on, this method is slower. Additionally it adds a lot of complexity to implementations of things like executors mostly because it has to handle both the case of the waker being cloned and the waker not being cloned. So even though this could in theory provide performance advantages, I'm going to give up on this. Sorry for writing a page of text for nothing :D

@Venryx

Venryx commented Apr 7, 2022

Just wanted to mention that, as a newcomer to Rust, this behavior of futures being "silently cancelled" was very unintuitive to me, and required several hours of (confused) debugging for me to diagnose.

A simplified version of what perplexed me is below:

pub async fn test1() {
    let mtx = Mtx {};

    // === [part 1/4]: this ran
    println!("1");

    tokio::time::sleep(Duration::from_millis(5000)).await;
    
    // === [part 2/4]: but this didn't run
    println!("2");

    mtx.end();
}

pub struct Mtx {}
impl Mtx {
    pub fn end(&self) {
        // === [part 3/4]: this didn't run either
        println!("Mtx::end called.");
    }
}
impl Drop for Mtx {
    fn drop(&mut self) {
        // === [part 4/4]: and yet somehow, this *did* run, despite parts 2 and 3 above not running
        println!("Mtx::drop called.");
    }
}

Ultimately, the issue seems to have been that my GraphQL server library async-graphql was dropping/canceling the futures for a query/subscription resolver when it saw that the client disconnected or canceled its subscription.

This led to confusion for me though, because I kept looking at my logs and wondering how the mtx struct was getting dropped without the println!("2") line executing. I eventually found this pull-request by searching for "rust await drop", and what it describes helped me identify what the likely underlying cause is.

All that to say: This feature would be very useful to me! (to ensure that in the futures I write, I can rely on the lines in my functions all executing, rather than cancelling at arbitrary points mid-way; I can see the usefulness of that in some cases, but in others it can be confusing and/or cause inconsistent states!)

In the meantime, is there any way to:

  1. Somehow wrap or modify my async function to ensure it always runs to completion? (I noticed the completion crate mentioned above, but from what I can tell, you can only use that if you control the "root" async function, such that you can modify it to return CompletionFuture rather than Future?)
  2. If no. 1 isn't possible yet, is there some way to "detect when a future is cancelled mid way", and immediately log a warning? (to at least help avoid the case of trying to debug an async function's execution, only to find that the issue was due to it getting canceled in the middle)

@DemiMarie

Not sure.

Venryx added a commit to debate-map/app that referenced this pull request Apr 8, 2022
…e the write-lock acquisition was attempted prior to the read lock being dropped.

* Slightly improved formatting for error stack-traces.
* Added a new `Result::expect_lazy` function -- like `expect`, but with a closure passed rather than a string.
* Fixed a couple mistakes in lq_batch. (batching system still not working yet, but making progress)

Note: The deadlock above took a long time to find, due to my not realizing that Rust futures can be "canceled" in the middle of a function -- causing the structs in the scope to be dropped, despite the scope's statements not having all completed. See here for more info: Matthias247/rfcs#1 (comment)
@Venryx

Venryx commented Apr 9, 2022

For my first question, I've found a solution: use tokio::spawn within the async function, to create an independent future that will run to completion, and then just await the result from that "inner future".

Example:

async fn func_returning_future_that_might_be_dropped(param: Arc<MyParam>) -> MyData {
    let param_clone = param.clone();
    let result = tokio::spawn(async move {
        let my_data = get_my_data(param_clone).await;
        my_data
    }).await.unwrap();
    result
}

For this approach to work, apparently every variable captured by the inner scope must be "owned", so that tokio::spawn can take ownership of it and satisfy the 'static constraint that tokio::spawn requires.

In my case, this was not too difficult, because the data I needed passed in was already cloneable (e.g. by using Arc). If that weren't the case, restructuring the program to satisfy that constraint could work, but would be inconvenient. So I will likely raise an issue in the async-graphql repo, as this seems like a pretty serious foot-gun (that the async functions used for the GraphQL resolvers can get terminated mid-function at any .await call).

Venryx added a commit to debate-map/app that referenced this pull request Apr 9, 2022
… resolvers, to prevent the async functions from getting terminated at arbitrary `.await` points -- as would happen when the outer async-graphql future gets abruptly dropped/terminated. (see: Matthias247/rfcs#1 (comment))

* With the change above, the query-batching system finally works! (albeit slowly atm, due to lock contention)
@Matthias247
Owner Author

Hi @Venryx . Thanks for your experience report. If you find the feature useful, it might be best if you can write your experience report at https://rust-lang.zulipchat.com/#narrow/stream/187312-wg-async where the async working group is reading.

Btw: Note that your workaround is only correct if the tokio runtime enclosing the code doesn't get dropped. If it does get dropped, tasks could still fail to run to completion. This is also one of the reasons why the 'static lifetime is necessary.

If no. 1 isn't possible yet, is there some way to "detect when a future is cancelled mid way", and immediately log a warning?

This should be possible, btw. You could wrap your Future in a WarningFuture type with a custom .poll() method and a Drop implementation which emits the warning if Poll::Ready has not been returned. But you will need to make sure you don't miss wrapping any Futures with it.
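A minimal sketch of such a wrapper, assuming std-only Rust (the name WarnOnCancel and the no-op waker helper are ours, not from any crate):

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

/// Hypothetical wrapper: warns on drop if the wrapped future
/// never returned `Poll::Ready`, i.e. it was cancelled mid-way.
struct WarnOnCancel<F> {
    inner: F,
    completed: bool,
    name: &'static str,
}

impl<F> WarnOnCancel<F> {
    fn new(name: &'static str, inner: F) -> Self {
        Self { inner, completed: false, name }
    }
}

impl<F: Future> Future for WarnOnCancel<F> {
    type Output = F::Output;
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<F::Output> {
        // SAFETY: `inner` is structurally pinned; we never move it out.
        let this = unsafe { self.get_unchecked_mut() };
        let inner = unsafe { Pin::new_unchecked(&mut this.inner) };
        match inner.poll(cx) {
            Poll::Ready(v) => {
                this.completed = true;
                Poll::Ready(v)
            }
            Poll::Pending => Poll::Pending,
        }
    }
}

impl<F> Drop for WarnOnCancel<F> {
    fn drop(&mut self) {
        if !self.completed {
            eprintln!("warning: future `{}` was cancelled before completion", self.name);
        }
    }
}

/// Minimal no-op waker so the example runs without an executor.
fn noop_waker() -> Waker {
    fn clone(_: *const ()) -> RawWaker { RawWaker::new(std::ptr::null(), &VTABLE) }
    fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
}

fn main() {
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);

    // Completed future: no warning on drop.
    let mut ok = Box::pin(WarnOnCancel::new("ok", async { 7 }));
    assert_eq!(ok.as_mut().poll(&mut cx), Poll::Ready(7));

    // Never polled to completion: `drop` emits the warning.
    let abandoned = WarnOnCancel::new("abandoned", async {});
    drop(abandoned);
}
```

In an application you would wrap each resolver future, e.g. `WarnOnCancel::new("my_resolver", my_resolver())`, at the point where it is handed to the framework.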

@npuichigo

npuichigo commented Aug 29, 2023

@Matthias247 The new design has some similarity with C++'s async abstraction proposal P2300. A sender (lazy future) must run to completion with one of three signals passed to a receiver (callback):

  • set_value, which signals successful completion of the operation its execution depends on;
  • set_error, which signals that an error has happened during scheduling of the current work, executing the current work, or at some earlier point in the sender chain; and
  • set_stopped, which signals that the operation completed without succeeding (set_value) and without failing (set_error). This result is often used to indicate that the operation stopped early, typically because it was asked to do so because the result is no longer needed.

You can schedule your task multiple times (like poll), but at the end it must call set_value/set_error/set_stopped to indicate completion.
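The three signals can be rendered in Rust roughly as an enum consumed exactly once per operation (an illustrative mapping only; P2300 itself is a C++ proposal and its channels are the set_value/set_error/set_stopped customization points):

```rust
/// Rust rendering of P2300's three completion signals (illustrative only).
#[derive(Debug, PartialEq)]
enum Completion<T, E> {
    Value(T),  // set_value: the operation succeeded
    Error(E),  // set_error: scheduling or execution failed
    Stopped,   // set_stopped: ended early, without success or failure
}

/// A "receiver" is just something that consumes exactly one completion signal.
fn receive(signal: Completion<u32, &str>) -> String {
    match signal {
        Completion::Value(v) => format!("value: {v}"),
        Completion::Error(e) => format!("error: {e}"),
        Completion::Stopped => "stopped".to_string(),
    }
}

fn main() {
    // Every operation ends in exactly one of the three signals.
    assert_eq!(receive(Completion::Value(42)), "value: 42");
    assert_eq!(receive(Completion::Error("timeout")), "error: timeout");
    assert_eq!(receive(Completion::Stopped), "stopped");
}
```

Note how Stopped gives cancellation a first-class terminal state, which is exactly what run-to-completion futures need: cancelled work still reports back instead of silently disappearing.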

@DemiMarie left a comment

I strongly recommend going ahead and submitting an RFC at this point.


The proposal adds 2 new items to Rust:
- A new `Future` type, which will reside in `core::task`. In the remains of this
document we will refer to this type as `RunToCompletionFuture`.


I agree that CompletionFuture is more concise.

@npuichigo

Any update on this?
