Footgun lurking in FuturesUnordered and other concurrency-enabling streams #2387
Comments
It seems to me that the right solution is to spawn the futures, and the problem involves the |
I'm told that https://tokio.rs/blog/2020-04-preemption is relevant. |
The Tokio preemption thing is related, but does not fix this issue. Preemption in Tokio is designed to ensure that a Task returns to the executor every so often (and thus can be rescheduled so that another Task can run); it does not help with starvation inside a Task, since there's nothing that can return to polling the other futures in a Task if the task is only interested in polling a different future. In this case, even if every future in the program takes part in Tokio preemption, you can still starve the futures in a sub-scheduler like

I am strongly inclined towards the docs fix for this (making sure that the non-obvious problem is called out), because I can't see any way to keep the parent future moving that doesn't fall foul of the completion futures issue. But note that if there was a way to have a |
Is there a reason why you can't make the futures that you need to spawn |
I've not dug into the technical details but I'll note that this would make a great status quo issue, so I opened rust-lang/wg-async#131 |
For now, most of our futures are

It's more that we keep finding different variations on this surprise in the Mononoke codebase (one with Tokio semaphores, one with a C++ async FFI, one with just Rust code), which implies that it's not completely obvious that this starvation issue can happen. |
This is also somewhat related to #2053. |
(NOT A CONTRIBUTION) I know this is an old issue but I think there's another perspective on this than what I've seen on the thread so far: the problem is not with FuturesUnordered, the problem is with
The use of the buffered abstraction perfectly makes these This is why There's real issues here about how to make this less foot-gunny, but it seems very odd to me to jump to the idea that these should have been spawned, when |
@withoutboats How would I use This wasn't an unusual pattern in the system I reduced this from (I no longer work there, so can't get you real examples); we had something equivalent to |
(NOT A CONTRIBUTION) I see. That is a more complicated pattern than I initially thought. But the solution is to buffer the results (rather than the futures) and then concurrently run a single do_giant_work with filling that buffer up to the limit. I'm not sure off the top of my head how the combinators on StreamExt give affordances for that, but I still don't think the solution is necessarily spawning new tasks. And I think the open question is how to adjust the APIs for streams to give better affordances for expressing exactly the concurrency you want, which I would agree they don't do super well right now. Basically, you only need to spawn new tasks if you're trying to achieve parallelism, not concurrency. If do_giant_work took a lot of CPU time and not just clock time you should do something like spawn_blocking, in which case the data it takes would need to be

EDIT: A channel is probably the right primitive for buffering the results, then you need to select between a |
The pattern of concurrency I wanted was to have up to 5
The very root of the problem is that If we could design a way to say something like This then functions like the |
(NOT A CONTRIBUTION)
No, this is not right. Buffered doesn't store the results of the futures as they complete. That's exactly the problem this issue is describing, because that's why you can't keep driving those futures concurrently with

If you want that, you need an additional queue of results somewhere, that's what the channel would enable. But Buffered has no internal buffer of results - it's just a buffer of the futures themselves. Probably it is poorly named for this reason. |
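The "additional queue of results" idea can be sketched with the standard library alone. This is a toy model under stated assumptions, not how `Buffered` or any `futures` combinator is implemented: `Steps`, `fill`, and `run_demo` are hypothetical names, a `VecDeque` stands in for the channel, and a hand-written loop stands in for joining the giant work with the stream. The point it tries to show is only that the in-flight futures keep being polled, and their results keep being queued, while the slow work is still pending.

```rust
use std::collections::VecDeque;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; this demo drives all polling by hand.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical toy future: Pending for `remaining` polls, then Ready(value).
struct Steps {
    remaining: u32,
    value: u32,
}
impl Future for Steps {
    type Output = u32;
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<u32> {
        if self.remaining == 0 {
            Poll::Ready(self.value)
        } else {
            self.remaining -= 1;
            Poll::Pending
        }
    }
}

/// Poll every in-flight future once and move finished outputs into `results`.
/// `results` plays the role of the extra queue (or channel) of results.
fn fill(in_flight: &mut Vec<Steps>, results: &mut VecDeque<u32>, cx: &mut Context<'_>) {
    let mut i = 0;
    while i < in_flight.len() {
        match Pin::new(&mut in_flight[i]).poll(cx) {
            Poll::Ready(v) => {
                results.push_back(v);
                in_flight.remove(i);
            }
            Poll::Pending => i += 1,
        }
    }
}

/// Returns (results already queued, futures still in flight) at the moment
/// the first "giant work" item finishes.
fn run_demo() -> (usize, usize) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);

    // Five stand-ins for the buffered futures, needing 1..=5 extra polls each.
    let mut in_flight: Vec<Steps> = (1..=5).map(|n| Steps { remaining: n, value: n }).collect();
    let mut results = VecDeque::new();

    // Wait for the first result, as the consuming loop would.
    while results.is_empty() {
        fill(&mut in_flight, &mut results, &mut cx);
    }
    let first = results.pop_front().unwrap();

    // Stand-in for the slow work on `first`: ten polls before it is done.
    // Every time it is Pending, the in-flight set is polled too (a manual join).
    let mut giant = Steps { remaining: 10, value: first };
    while Pin::new(&mut giant).poll(&mut cx).is_pending() {
        fill(&mut in_flight, &mut results, &mut cx);
    }
    (results.len(), in_flight.len())
}

fn main() {
    let (queued, pending) = run_demo();
    println!("queued results: {queued}, still in flight: {pending}");
}
```

In real code the manual loop would be replaced by whatever join or select primitive the executor offers, and the queue by a bounded channel so that the number of buffered results stays limited.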
BinaryHeap to allow them to be reordered if they complete out-of-order. If we were able to poll it telling it to hang onto its results, it could fill that BinaryHeap with results until it had no futures in flight.
And the thing that makes Perhaps if |
(NOT A CONTRIBUTION)
Sorry, you're right, I was thinking of BufferUnordered. But it only fills that buffer as futures complete out of order; once the first future finishes it will return it and not keep it in queue. There's no way to poll it without taking the first result to finish out of the queue. But I think overall, we're saying the same things back and forth to one another:
Because this is exactly what I mean. That's the affordance the API is missing. I really responded to this issue because I agree that the current stream API is "footgunny" (and so is select for that matter) but I strongly don't think the problem is the lack of parallel scoped tasks - I'm really responding to @pcwalton's comment about this being a completion future problem more than anything. I think a structured concurrency scoped task API (without parallelism, like moro) might be a good tool for a better API though. Then you could spawn the stream of do_selects and the do_giant_work loop. This would need to be paired with a rethought set of stream combinators, probably. |
An alternative, simpler and more common reproduction that makes it easy to show what is happening:

```rust
use std::time::{Duration, Instant};

use async_io::{block_on, Timer};
use async_stream::stream;
use futures::StreamExt;
use futures_lite::future::yield_now;

fn main() {
    let result = block_on(async {
        stream! {
            loop {
                yield async {
                    let start = Instant::now();
                    yield_now().await; // the future should be woken immediately
                    Instant::now().duration_since(start) // so this should be a very short time
                };
            }
        }
        .buffered(5)
        .take(5)
        .then(|d| async move {
            // Slow per-item step; while it is pending, the buffered futures are not polled.
            Timer::after(Duration::from_millis(500)).await;
            d
        })
        .collect::<Vec<_>>()
        .await
    });
    dbg!(result);
    // [examples/buffered_stream.rs:35:5] result = [
    //     612.875µs,
    //     501.832917ms,
    //     1.002531209s,
    //     1.503673417s,
    //     2.004864417s, <---- ???
    // ]
}
```
there could be possible improvements by not using join_all, due to the issue mentioned in rust-lang/futures-rs#2387. However, I doubt there is much speedup to be extracted, since most of the bottleneck are the requests.
I've run into the same problem, where I mistakenly expected a rust

Fundamentally, the Stream trait seems to be missing a method for polling upstream buffers (e.g. poll_upstream or poll_buffered), so that outer structs (downstream) can call it on their inner struct (upstream) when they aren't ready for another item through poll_next. That can then propagate all the way upstream if it is implemented in all the Stream implementations in the chain. Introducing a required method on the trait like this would be a breaking change, but it could first be introduced as a method with a default implementation, although making it required would be needed to actually avoid this being a footgun and make it behave more like its namesake.

After that, it would be useful to document the fact that the stream doesn't work like a pipeline without any buffer between steps. E.g. two steps chained together that each await for 100ms would benefit from a buffer of size 1 between them so that they can operate independently on different items. The ability to poll upstream buffers could allow a convenience library to do this automatically, but a buffer wouldn't be needed between every step: when multiple quick steps are followed by a slow step, just one preceding the slow step would be useful. |
We've found a nasty footgun when we use `FuturesUnordered` (or `buffered` etc) to get concurrency from a set of futures.

Because `FuturesUnordered` only polls its contents when it is polled, it is possible for futures lurking in the queue to be surprised by a long poll, even though no individual future spends a long time in `poll()`. This causes issues in two cases:

1. `while let Some(res) = stream.next().await` and then do significant wall-clock time inside the loop (even if very little CPU time is involved because you're awaiting another network service), you can hit the external system's timeouts and fail unexpectedly.
2. `FuturesUnordered` owning all the semaphores, while having an item in a `.for_each()` block after `buffer_unordered()` requiring a semaphore.

https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=f58e77ba077b40eba40636a4e32b5710 shows the effect. Naïvely, you'd expect all the 10 to 20ms sleep futures to complete in under 40ms, and the 100ms sleep futures to take 100ms to 200ms. However, you can see that the sleep threads complete in the timescale expected and send the wakeup to the future that spawned them, but some of the short async `sleep_for` futures take over 100ms to complete, because while the thread signals them to wake up, the loop is `.await`ing a long sleep future and does not get round to polling the stream again for some time.

We've found this in practice with things where the loop body is "nice" in the sense that it doesn't run for very long inside its `poll` function, but the total time spent in the loop body is large. The futures being polled by `FuturesUnordered` do:

and the main work looks like:
`do_giant_work` can take 20 seconds wall clock time for big work items. It's possible for `get_conn` to open the connection (which has a 10 second idle timeout) for each Future in the `buffered` set, send the first handshake packet, and then return `Poll::Pending` as it waits for the reply. When the first of the 5 in the `buffered` set returns `Poll::Ready(item)`, the code then runs `do_giant_work` which takes 20 seconds. While `do_giant_work` is in control, nothing re-polls the `buffered` set of Futures, and so the idle timeout kicks in server-side, and all of the 4 open connections get dropped because we've opened a connection and then not completed the handshake.

We can mitigate the problem by using `spawn_with_handle` to ensure that the `do_select` work happens whenever the `do_giant_work` Future awaits something, but this behaviour has surprised my team more than once (despite enough experience to diagnose this after the fact).

I'm not sure that a perfect technical solution is possible; the issue is that `FuturesUnordered` is a sub-executor driven by the main executor, and if not polled, it can't poll its set of pending futures. Meanwhile, the external code is under no obligation to poll the `FuturesUnordered` in a timely fashion. Spawning the futures before putting them in the sub-executor works because the main executor then drives them, and the sub-executor is merely picking up final results, but futures have to be `'static` lifetime to be spawned.

I see two routes from here to a better place:
1. `select!` and `select_biased!` macros that hit a similar problem.
2. `FuturesUnordered` et al so that the futures are polled by the main executor directly, even when the sub-executor is not being polled. I have no idea how to do this, though.
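The "sub-executor that only polls its contents when it is polled" behaviour described in this issue can be modeled with the standard library alone. The sketch below is a toy under stated assumptions, not the real `FuturesUnordered`: `ToySet`, `Heartbeat`, `Steps`, and `run_demo` are hypothetical names, and polling is driven by hand with a no-op waker instead of a real executor. It counts how often a child future is polled while the caller polls the set, and how often it is polled while the caller is busy driving a separate slow future.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; this demo drives all polling by hand.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical child future: counts its polls and never completes.
struct Heartbeat {
    polls: Rc<Cell<u32>>,
}
impl Future for Heartbeat {
    type Output = ();
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        self.polls.set(self.polls.get() + 1);
        Poll::Pending
    }
}

/// Hypothetical slow future standing in for the loop body's long await.
struct Steps {
    remaining: u32,
}
impl Future for Steps {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.remaining == 0 {
            Poll::Ready(())
        } else {
            self.remaining -= 1;
            Poll::Pending
        }
    }
}

/// Toy sub-executor: children make progress only inside `poll_all`.
struct ToySet {
    children: Vec<Pin<Box<dyn Future<Output = ()>>>>,
}
impl ToySet {
    fn poll_all(&mut self, cx: &mut Context<'_>) {
        for child in self.children.iter_mut() {
            let _ = child.as_mut().poll(cx);
        }
    }
}

/// Returns (child polls while the set was polled, child polls while the
/// caller was driving the slow future instead).
fn run_demo() -> (u32, u32) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);

    let polls = Rc::new(Cell::new(0));
    let child: Pin<Box<dyn Future<Output = ()>>> = Box::pin(Heartbeat { polls: polls.clone() });
    let mut set = ToySet { children: vec![child] };

    // Phase 1: the caller polls the set three times, so the child is polled three times.
    for _ in 0..3 {
        set.poll_all(&mut cx);
    }
    let while_polled = polls.get();

    // Phase 2: the caller drives only the slow future; nothing polls the set.
    let mut slow = Steps { remaining: 20 };
    while Pin::new(&mut slow).poll(&mut cx).is_pending() {}
    let while_busy = polls.get() - while_polled;

    (while_polled, while_busy)
}

fn main() {
    let (while_polled, while_busy) = run_demo();
    println!("child polls while set was polled: {while_polled}");
    println!("child polls while caller was busy: {while_busy}");
}
```

However long the second phase lasts, the child is never polled during it, which is the model's analogue of the connection handshakes timing out while the loop body runs.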