-
-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
coop: allow disabling cooperative scheduling #3516
Conversation
With deeply nested FuturesUnordered where most leaf futures are ready, yields forced by the cooperative scheduling mechanism result in a lot of useless polls when Tokio attempts to force a return to the runtime. Indeed, it takes 32 useless polls to exit a single layer of FuturesUnordered, so with e.g. 5 layers, it takes 32^5 polls (it doesn't matter how many futures are in the FuturesUnordered instance since it'll poll 32 times, even if it's the same future being polled 32 times). This has to be done # total futures / 128 times, which can be a prohibitively high amount of polling. In our services, we've seen this result in ~50% of our CPU time being spent in `wake_by_ref()` called by `poll_proceed`. Conversely, we've not actually seen benefits from using cooperative scheduling, so turning it off seems like a natural solution. This commit adds support for doing this in the runtime builder. It targets the 2.x branch since that is the one we're currently using, but I'm happy to port this to 1.x.
/// Disable tokio's automated cooperative scheduling mechanism See this link for details: | ||
/// <https://tokio.rs/blog/2020-04-preemption> | ||
pub fn no_coop(&mut self) -> &mut Self { | ||
self.coop_budget = Budget::unconstrained(); | ||
self | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only had a quick look. Is this the only change in the public API?
In any case, it needs better docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is this and a similar change in LocalSet. I agree this needs more docs — I'm happy to add the polish here as soon as we agree on the general direction :)
Any guidance on the kind of documentation this warrants? The underlying feature didn't have that much I could link to, but let me know what you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, making sure we agree on the general direction first is probably good.
Regarding the kind of doc, it could look something like this:
Disable the automated cooperative scheduling feature.
The coop feature limits the number of operations that can be performed before a task yields to the executor. You can read more about the coop feature in the blog post Reducing tail latencies with automatic cooperative task yielding.
Example
use tokio::runtime; let rt = runtime::Builder::new_multi_thread() .enable_all() .disable_coop() .build() .unwrap();
I'll take a look at the failed CI run. I think the field isn't needed in all cfg configurations. |
I just skimmed this. I don't think I want to expose budgets / coop at the runtime level. It seems that the core issue is An alternate strategy may be providing a mod task {
async fn unconstrained<T>(f: impl Future<Output=T>) -> T { ... }
}
unconstrained(async {
// do things that are not compatible w/ Tokio's budgeting
}).await; Thoughts @tokio-rs/maintainers? |
I think @taiki-e's patch in rust-lang/futures-rs#2333 will also help a decent amount here, since it'll eliminate the 32-factor in favor of using the actual number of futures. I personally feel like the "right" way to go about this is to fix up coop so that sub-executors don't cause this problem in the first place, such as by implementing the suggestion I had in the last paragraph of #3493 (comment). That will mitigate the impact most consumers feel today from coop without exposing any external knobs. That said, having some escape hatch from coop is probably worthwhile, simply because we're unlikely to fix all issues with coop on any kind of known schedule, and people need to have running code in the meantime. I'm not sure how that should be exposed though, given that coop isn't currently exposed in the public API of tokio at all (as @carllerche is getting at). Part of the reason for that is that coop is still not something tokio is committed to long-term, and adding a method for "opting out" of it would commit us to it. Or at least commit to having it documented in the API long-term (we could always deprecate and document as "this option does nothing"). |
@jonhoo the API I propose doesn't necessarily constrain us to maintaining the budgeting behavior. If we remove it, the fn becomes a no-op. If we alter it in any way, the API ensures opt-out. It is relatively safe. Either way, we have to maintain some level of budgeting in 1.0 otherwise it would be a breaking change. |
Thanks for taking a look everyone!
Indeed — for the time being, we can't use newer versions of Tokio due to this, and we're having to run a forked version instead using this patch, which isn't ideal :(
I think that would be a good solution for egregious cases where we can't start our service at all, yes. I do however worry about missing degenerate edge cases or smaller performance issues:
I have to admit solutions that require changing the ecosystem around Tokio make me a little uneasy. We still have legacy code that's floating here or there and using 0.1 futures (this bit us before), and I'm guessing that won't receive this fix :( Now, to be clear, I do realize it's not on Tokio to maintain the whole ecosystem (or to support older versions of libraries indefinitely). However, I do think the current design for coop tends to require quite a few "ecosystem-level" fixes. In fact, that's largely where the temptation to just turn it off came from:
Now, that's where I was coming form, but it sounds like you're saying that the current design might eventually change. This is (welcome!) news to me — I hadn't seen that most recent issue you linked, most of the conversations I had or read around coop date to at least half a year ago :) Given that, I'm perfectly happy with giving coop another shot and opting out the 1 bit of code I currently know is incompatible with it via an Accordingly, I'm happy to send a PR to add |
The situation here is a little tricky. What's happening is that I'm very sympathetic to the problem — enabling coop causes your code to perform poorly, so therefore it should be turned off to mitigate the problem. At the same time, disabling coop comes with own problems — specifically, it allows starvation again, which might manifest in other (harder to debug) performance problems, especially under contention. But as you say, fixing the ecosystem takes time, and may even not happen for legacy code. So what to do? I like the |
I'm okay with backport the patch to 0.1. (Unlike before, I have permission to merge/publish, so you don't have to wait six months after the patches are filed. I can merge & release them soon:) UPDATE: Published in 0.1.31 |
This adds an opt-out for code that might not be compatible with Tokio coop (e.g. heavily nested `FuturesUnordered`). Notes to reviewer: - I added doc tests for this, relying on the fact that `with_budget` already has a bunch of test coverage. Let me know if that's not sufficient. - I removed a few docs in the coop module that reference methods that don't exist yet since this PR makes that module public. See the discussion in: tokio-rs#3516
This adds an opt-out for code that might not be compatible with Tokio coop (e.g. heavily nested `FuturesUnordered`). Notes to reviewer: - I made the coop module public for this. - I removed a few docs in said coop module that reference methods that don't exist yet. For more context, see the discussion in: tokio-rs#3516
This adds an opt-out for code that might not be compatible with Tokio coop (e.g. heavily nested `FuturesUnordered`). Notes to reviewer: - I made the coop module public for this. - I removed a few docs in said coop module that reference methods that don't exist yet. For more context, t ssee the discussion in: tokio-rs#3516
This adds an opt-out for code that might not be compatible with Tokio coop (e.g. heavily nested `FuturesUnordered`). Notes to reviewer: - I made the coop module public for this. - I removed a few docs in said coop module that reference methods that don't exist yet. For more context, see the discussion in: tokio-rs#3516
Thanks! I sent that backport PR and I see you merged it & published it.
This sounds reasonable, and it offers a way to unblock us :) I sent #3547 for this.
I have to admit that I just don't think we've really seen or expected that much benefit from coop. I understand the argument that starvation could theoretically happen, and I can totally imagine that for some use cases it could be an improvement. Unfortunately, for us so far, coop really has been the cause of problems — not their solution. I think a big part of this is that we aren't that latency sensitive (i.e. we generally care a lot more about throughput), and that we usually ensure our services run with enough headroom that we just don't commonly expect all workers to be busy. Again, I totally understand that others could have a different experience. I just wanted to clarify that at least from where we stand today, wanting to turn off coop isn't a rash decision. It really is what we think is the best option at this time (though at least as far as I'm concerned, my perspective would change if there was buy-in from the ecosystem such that things like FuturesUnordered could be explicitly aware of Tokio's desire to preempt tasks, instead of having to approximate it from seeing all their children yield). That said, I can totally understand Tokio not wanting to add support for globally disabling this, and that's fair. In the short term, I do however want to make sure we're not having to stay on a forked version of Tokio, so I'm happy to go with an |
I think this maybe gets at the heart of it. With that kind of arrangement, you generally won't run into problems, though I still don't think you're totally immune. For example (correct me if I'm wrong and this has changed since @carllerche), tokio has an optimization where workers have a "slot" for a future that is not what is currently running and that cannot be stolen. If a future sits in that slot behind a rarely-yielding future, then that future may potentially take a very long time to run next. It's true that this primarily affects latency, but it can introduce large latency spikes in your p99/p999 even if you have plenty of threads/hardware resources. My biggest concern with adding a coop opt-out is that it's very easy to reach for to "fix" problems, only to then find other performance problems like ^ which then become very hard to debug. But perhaps my worry is unwarranted and we can document this risk well enough on that disabling method. Perhaps we could also give it a name like |
That's fair. To be clear, I'm not saying we never would want to use coop, just that for the time being it's primarily caused us pain. We're definitely supportive of the concept, and task starvation is something we are broadly concerned about (I happened to chat with @carllerche earlier this week about this and noted that being able to monitor executor latency & starvation is our top # wish from Tokio, which IMO would help motivate / inform those kinds of decisions). I do want to reiterate that in the long term, we want to be running Tokio's default configuration if we can. Doing so right now isn't practical for the reasons I mentioned above, but whether we go with a runtime-level flag to opt out, or an For us, I think it really comes down to whether we should expect the mechanism to evolve over time: On the one hand, we would not want to use the mechanism as it exists right now because we'd be concerned about code that doesn't work well with it, either because that code is slow but not so slow that we see it, or because it is noticeably slow and we end up spending serious time debugging it. In this case, we really would want a runtime-level opt-out. On the other hand, if Tokio eventually provides a way to efficiently detect those problems (I would guess count of forced-yields before returning to runtime would be one), or if the design evolves in a way that makes them less likely or eliminates them (like your suggestion to wake things up only once control has returned to the runtime, or by upstreaming the notion of preemption into As a quick aside, adding a runtime-level flag largely amounts to making the current However, unfortunately for us, we can't do that... Given how we pull OSS crates at $work, we can't remove the feature flag without opting out the entire company, and that's no something I want to do (other folks in the company might have a totally different use case, and at the end of the day it's not up to me what the default Tokio experience should be!) |
In some sense, I'm not really worried about you doing the right thing in your code base. I'm worried about the next user of tokio that comes along, finds some problem that they think is due to coop, throw in a
I think that's technically true, but yeah, as you observe, in practice it is very difficult to remove a feature since cargo is very liberal about adding them back in 😅 |
I guess now that #3547 is merged, we should just close this? |
I think that's reasonable. We could add a |
Okay. I am closing this for now. If you want to discuss adding this feature further, please go ahead and open a new issue. |
With deeply nested FuturesUnordered where most leaf futures are ready,
yields forced by the cooperative scheduling mechanism result in a lot of
useless polls when Tokio attempts to force a return to the runtime.
Indeed, it takes 32 useless polls to exit a single layer of
FuturesUnordered, so with e.g. 5 layers, it takes 32^5 polls (it doesn't
matter how many futures are in the FuturesUnordered instance since it'll
poll 32 times, even if it's the same future being polled 32 times). This
has to be done # total futures / 128 times, which can be a prohibitively
high amount of polling.
In our services, we've seen this result in ~50% of our CPU time being
spent in
wake_by_ref()
called bypoll_proceed
. Conversely, we've notactually seen benefits from using cooperative scheduling, so turning it
off seems like a natural solution (in earlier conversations on other
issues related to cooperative scheduling, it was mentioned that the
Tokio team felt making it configurable could be good).
This commit adds support for doing this in the runtime builder. It
targets the 2.x branch since that is the one we're currently using, but
I'm happy to port this to 1.x.