Stack space in async closures is not shared between branches in debug builds #72247
Comments
Is this really limited to async code?
I don't know. There's not really any equivalent to f2 in non-async code, so it's not as easy to measure.
From a quick test on a non-async function with a bunch of trivial branches, the stack space is constant regardless of how many branches I add.
I believe the problem here is panic handling. The generator lowering now points every cleanup (panic) edge at a single block that poisons the generator before resuming the unwind. This means any MIR of the form
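(sketched here as hypothetical pseudo-MIR, with made-up block and local names):

```
bb1: {
    // A call that can unwind, with its own dedicated cleanup path.
    _3 = may_panic() -> [return: bb2, unwind: bb_cleanup1];
}

bb_cleanup1 (cleanup): {
    resume;
}
```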
becomes
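(again a hypothetical sketch):

```
bb1: {
    // The unwind edge is redirected to one block shared by every call site.
    _3 = may_panic() -> [return: bb2, unwind: bb_poison];
}

bb_poison (cleanup): {
    // Mark the generator as poisoned before resuming the unwind.
    discriminant(_self) = POISONED;
    resume;
}
```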
So now every storage location that was live across something that could theoretically panic is live at the same time on the poisoned block, which means none of them can share stack space. Thus the size of the stack frame blows up enormously. This is also consistent with my observation that adding a match branch that calls no functions but merely returns a constant does not increase the stack size. So it seems to me that, in essence, #61373 has been regressed here. Maybe @tmandry has some idea what's going on?
And, to be clear, I simplified this example from some async code with a 100KB stack frame, so this is (at least a part of) a rather serious problem.
All generators in the posted code have a state size of 2, FWIW.
Sorry if I wasn't clear: the values here are not live across yield points, so they don't take up space in the generator itself, just on the C stack.
@khuey your analysis makes sense to me – the locals are all considered StorageLive when they hit the cleanup block, and that means they must all have stack space reserved. LLVM is smart enough to optimize the stack space allocation in release, but apparently doesn't do such an analysis in debug builds. cc #68622, here's a case where it matters. For now, I think we can revert the "collapsing" of cleanup blocks that #69814 did. @jonas-schievink, are there reasons not to, besides possible compile-time slowness?
All cleanup edges need to poison the generator for soundness; that was the intention behind that change.
Oh I see, sorry, I glossed over that. I guess we can add the poison statement to every block that terminates in `Resume`.
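As a sketch (using the same hypothetical pseudo-MIR notation as above), each call site would keep its own cleanup block and simply gain the poison statement:

```
bb_cleanup1 (cleanup): {
    discriminant(_self) = POISONED;
    resume;
}

bb_cleanup2 (cleanup): {
    discriminant(_self) = POISONED;
    resume;
}
```

Every unwind path still poisons the generator, but locals that are live across different calls are no longer all live at one shared block.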
So, I did some experimenting here. It turns out that this problem already existed before #61373. Even if we emit storage markers to LLVM, it doesn't seem to make use of them when optimizations are disabled. I have a work-in-progress branch that does this. Is there some attribute we can emit on the resume function that will cause LLVM to optimize the storage of that function?
Tagging @nikic who might know?
I think this has been fixed at some point. I don't see this on the test case I provided anymore.
Consider:
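A rough sketch of the shape being described (the bodies of f1, f2, and f3 here are hypothetical; futures::executor::block_on from the futures crate is assumed only to drive them):

```rust
// f2 is trivial; its frame should stay small.
async fn f2(x: u32) -> u32 {
    x + 1
}

// f1 and f3 are branch-heavy: each match arm awaits f2, and in a debug
// build each arm's temporaries end up with their own stack slots.
async fn f1(x: u32) -> u32 {
    match x {
        0 => f2(0).await,
        1 => f2(1).await,
        2 => f2(2).await,
        3 => f2(3).await,
        4 => f2(4).await,
        _ => f2(x).await,
    }
}

async fn f3(x: u32) -> u32 {
    match x {
        0 => f2(10).await,
        1 => f2(11).await,
        2 => f2(12).await,
        _ => f2(x).await,
    }
}

fn main() {
    // Drive the futures so the resume functions are codegen'd, then compare
    // the stack frame sizes of f1::{{closure}}, f2::{{closure}}, and
    // f3::{{closure}} in a debugger.
    let a = futures::executor::block_on(f1(3));
    let b = futures::executor::block_on(f3(2));
    println!("{} {}", a, b);
}
```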
If you build this (on 2020-05-14, a74d186) you'll see that the stack frames for f1::{{closure}} and f3::{{closure}} are bigger than those for the f2::{{closure}}s. If you add or remove a branch you'll also see that the sizes of the f1::{{closure}} and f3::{{closure}} stack frames vary accordingly.
This is frustrating because it means that branch-heavy code blows out the stack much faster in a debug build than in an opt build when recursing. I saw an order-of-magnitude difference in stack size in one of my functions between debug and opt because of this.