-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix large futures causing stack overflows #10033
Conversation
I looked back through the project history, and the SessionContext was previously wrapped in either an I think this change fixed a lot of issues with locking and was a good change in general. However, moving the final context off of the stack with Box or similar would be good to look into. |
@tustvold could you please weigh in on this? Personally I think going back to If we decide on a direction I'd be willing to contribute the change. Don't want us having to skip another major release because of this issue. |
Box makes sense to me, using Arc will run into issues with mutation |
a8f22dd
to
249e996
Compare
@andygrove @tustvold this PR is ready now - would really appreciate your reviews. Please read the updated PR description - I found and fixed another big stack space offender, more than halving the stack usage. |
Thank you for working on this @sergiimk. I looked through the changes and they all look good to me. If this issue is blocking your ability to use the 37.0.0 release, you could contribute this fix to the 37.1.0 release which @alamb is working on and has indicated will be released at the end of this week. See #9904 |
Thanks for the review! @alamb if you think this fix is suitable for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @sergiimk -- this makes sense to me. Since it has no public API changes, I think it would be a good candidate for backporting in #9904 and if you prepared a backport PR I think we could include it in 37.1.0.
Thanks also @devinjdangelo for the review and the ping
249e996
to
0a2b633
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks @sergiimk
Box::pin(async move { self.create_external_table(&cmd).await }) | ||
as std::pin::Pin<Box<dyn futures::Future<Output = _> + Send>> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sergiimk I have to admit that I don't fully comprehend the changes here. Do you have any recommendations for resources for learning more about this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be honest I didn't have time to generate disassembly to confirm exactly what Rust+LLVM is doing, so take my conclusions with a grain of salt.
With large_futures
lint enabled you unfortunately get warnings for top-level await
s, not inner futures where the problem starts. So if you try the lint before this fix you'll see thousands of warnings...
I started navigating the calls up the chain to see the similarities between them, and this lead me to this function. As soon as I Box
ed execute_logical_plan
call - all warnings disappeared.
Coming from languages like C/C++ my understanding is that a function will typically increment stack pointer to fit all of its local variables (although afaik this is not strictly specified and left up to compiler). My intuition thus was that every await
in this function creates a local variable to store Future's state and this makes stack frame really big.
I started Box
ing individual calls and saw the size of futures in lint errors progressively go down - this seemed consistent with my theory.
After that I though that instead of boxing and awaiting individual futures I could make different branches return a Box<dyn Future>
and await for in in one place. Less verbose code at the cost of dynamic dispatch.
Box::pin
is of course necessary because tokio cannot tolerate Future state to be moved between awaits.
And async move { self.create_external_table(&cmd).await }
is an unfortunate ugliness to make cmd
live as long as the future. This is not necessary anywhere else because other calls take elements of the plan by value, only create_external_table
takes a reference (and I didn't want to modify public API).
I did second-guess this change a lot ... as I expected Rust to allocate enough stack space not for all futures, but for the biggest future out of all, as only one of them will actually be called. But then I don't know why would memory go down progressively with every future that I boxed.
So based purely on clippy stats this change did help a lot... but I'd be glad to be proven incorrect if my conclusions were wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did second-guess this change a lot ... as I expected Rust to allocate enough stack space not for all futures, but for the biggest future out of all, as only one of them will actually be called. But then I don't know why would memory go down progressively with every future that I boxed.
What I have seen rust do (in debug builds only) is allocate stack space for each local variable in the function. I speculate that this is to make debugging easier as each variable has a unique space in the stack and won't get over written with values from other variables depending on where it is.
When I have worked with C/C++ in the past (gcc mostly) the slots on the stack frame are reused among local variables which makes debugging chalening (as sometimes several variables in the debugger look like they change even when only one is "live" at any point
Thanks again @sergiimk |
Which issue does this PR close?
Closes #9893.
Rationale for this change
Datafusion since v37 started crashing with stack overflows. Based on issues linked to the one above - this problem existed previously but was dealt with by increasing the default stack size (which is not something our project can do).
The intent of the PR is:
What changes are included in this PR?
1] Enables
clippy::pedantic::large_futures
lint2] Traces the large allocations to
DataFrame::session_state
and replace it withBox<SessionState>
.size_of<SessionState>: 1648
size_of<DataFrame>: 2064
size_of<DataFrame>: 424
3] Traces another big offender to
SessionContext::execute_logical_plan
that demultiplexes calls to a dozen of async futures, allocating the space for all of their states within function's stack frame.These changes combined reduce the size of futures from 15-20KiB to ~6KiB.
Are these changes tested?
All is covered by existing tests.
Are there any user-facing changes?
PR does not change any public APIs.