-
Notifications
You must be signed in to change notification settings - Fork 465
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
materialized: attempt to produce nice backtraces on stack overflow #5549
Conversation
// squabble over a few megabytes. | ||
const STACK_SIZE: usize = 2 << 20; | ||
|
||
// x86_64 requires 16-byte alignment. Hopefully other platforms don't |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
arm64 requires 16-byte as well AFAICT from a quick googling; you might want to mention that because my first thought here was "will this break mz-on-arm?"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack, will do on the next pass through CI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed this one up too!
// Allocate a stack. | ||
let buf_layout = | ||
Layout::from_size_align(STACK_SIZE, STACK_ALIGN).expect("layout known to be valid"); | ||
let buf = unsafe { alloc::alloc(buf_layout) }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New coding standard I just invented: you must leave a comment about why any usage of unsafe
is sound. (Feel free to push back if you don't think it's a good rule, but seems reasonable to me).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I do try to follow that in general—I'm just not sure what it would say in this function. Like, there aren't invariants that must be upheld. We're just calling unsafe C functions. I can add comments like // SAFETY: calling a C function in the manner it which it is documented to be called
but that doesn't seem very useful. Happy to take suggestions on what would be more valuable here though!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
alloc::alloc
does in fact have documented requirements: "SAFETY: See GlobalAlloc::alloc."
Which points to...
This function is unsafe because undefined behavior can result if the caller does not ensure that layout has non-zero size.
(Extension subtraits might provide more specific bounds on behavior, e.g., guarantee a sentinel address or a null pointer in response to a zero-size allocation request.)
The allocated block of memory may or may not be initialized.
So I think you can just mention that (1) you have ensured that the size is non-zero, and (2) that nobody is reading the uninitialized value (or at any rate, not doing so before it gets passed to C World and outside the purview of Rust's safety guarantees anyway).
Given your response, I think you might have instead been thinking of your later uses of unsafe
, in the calls to sigaltstack
and sigaction
. For these use your judgment, you could write down your understanding of when POSIX/linux guarantees that calling these functions is allowed, or just a link to man pages, or nothing if you really think there's not anything useful to be said.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, took another stab at it.
4995236
to
4576289
Compare
// Chosen to match the default Rust thread stack size of 2MiB. Probably | ||
// overkill, but we'd much rather have backtraces on stack overflow than | ||
// squabble over a few megabytes. | ||
const STACK_SIZE: usize = 2 << 20; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You probably want to use libc::SIGSTKSZ
instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yeah, that's a good point you bring up. The thing is that libc::SIGSTKSZ
makes me a little nervous because I'm not sure that Backtrace::backtrace
is equipped to run on an 8KiB stack. I mean, it should be, because a few stack frames should not take up 8KIB... but I figured it'd be better to be safe than sorry. Don't want some crazy change to how gimli works down the road causing a stack overflow while we're trying to handle a stack overflow.
Happy to be convinced otherwise, but for now I just added a comment that briefly explains this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm! Left one minor comment - there's an existing constant for sigstack sizes you may want to use.
Rust produces bad error messages on stack overflow, like "thread 'foo' has overflowed its stack" which provides very little insight into where the recursion that caused the stack to overflow occurred. See rust-lang/rust#51405 for details. This commit adds a SIGSEGV handler that attempts to print a backtrace, following the approach in the backtrace-on-stack-overflow crate. I copied the code from that crate into Materialize and tweaked it because it's a very small amount of code that we'll likely need to modify, and I wanted to improve its error handling. In my manual testing this produces a nice backtrace when Materialize overflows its stack.
4576289
to
f11f749
Compare
Rust produces bad error messages on stack overflow, like
which provides very little insight into where the recursion that caused
the stack to overflow occurred.
See rust-lang/rust#51405 for details.
This commit adds a SIGSEGV handler that attempts to print a backtrace,
following the approach in the backtrace-on-stack-overflow crate. I
copied the code from that crate into Materialize and tweaked it because
it's a very small amount of code that we'll likely need to modify, and I
wanted to improve its error handling.
In my manual testing this produces a nice backtrace when Materialize
overflows its stack.
This change is