Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

std: Depend on backtrace crate from crates.io #60852

Merged
merged 1 commit into from
May 26, 2019

Conversation

alexcrichton
Copy link
Member

This commit removes all in-tree support for generating backtraces in
favor of depending on the backtrace crate on crates.io. This resolves
a very longstanding piece of duplication where the standard library has
long contained the ability to generate a backtrace on panics, but the
code was later extracted and duplicated on crates.io with the
backtrace crate. Since that fork each implementation has seen various
improvements one way or another, but typically backtrace-the-crate has
lagged behind libstd in one way or another.

The goal here is to remove this duplication of a fairly critical piece
of code and ensure that there's only one source of truth for generating
backtraces between the standard library and the crate on crates.io.
Recently I've been working to bring the backtrace crate on crates.io
up to speed with the support in the standard library which includes:

  • Support for StackWalkEx on MSVC to recover inline frames with
    debuginfo.
  • Using libbacktrace by default on MinGW targets.
  • Supporting libbacktrace on OSX as an option.
  • Ensuring all the requisite support in backtrace-the-crate compiles
    with #![no_std].
  • Updating the libbacktrace implementation in backtrace-the-crate to
    initialize the global state with the correct filename where necessary.

After reviewing the code in libstd the backtrace crate should be at
exact feature parity with libstd today. The backtraces generated should
have the same symbols and same number of frames in general, and there's
not known divergence from libstd currently.

Note that one major difference between libstd's backtrace support and
the backtrace crate is that on OSX the crates.io crate enables the
coresymbolication feature by default. This feature, however, uses
private internal APIs that aren't published for OSX. While they provide
more accurate backtraces this isn't appropriate for libstd distributed
as a binary, so libstd's dependency on the backtrace crate explicitly
disables this feature and forces OSX to use libbacktrace as a
symbolication strategy.

The long-term goal of this refactoring is to eventually move us towards
a world where we can drop libbacktrace entirely and simply use Gimli
and the surrounding crates for backtrace support. That's still aways off
but hopefully will much more easily enabled by having the source of
truth for backtraces live in crates.io!

Procedurally if we go forward with this I'd like to transfer the
backtrace-rs crate to the rust-lang GitHub organization as well, but I
figured I'd hold off on that until we get closer to merging.

@rust-highfive

This comment has been minimized.

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 15, 2019
@alexcrichton
Copy link
Member Author

r? @sfackler

@sfackler I figured you'd probably be good for taking a look at this and perhaps scraping it in more detail, but if you're busy I'm happy to find a different reviewer.

@rust-highfive

This comment has been minimized.

@rust-highfive

This comment has been minimized.

@rust-highfive

This comment has been minimized.

@rust-highfive

This comment has been minimized.

@bors

This comment has been minimized.

@alexcrichton
Copy link
Member Author

@sfackler would you be up for doing a quick overview of the backtrace crate itself? That's where sort of the real meat of this PR is, especially in all the changes leading up to getting parity with libstd (msvc inlining support, libbacktrace on OSX/MinGW, etc)

@sfackler
Copy link
Member

Sure - should I be looking at the change in the include-in-std branch?

Cargo.toml Outdated Show resolved Hide resolved
@alexcrichton
Copy link
Member Author

@sfackler there's currently only one commit in the branch used here, otherwise taking a look at the master branch should suffice!

@sfackler
Copy link
Member

I took a look through the backtrace crate and have a few minor notes here and there:

I don't think this ever panics:
https://github.com/alexcrichton/backtrace-rs/blob/master/src/symbolize/coresymbolication.rs#L149

Cleanup should be done with a destructor in case the closure panics:
https://github.com/alexcrichton/backtrace-rs/blob/master/src/symbolize/coresymbolication.rs#L249

unix cfg is a superset of target_os = "linux":
https://github.com/alexcrichton/backtrace-rs/blob/master/src/symbolize/mod.rs#L393

ip_lo/ip_hi shenanigans could potentially be replaced by resolving the function information of the address and having more accurate bounds:
https://github.com/alexcrichton/backtrace-rs/blob/master/src/capture.rs#L158

Might be easier to use LoadLibraryA to avoid the [u16] stuff:
https://github.com/alexcrichton/backtrace-rs/blob/master/src/dbghelp.rs#L109

This is intended to force an abort via a double panic? Seems like that should at least be documented, and maybe replaced by an explicit abort:
https://github.com/alexcrichton/backtrace-rs/blob/master/src/lib.rs#L129

@alexcrichton
Copy link
Member Author

Ok thanks for taking a look! I've attempted to address all that in these commits, and to clarify the intention is indeed to abort the process due to the usage of C callbacks and the inability to unwind through them, and I figured that panic! is a lightweight-ish way to guarantee an abort which doesn't involve accidentally picking up more deps or having std/core split code.

@alexcrichton
Copy link
Member Author

Ok I've done the publication to crates.io and everything should be ready for an r+ here, although before doing so I wanted to make sure I called out the differences between today's backtraces and this PR. The only strategy that shows differences is libbacktrace, and today libstd does:

  • First, call backtrace_syminfo for each backtrace address. This consults the dynamic symbol table. Print this.
  • Next, call backtrace_pcinfo to attempt to get file/line information. If anything comes out, just print the file/line number.

After this PR, the difference is that during each invocation fo backtrace_pcinfo we prefer the symbol name yielded to backtrace_pcinfo over the one we found in backtrace_syminfo. We also print out a frame per invocation of backtrace_pcinfo (as that's what inline frames are basically).

Given a program like this:

fn main() {
    panic!();
}

today this yields:

thread 'main' panicked at 'explicit panic', foo.rs:2:5
stack backtrace:
   0: std::sys::unix::backtrace::tracing::imp::unwind_backtrace
             at src/libstd/sys/unix/backtrace/tracing/gcc_s.rs:39
   1: std::sys_common::backtrace::_print
             at src/libstd/sys_common/backtrace.rs:70
   2: std::panicking::default_hook::{{closure}}
             at src/libstd/sys_common/backtrace.rs:58
             at src/libstd/panicking.rs:200
   3: std::panicking::default_hook
             at src/libstd/panicking.rs:215
   4: std::panicking::rust_panic_with_hook
             at src/libstd/panicking.rs:478
   5: std::panicking::begin_panic
             at /rustc/6c2484dc3c532c052f159264e970278d8b77cdc9/src/libstd/panicking.rs:412
   6: foo::main
             at ./foo.rs:2
   7: std::rt::lang_start::{{closure}}
             at /rustc/6c2484dc3c532c052f159264e970278d8b77cdc9/src/libstd/rt.rs:64
   8: std::panicking::try::do_call
             at src/libstd/rt.rs:49
             at src/libstd/panicking.rs:297
   9: __rust_maybe_catch_panic
             at src/libpanic_unwind/lib.rs:87
  10: std::rt::lang_start_internal
             at src/libstd/panicking.rs:276
             at src/libstd/panic.rs:388
             at src/libstd/rt.rs:48
  11: std::rt::lang_start
             at /rustc/6c2484dc3c532c052f159264e970278d8b77cdc9/src/libstd/rt.rs:64
  12: main
  13: __libc_start_main
14: _start

but with this PR we get

thread 'main' panicked at 'explicit panic', foo.rs:2:5
stack backtrace:
   0: trace
             at /home/alex/code/backtrace-rs/src/backtrace/libunwind.rs:97
   1: trace_unsynchronized<closure>
             at /home/alex/code/backtrace-rs/src/backtrace/mod.rs:66
   2: _print
             at src/libstd/sys_common/backtrace.rs:47
   3: print
             at src/libstd/sys_common/backtrace.rs:36
   4: {{closure}}
             at src/libstd/panicking.rs:197
   5: default_hook
             at src/libstd/panicking.rs:211
   6: rust_panic_with_hook
             at src/libstd/panicking.rs:474
   7: std::panicking::begin_panic
             at ./src/libstd/panicking.rs:408
   8: foo::main
             at ./foo.rs:2
   9: std::rt::lang_start::{{closure}}
             at ./src/libstd/rt.rs:64
  10: {{closure}}
             at src/libstd/rt.rs:49
  11: do_call<closure,i32>
             at src/libstd/panicking.rs:293
  12: __rust_maybe_catch_panic
             at src/libpanic_unwind/lib.rs:85
  13: try<i32,closure>
             at src/libstd/panicking.rs:272
  14: catch_unwind<closure,i32>
             at src/libstd/panic.rs:388
  15: lang_start_internal
             at src/libstd/rt.rs:48
  16: std::rt::lang_start
             at ./src/libstd/rt.rs:64
  17: main
  18: __libc_start_main
  19: _start
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

Take a look at lang_start_internal from the first backtrace. You'll notice the symbol name is long and expanded, and lists three different file/line source locations (as it's some inline frames). In this PR's backtrace though you'll notice that the symbol names are much shorter (sometimes much less descriptive) and also have a symbol per file/line pair, instead of one symbol for a set of file/line info.

Also of note is std::panicking::default_hook::{{closure}} in today's backtrace but this PR prints two frames of default_hook and {{closure}}.

I'm honestly not entirely sure why the debuginfo names for each frame in libstd is so terse. I figure it has something to do with our debuginfo, but foo::main shows up and trait methods in general show up well too. The current strategy in the backtrace crate for favoring debuginfo symbol names comes from rust-lang/backtrace-rs#186.

It may just be that in the long run this places more pressure on us to clean up the libstd backtraces perhaps? I'm not sure how others think about doing that now vs later.

@sfackler
Copy link
Member

That seems pretty bad - is the theory that it's specifically limited to symbols in libstd?

@alexcrichton
Copy link
Member Author

So I'm actually getting pretty confusing behavior and I don't really know why.

If libstd is built with -Cdebuginfo=1, lines in the backtrace look like:

14: do_call<closure,i32>
  at src/libstd/panicking.rs:293

If libstd is built with -Cdebuginfo=2, lines in the backtrace look like:

14: std::panicking::try::do_call
  at src/libstd/panicking.rs:293

however if I compile user code with -Cdebuginfo=1 I get

11: foo::main

(note the lack of filename/line number) and with -C debuginfo=2 I get

11: foo::main
  at ./foo.rs:2

So I seem to have concluded that:

  • std, -Cdebuginfo=1 - less descriptive symbol names with filename/line
  • std, -Cdebuginfo=2 - descriptive symbol names with filename/line
  • user code, -Cdebuginfo=1 - descriptive symbol names without filename/line
  • user code, -Cdebuginfo=2 - descriptive symbol names with filename/line

...

...

and on further digging things just seem to keep getting weirder. I'm just using the latest stable release for these tests in the backtrace crate itself, where there's just a main function that immediately prints a backtrace. In both these tests the backtrace crate itself is compiled with -Cdebuginfo=2.

  • Program compiled with -Cdebuginfo=1 shows trace for symbol name in the backtrace crate
  • Program compiled with -Cdebuginfo=2 shows backtrace::backtrace::libunwind::trace for symbol name in the backtrace crate.

This means that the way my program is compiled apparently affects symbol resolution for upstream crates somehow. I have no idea how though!

@michaelwoerister would you be able to help shed some light here perhaps? This all seems highly related to various bits and pieces of debuginfo that we're emitting. Could you maybe provide a bit of background to what we're specifying in debuginfo as the name of each function? Do you know why there might be such inconsistent results for a library like libbacktrace reading debuginfo to prodduce such a wide variety of names?

@hellow554
Copy link
Contributor

@alexcrichton maybe related #61007?

@bors

This comment has been minimized.

bors added a commit that referenced this pull request May 25, 2019
std: Depend on `backtrace` crate from crates.io

This commit removes all in-tree support for generating backtraces in
favor of depending on the `backtrace` crate on crates.io. This resolves
a very longstanding piece of duplication where the standard library has
long contained the ability to generate a backtrace on panics, but the
code was later extracted and duplicated on crates.io with the
`backtrace` crate. Since that fork each implementation has seen various
improvements one way or another, but typically `backtrace`-the-crate has
lagged behind libstd in one way or another.

The goal here is to remove this duplication of a fairly critical piece
of code and ensure that there's only one source of truth for generating
backtraces between the standard library and the crate on crates.io.
Recently I've been working to bring the `backtrace` crate on crates.io
up to speed with the support in the standard library which includes:

* Support for `StackWalkEx` on MSVC to recover inline frames with
  debuginfo.
* Using `libbacktrace` by default on MinGW targets.
* Supporting `libbacktrace` on OSX as an option.
* Ensuring all the requisite support in `backtrace`-the-crate compiles
  with `#![no_std]`.
* Updating the `libbacktrace` implementation in `backtrace`-the-crate to
  initialize the global state with the correct filename where necessary.

After reviewing the code in libstd the `backtrace` crate should be at
exact feature parity with libstd today. The backtraces generated should
have the same symbols and same number of frames in general, and there's
not known divergence from libstd currently.

Note that one major difference between libstd's backtrace support and
the `backtrace` crate is that on OSX the crates.io crate enables the
`coresymbolication` feature by default. This feature, however, uses
private internal APIs that aren't published for OSX. While they provide
more accurate backtraces this isn't appropriate for libstd distributed
as a binary, so libstd's dependency on the `backtrace` crate explicitly
disables this feature and forces OSX to use `libbacktrace` as a
symbolication strategy.

The long-term goal of this refactoring is to eventually move us towards
a world where we can drop `libbacktrace` entirely and simply use Gimli
and the surrounding crates for backtrace support. That's still aways off
but hopefully will much more easily enabled by having the source of
truth for backtraces live in crates.io!

Procedurally if we go forward with this I'd like to transfer the
`backtrace-rs` crate to the rust-lang GitHub organization as well, but I
figured I'd hold off on that until we get closer to merging.
@bors

This comment has been minimized.

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels May 25, 2019
This commit removes all in-tree support for generating backtraces in
favor of depending on the `backtrace` crate on crates.io. This resolves
a very longstanding piece of duplication where the standard library has
long contained the ability to generate a backtrace on panics, but the
code was later extracted and duplicated on crates.io with the
`backtrace` crate. Since that fork each implementation has seen various
improvements one way or another, but typically `backtrace`-the-crate has
lagged behind libstd in one way or another.

The goal here is to remove this duplication of a fairly critical piece
of code and ensure that there's only one source of truth for generating
backtraces between the standard library and the crate on crates.io.
Recently I've been working to bring the `backtrace` crate on crates.io
up to speed with the support in the standard library which includes:

* Support for `StackWalkEx` on MSVC to recover inline frames with
  debuginfo.
* Using `libbacktrace` by default on MinGW targets.
* Supporting `libbacktrace` on OSX as an option.
* Ensuring all the requisite support in `backtrace`-the-crate compiles
  with `#![no_std]`.
* Updating the `libbacktrace` implementation in `backtrace`-the-crate to
  initialize the global state with the correct filename where necessary.

After reviewing the code in libstd the `backtrace` crate should be at
exact feature parity with libstd today. The backtraces generated should
have the same symbols and same number of frames in general, and there's
not known divergence from libstd currently.

Note that one major difference between libstd's backtrace support and
the `backtrace` crate is that on OSX the crates.io crate enables the
`coresymbolication` feature by default. This feature, however, uses
private internal APIs that aren't published for OSX. While they provide
more accurate backtraces this isn't appropriate for libstd distributed
as a binary, so libstd's dependency on the `backtrace` crate explicitly
disables this feature and forces OSX to use `libbacktrace` as a
symbolication strategy.

The long-term goal of this refactoring is to eventually move us towards
a world where we can drop `libbacktrace` entirely and simply use Gimli
and the surrounding crates for backtrace support. That's still aways off
but hopefully will much more easily enabled by having the source of
truth for backtraces live in crates.io!

Procedurally if we go forward with this I'd like to transfer the
`backtrace-rs` crate to the rust-lang GitHub organization as well, but I
figured I'd hold off on that until we get closer to merging.
@alexcrichton
Copy link
Member Author

@bors: r=sfackler

@bors
Copy link
Contributor

bors commented May 26, 2019

📌 Commit d1040fe has been approved by sfackler

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 26, 2019
@bors
Copy link
Contributor

bors commented May 26, 2019

⌛ Testing commit d1040fe with merge 572892c...

bors added a commit that referenced this pull request May 26, 2019
std: Depend on `backtrace` crate from crates.io

This commit removes all in-tree support for generating backtraces in
favor of depending on the `backtrace` crate on crates.io. This resolves
a very longstanding piece of duplication where the standard library has
long contained the ability to generate a backtrace on panics, but the
code was later extracted and duplicated on crates.io with the
`backtrace` crate. Since that fork each implementation has seen various
improvements one way or another, but typically `backtrace`-the-crate has
lagged behind libstd in one way or another.

The goal here is to remove this duplication of a fairly critical piece
of code and ensure that there's only one source of truth for generating
backtraces between the standard library and the crate on crates.io.
Recently I've been working to bring the `backtrace` crate on crates.io
up to speed with the support in the standard library which includes:

* Support for `StackWalkEx` on MSVC to recover inline frames with
  debuginfo.
* Using `libbacktrace` by default on MinGW targets.
* Supporting `libbacktrace` on OSX as an option.
* Ensuring all the requisite support in `backtrace`-the-crate compiles
  with `#![no_std]`.
* Updating the `libbacktrace` implementation in `backtrace`-the-crate to
  initialize the global state with the correct filename where necessary.

After reviewing the code in libstd the `backtrace` crate should be at
exact feature parity with libstd today. The backtraces generated should
have the same symbols and same number of frames in general, and there's
not known divergence from libstd currently.

Note that one major difference between libstd's backtrace support and
the `backtrace` crate is that on OSX the crates.io crate enables the
`coresymbolication` feature by default. This feature, however, uses
private internal APIs that aren't published for OSX. While they provide
more accurate backtraces this isn't appropriate for libstd distributed
as a binary, so libstd's dependency on the `backtrace` crate explicitly
disables this feature and forces OSX to use `libbacktrace` as a
symbolication strategy.

The long-term goal of this refactoring is to eventually move us towards
a world where we can drop `libbacktrace` entirely and simply use Gimli
and the surrounding crates for backtrace support. That's still aways off
but hopefully will much more easily enabled by having the source of
truth for backtraces live in crates.io!

Procedurally if we go forward with this I'd like to transfer the
`backtrace-rs` crate to the rust-lang GitHub organization as well, but I
figured I'd hold off on that until we get closer to merging.
@bors
Copy link
Contributor

bors commented May 26, 2019

☀️ Test successful - checks-travis, status-appveyor
Approved by: sfackler
Pushing 572892c to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label May 26, 2019
@bors bors merged commit d1040fe into rust-lang:master May 26, 2019
@alexcrichton alexcrichton deleted the std-backtrace branch May 28, 2019 20:37
@alexcrichton
Copy link
Member Author

alexcrichton commented May 28, 2019

I've now transferred the repository to live in rust-lang

@estebank
Copy link
Contributor

estebank commented Jun 1, 2019

@alexcrichton could this change have caused backtraces under macosx to refer to each scope only as unknown with no further information?

@alexcrichton
Copy link
Member Author

Yep that's probably this! Mind opening an issue and I can debug next week?

@estebank
Copy link
Contributor

estebank commented Jun 1, 2019

Filed #61416. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.