Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scoped thread panic messages unexpectedly interleaving in stderr output #96706

Open
AronParker opened this issue May 4, 2022 · 6 comments · May be fixed by #122565
Open

Scoped thread panic messages unexpectedly interleaving in stderr output #96706

AronParker opened this issue May 4, 2022 · 6 comments · May be fixed by #122565
Labels
A-io Area: `std::io`, `std::fs`, `std::net` and `std::path` A-panic Area: Panicking machinery A-thread Area: `std::thread` C-bug Category: This is a bug. T-libs Relevant to the library team, which will review and decide on the PR/issue.

Comments

@AronParker
Copy link
Contributor

I tried this code:

#![feature(scoped_threads)]

use std::{panic, thread};

fn main() {
    thread::scope(|s| {
        s.spawn(|| {
            panic!("AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA");
        });

        s.spawn(|| {
            panic!("BBBBBBBBBBBBBBBBBBBBBBBBBBBBB");
        });
    });
}

I expected to see the following output:

thread '<unnamed>' panicked at 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA', src\main.rs:8:13
thread '<unnamed>' panicked at 'BBBBBBBBBBBBBBBBBBBBBBBBBBBBB', src\main.rs:12:13

As it runs concurrently, of course the order could be reversed.

Instead, the panic messages were interleaved. Here are some examples:

thread 'thread '<unnamed>' panicked at 'BBBBBBBBBBBBBBBBBBBBBBBBBBBBB<unnamed>', ' panicked at 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA', src\main.rs:13:13src\main.rs:8:13

thread '<unnamed>' panicked at 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA', src\main.rs:thread '<unnamed>' panicked at '8:13BBBBBBBBBBBBBBBBBBBBBBBBBBBBB

thread '<unnamed>' panicked at 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA', thread '<unnamed>src\main.rs:8' panicked at 'BBBBBBBBBBBBBBBBBBBBBBBBBBBBB:13

thread '<unnamed>' panicked at 'thread '<unnamed>' panicked at 'BBBBBBBBBBBBBBBBBBBBBBBBBBBBB', AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA', src\main.rs:8:13

thread 'thread '<unnamed><unnamed>' panicked at '' panicked at 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABBBBBBBBBBBBBBBBBBBBBBBBBBBBB', ', src\main.rssrc\main.rs::812::1313

As scoped threads panic directly to stderr, I see no measure to prevent this and provide more readable output.

Meta

rustc --version --verbose:

rustc 1.62.0-nightly (7c4b47696 2022-04-30)
binary: rustc
commit-hash: 7c4b47696907d64eff5621a64eb3c6e795a9ec77
commit-date: 2022-04-30
host: x86_64-pc-windows-msvc
release: 1.62.0-nightly
LLVM version: 14.0.1
Backtrace

thread '<unnamed>' panicked at 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA', src\main.rs:8:13
stack backtrace:
   0: std::panicking::begin_panic_handler
             at /rustc/7c4b47696907d64eff5621a64eb3c6e795a9ec77/library\std\src\panicking.rs:584
   1: core::panicking::panic_fmt
             at /rustc/7c4b47696907d64eff5621a64eb3c6e795a9ec77/library\core\src\panicking.rs:142
   2: hello_world::main::closure$0::closure$0
             at .\src\main.rs:8
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
thread 'main' panicked at 'a scoped thread panicked', src\main.rs:6:5
stack backtrace:
   0: std::panicking::begin_panic_handler
             at /rustc/7c4b47696907d64eff5621a64eb3c6e795a9ec77/library\std\src\panicking.rs:584
   1: core::panicking::panic_fmt
             at /rustc/7c4b47696907d64eff5621a64eb3c6e795a9ec77/library\core\src\panicking.rs:142
   2: std::thread::scoped::scope<hello_world::main::closure_env$0,tuple$<> >
             at /rustc/7c4b47696907d64eff5621a64eb3c6e795a9ec77\library\std\src\thread\scoped.rs:155
   3: hello_world::main
             at .\src\main.rs:6
   4: core::ops::function::FnOnce::call_once<void (*)(),tuple$<> >
             at /rustc/7c4b47696907d64eff5621a64eb3c6e795a9ec77\library\core\src\ops\function.rs:248
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
error: process didn't exit successfully: `target\debug\hello_world.exe` (exit code: 101)

@AronParker AronParker added the C-bug Category: This is a bug. label May 4, 2022
@bjorn3
Copy link
Member

bjorn3 commented May 4, 2022

This also happens for non-scoped threads. Panic messages are not buffered but written concurrently without any locking piece by piece, so if two threads panic at the same time their backtraces will interleave. The only way around this would be to either buffer the panic message (which may use a lot of memory and could cause the whole panic message to get lost in case an error occurs during panic formatting) or locking stderr (which may cause deadlocks) I agree it is not ideal.

@AronParker
Copy link
Contributor Author

That's interesting, thanks for providing that information, I didn't know that. Can't we only allow the main thread to write panic messages and capture the panic messages of other threads in that Box<dyn Any + Send>, which can then be printed on demand with panic::resume_unwind?

@bjorn3
Copy link
Member

bjorn3 commented May 4, 2022

You can use a custom panic hook using std::panic::set_hook if you really need to suppress panic messages. But note that panics should not be used for regular errors, only for errors or bugs where the only reasonable way to handle them is to give up on what you are currently doing and depending on the kind of process attempt to serve the next request or quit entirely. If you are getting panics often enough that the interleaving happens, you are probably doing something wrong. Also note that it is possible to force all panics to abort the current process. This is often done to get slightly better performance or somewhat smaller binaries. See https://doc.rust-lang.org/book/ch09-03-to-panic-or-not-to-panic.html for when to panic and when to use Result instead.

@fmease fmease added T-libs Relevant to the library team, which will review and decide on the PR/issue. A-io Area: `std::io`, `std::fs`, `std::net` and `std::path` A-panic Area: Panicking machinery A-thread Area: `std::thread` and removed needs-triage-legacy labels Jan 26, 2024
@benesch
Copy link
Contributor

benesch commented Aug 29, 2024

If you are getting panics often enough that the interleaving happens, you are probably doing something wrong.

If you are running a concurrent Rust program at any significant scale, as we are at @MaterializeInc, interleaved panic output seems inevitable.

I agree that panic messages interleaving with one another is a rare case, but panic messages interleaving with other output produced by the program seems somewhat inevitable. Many production services produce near-constant log output (@MaterializeInc certainly does, and especially so with debug logging enabled), and though panics are rare, when they do strike, it is a near guarantee that the panic message output will interleave badly with log output. Details in MaterializeInc/database-issues#7221.

We've looked into some fixes on our side, but so far it seems like the simplest fix would be for the Rust panic handler to buffer the pieces of the panic message internally and then emit them in one single call to write.

@benesch
Copy link
Contributor

benesch commented Aug 29, 2024

Digging a bit deeper, and it seems like a contributing factor is the fact that stderr is not line buffered: #64413

@benesch
Copy link
Contributor

benesch commented Aug 31, 2024

If anyone else runs into this and is looking for a fix, MaterializeInc/materialize#29325 contains what I believe is a panic handler, usable entirely from stable Rust, that will avoid the interleaving issue in most cases. It reconstructs the standard panic handler messages, attempts to print to stderr under the lock for 1s, and then falls back to printing to stderr not under the lock.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-io Area: `std::io`, `std::fs`, `std::net` and `std::path` A-panic Area: Panicking machinery A-thread Area: `std::thread` C-bug Category: This is a bug. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants