Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

provide better debugging facilities for nondeterministic tests #184

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

bdonlan
Copy link
Contributor

@bdonlan bdonlan commented Oct 23, 2020

This change adds a new tracing facility that helps track down the source of
nondeterministic execution. It records, for each branch point, the original
(user) source code position leading to this branch, the operation being
performed, and the reference that is involved in the branch. If this does not
match on a later execution, we dump the mismatched operation, as well as a
trace of recent operations, let the current execution run to completion, then
panic.

A typical trace looks a bit like this:

===== NONDETERMINISTIC EXECUTION DETECTED =====
Previous execution:
    13: [tests/nondet.rs:55:28] loom::thread::JoinHandle<()>::join on Thread::2

Current execution:
    13: [tests/nondet.rs:55:28] loom::thread::JoinHandle<()>::join on Thread::3

Recent events:
       0: [tests/nondet.rs:17:26] loom::sync::mutex::Mutex<nondet::nondeterministic_execution_detected::State>::lock on Mutex::0
       1: [tests/nondet.rs:18:19] loom::sync::arc::Arc<loom::sync::mutex::Mutex<nondet::nondeterministic_execution_detected::State>>::clone on State::1
       2: [tests/nondet.rs:35:9] loom::thread::yield_now
     ...: THREAD SWITCH on Thread::1
       3: [tests/nondet.rs:21:30] loom::sync::mutex::Mutex<nondet::nondeterministic_execution_detected::State>::lock on Mutex::0
     ...: THREAD SWITCH on Thread::0
     ...: loom::sync::mutex::MutexGuard<nondet::nondeterministic_execution_detected::State>::drop on Mutex::0
       4: [tests/nondet.rs:17:26] loom::sync::mutex::Mutex<nondet::nondeterministic_execution_detected::State>::lock on Mutex::0
       5: [tests/nondet.rs:18:19] loom::sync::arc::Arc<loom::sync::mutex::Mutex<nondet::nondeterministic_execution_detected::State>>::clone on State::1
       6: [tests/nondet.rs:35:9] loom::thread::yield_now
     ...: THREAD SWITCH on Thread::2
       7: [tests/nondet.rs:21:30] loom::sync::mutex::Mutex<nondet::nondeterministic_execution_detected::State>::lock on Mutex::0
     ...: THREAD SWITCH on Thread::0
     ...: loom::sync::mutex::MutexGuard<nondet::nondeterministic_execution_detected::State>::drop on Mutex::0
       8: [tests/nondet.rs:17:26] loom::sync::mutex::Mutex<nondet::nondeterministic_execution_detected::State>::lock on Mutex::0
       9: [tests/nondet.rs:18:19] loom::sync::arc::Arc<loom::sync::mutex::Mutex<nondet::nondeterministic_execution_detected::State>>::clone on State::1
      10: [tests/nondet.rs:35:9] loom::thread::yield_now
     ...: THREAD SWITCH on Thread::3
      11: [tests/nondet.rs:21:30] loom::sync::mutex::Mutex<nondet::nondeterministic_execution_detected::State>::lock on Mutex::0
     ...: THREAD SWITCH on Thread::0
     ...: loom::sync::mutex::MutexGuard<nondet::nondeterministic_execution_detected::State>::drop on Mutex::0
      12: [tests/nondet.rs:45:30] loom::sync::mutex::Mutex<nondet::nondeterministic_execution_detected::State>::lock on Mutex::0
     ...: loom::sync::mutex::MutexGuard<nondet::nondeterministic_execution_detected::State>::drop on Mutex::0

thread 'nondeterministic_execution_detected' panicked at 'Aborting due to non-deterministic execution', src/rt/execution.rs:262:13
stack backtrace:
   0: std::panicking::begin_panic
             at /local/home/bdonlan/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:497
   1: loom::rt::execution::Execution::check_consistency
             at ./src/rt/execution.rs:262
   2: loom::model::Builder::check
             at ./src/model.rs:208
   3: loom::model::model
             at ./src/model.rs:230
   4: nondet::nondeterministic_execution_detected
             at ./tests/nondet.rs:38
   5: nondet::nondeterministic_execution_detected::{{closure}}
             at ./tests/nondet.rs:9
   6: core::ops::function::FnOnce::call_once
             at /local/home/bdonlan/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:227
   7: core::ops::function::FnOnce::call_once
             at /rustc/18bf6b4f01a6feaf7259ba7cdae58031af1b7b39/library/core/src/ops/function.rs:227
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
test nondeterministic_execution_detected ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

@bdonlan bdonlan force-pushed the tracing branch 4 times, most recently from 03baf09 to 3484ad0 Compare October 23, 2020 22:04
@hawkw
Copy link
Member

hawkw commented Oct 23, 2020

this is extremely cool!

Bryan Donlan added 4 commits October 26, 2020 17:20
This change adds a new tracing facility that helps track down the source of
nondeterministic execution. It records, for each branch point, the original
(user) source code position leading to this branch, the operation being
performed, and the reference that is involved in the branch. If this does not
match on a later execution, we dump the mismatched operation, as well as a
trace of recent operations, let the current execution run to completion, then
panic.

A typical trace looks a bit like this:

```
===== NONDETERMINISTIC EXECUTION DETECTED =====
Previous execution:
    13: [tests/nondet.rs:55:28] loom::thread::JoinHandle<()>::join on Thread::2

Current execution:
    13: [tests/nondet.rs:55:28] loom::thread::JoinHandle<()>::join on Thread::3

Recent events:
       0: [tests/nondet.rs:17:26] loom::sync::mutex::Mutex<nondet::nondeterministic_execution_detected::State>::lock on Mutex::0
       1: [tests/nondet.rs:18:19] loom::sync::arc::Arc<loom::sync::mutex::Mutex<nondet::nondeterministic_execution_detected::State>>::clone on State::1
       2: [tests/nondet.rs:35:9] loom::thread::yield_now
     ...: THREAD SWITCH on Thread::1
       3: [tests/nondet.rs:21:30] loom::sync::mutex::Mutex<nondet::nondeterministic_execution_detected::State>::lock on Mutex::0
     ...: THREAD SWITCH on Thread::0
     ...: loom::sync::mutex::MutexGuard<nondet::nondeterministic_execution_detected::State>::drop on Mutex::0
       4: [tests/nondet.rs:17:26] loom::sync::mutex::Mutex<nondet::nondeterministic_execution_detected::State>::lock on Mutex::0
       5: [tests/nondet.rs:18:19] loom::sync::arc::Arc<loom::sync::mutex::Mutex<nondet::nondeterministic_execution_detected::State>>::clone on State::1
       6: [tests/nondet.rs:35:9] loom::thread::yield_now
     ...: THREAD SWITCH on Thread::2
       7: [tests/nondet.rs:21:30] loom::sync::mutex::Mutex<nondet::nondeterministic_execution_detected::State>::lock on Mutex::0
     ...: THREAD SWITCH on Thread::0
     ...: loom::sync::mutex::MutexGuard<nondet::nondeterministic_execution_detected::State>::drop on Mutex::0
       8: [tests/nondet.rs:17:26] loom::sync::mutex::Mutex<nondet::nondeterministic_execution_detected::State>::lock on Mutex::0
       9: [tests/nondet.rs:18:19] loom::sync::arc::Arc<loom::sync::mutex::Mutex<nondet::nondeterministic_execution_detected::State>>::clone on State::1
      10: [tests/nondet.rs:35:9] loom::thread::yield_now
     ...: THREAD SWITCH on Thread::3
      11: [tests/nondet.rs:21:30] loom::sync::mutex::Mutex<nondet::nondeterministic_execution_detected::State>::lock on Mutex::0
     ...: THREAD SWITCH on Thread::0
     ...: loom::sync::mutex::MutexGuard<nondet::nondeterministic_execution_detected::State>::drop on Mutex::0
      12: [tests/nondet.rs:45:30] loom::sync::mutex::Mutex<nondet::nondeterministic_execution_detected::State>::lock on Mutex::0
     ...: loom::sync::mutex::MutexGuard<nondet::nondeterministic_execution_detected::State>::drop on Mutex::0

thread 'nondeterministic_execution_detected' panicked at 'Aborting due to non-deterministic execution', src/rt/execution.rs:262:13
stack backtrace:
   0: std::panicking::begin_panic
             at /local/home/bdonlan/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:497
   1: loom::rt::execution::Execution::check_consistency
             at ./src/rt/execution.rs:262
   2: loom::model::Builder::check
             at ./src/model.rs:208
   3: loom::model::model
             at ./src/model.rs:230
   4: nondet::nondeterministic_execution_detected
             at ./tests/nondet.rs:38
   5: nondet::nondeterministic_execution_detected::{{closure}}
             at ./tests/nondet.rs:9
   6: core::ops::function::FnOnce::call_once
             at /local/home/bdonlan/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:227
   7: core::ops::function::FnOnce::call_once
             at /rustc/18bf6b4f01a6feaf7259ba7cdae58031af1b7b39/library/core/src/ops/function.rs:227
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
test nondeterministic_execution_detected ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
```
This is needed for the Eq trait on `std::panic::Location`.
Panicing in drops can result in panic-in-panic events that make debugging
harder than necessary. Make an attempt to catch panics coming from e.g. the
execution being unavailable and just set a flag to cause the test to fail later
instead.
@phlip9
Copy link
Contributor

phlip9 commented Oct 29, 2020

wow this looks cool : )

random thought along this line: a "non-determinism checker" option that just executes each iteration twice with the same seed and reports if the traces diverge

@hawkw
Copy link
Member

hawkw commented Dec 3, 2021

👋 hi @bdonlan, sorry this went so long without a review --- are you still interested in working on this? i'd love to give this branch a review if you're able to update it with the latest master branch!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants