Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Question (epoch): How to avoid Miri to report data race on an Atomic pointer? #993

Closed
tatsuya6502 opened this issue Jun 17, 2023 · 6 comments · Fixed by #995
Closed

Question (epoch): How to avoid Miri to report data race on an Atomic pointer? #993

tatsuya6502 opened this issue Jun 17, 2023 · 6 comments · Fixed by #995

Comments

@tatsuya6502
Copy link

Hi. Some time ago, we implemented a lock-free concurrent hash table using crossbeam-epoch and it has been working fine. However, a user of our hash table library discovered that running cargo +nightly miri test on it reports a data race error during concurrent read/write on a crossbeam_epoch::Atomic pointer.

Since it is a lock-free data structure, we expect such a data race to occur. Readers may see stale data, which is okay, and reading the data will never cause use-after-free problem because the data is retained by crossbeam_epoch's Guard.

However, Miri's error raised some questions:

  • Are we using Atomic pointer and Guard correctly?
  • If yes, how can we avoid Miri to report the error?
  • If no, how can we fix our code?

Below is a minimized version of the code that causes the same Miri error:

src/bin/main1.rs
// Cargo.toml
//
// [dependencies]
// crossbeam-epoch = "0.9.15"
// rand = { version = "0.8.5", features = ["small_rng"] }

fn main() {}

#[cfg(test)]
mod tests {
    use std::{
        sync::{atomic::Ordering, Arc},
        thread,
        time::Duration,
    };

    use crossbeam_epoch::{pin, Atomic, CompareExchangeError, Owned};
    use rand::{rngs::SmallRng, Rng, SeedableRng};

    struct Container {
        v: Atomic<usize>,
    }

    impl Drop for Container {
        fn drop(&mut self) {
            let guard = pin();
            let v_ptr = self.v.load_consume(&guard);
            if unsafe { v_ptr.as_ref() }.is_some() {
                unsafe {
                    guard.defer_destroy(v_ptr);
                }
            }
        }
    }

    #[test]
    fn concurrent_read_and_modify() {
        const NUM_THREADS: usize = 16;
        const NUM_OPERATIONS: usize = 100;

        let container = Arc::new(Container { v: Atomic::null() });
        let mut thread_rng = rand::thread_rng();

        let handles = (0..NUM_THREADS)
            .map(|n| {
                let data = Arc::clone(&container);
                let mut rng = SmallRng::from_rng(&mut thread_rng).unwrap();

                thread::Builder::new()
                    .name(format!("thread-{}", n))
                    .spawn(move || {
                        for _ in 0..NUM_OPERATIONS {
                            read_and_maybe_replace(&data, &mut rng);
                        }
                    })
                    .unwrap()
            })
            .collect::<Vec<_>>();

        handles.into_iter().for_each(|h| h.join().unwrap());
    }

    fn read_and_maybe_replace(container: &Container, rng: &mut SmallRng) {
        let guard = pin();

        // Read the usize.
        let v_ptr = container.v.load_consume(&guard);
        if let Some(v_ref) = unsafe { v_ptr.as_ref() } {
            thread::sleep(Duration::from_millis(rng.gen_range(0..50)));
            std::hint::black_box({
                let _ = *v_ref;
            });
        }

        if rng.gen_bool(0.25) {
            // Replace the usize.
            let v = rng.gen();
            let mut v_ptr = v_ptr;
            loop {
                match container.v.compare_exchange_weak(
                    v_ptr,
                    Owned::new(v),
                    Ordering::AcqRel,
                    Ordering::Relaxed,
                    &guard,
                ) {
                    Ok(old_v) => {
                        unsafe {
                            guard.defer_destroy(old_v);
                        }
                        break;
                    }
                    Err(CompareExchangeError { .. }) => {
                        v_ptr = container.v.load_consume(&guard);
                    }
                }
            }
        }
    }
}

Here is the result:

Miri test
$ cargo +nightly miri test --bin main1

(Omitted some warnings about integer-to-pointer cast)

error: Undefined Behavior: Data race detected between (1) Write on thread `thread-1` and (2) Read on thread `thread-2` at alloc106668. (2) just happened here
   --> /Users/tatsuya/.cargo/registry/src/index.crates.io-6f17d22bba15001f/crossbeam-epoch-0.9.15/src/atomic.rs:204:9
    |
204 |         &*(ptr as *const T)
    |         ^^^^^^^^^^^^^^^^^^^ Data race detected between (1) Write on thread `thread-1` and (2) Read on thread `thread-2` at alloc106668. (2) just happened here
    |
help: and (1) occurred earlier here
   --> src/bin/main1.rs:82:21
    |
82  |                     Owned::new(v),
    |                     ^^^^^^^^^^^^^
    = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
    = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
    = note: BACKTRACE (of the first span):
    = note: inside `<usize as crossbeam_epoch::Pointable>::deref::<'_>` at /Users/tatsuya/.cargo/registry/src/index.crates.io-6f17d22bba15001f/crossbeam-epoch-0.9.15/src/atomic.rs:204:9: 204:28
    = note: inside `crossbeam_epoch::Shared::<'_, usize>::as_ref` at /Users/tatsuya/.cargo/registry/src/index.crates.io-6f17d22bba15001f/crossbeam-epoch-0.9.15/src/atomic.rs:1518:18: 1518:31
note: inside `tests::read_and_maybe_replace`
   --> src/bin/main1.rs:68:39
    |
68  |         if let Some(v_ref) = unsafe { v_ptr.as_ref() } {
    |                                       ^^^^^^^^^^^^^^
note: inside closure
   --> src/bin/main1.rs:53:29
    |
53  | ...                   read_and_maybe_replace(&data, &mut rng);
    |                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

note: some details are omitted, run with `MIRIFLAGS=-Zmiri-backtrace=full` for a verbose backtrace

error: aborting due to previous error; 5 warnings emitted

I replaced v_ref: &usize with *const usize, but Miri reported the same error on the line let _ = unsafe { *v_raw.ptr } of our src/bin/main2.rs

diff
--- src/bin/main1.rs	2023-06-17 17:37:09
+++ src/bin/main2.rs	2023-06-17 17:40:15
@@ -14,7 +14,7 @@
         time::Duration,
     };
 
-    use crossbeam_epoch::{pin, Atomic, CompareExchangeError, Owned};
+    use crossbeam_epoch::{pin, Atomic, CompareExchangeError, Guard, Owned, Shared};
     use rand::{rngs::SmallRng, Rng, SeedableRng};
 
     struct Container {
@@ -33,6 +33,20 @@
         }
     }
 
+    struct GuardedPointer<'g> {
+        ptr: *const usize,
+        _guard: &'g Guard,
+    }
+
+    impl<'g> GuardedPointer<'g> {
+        fn new(shared: Shared<'g, usize>, guard: &'g Guard) -> Self {
+            Self {
+                ptr: shared.as_raw(),
+                _guard: guard,
+            }
+        }
+    }
+
     #[test]
     fn concurrent_read_and_modify() {
         const NUM_THREADS: usize = 16;
@@ -65,10 +79,11 @@
 
         // Read the usize.
         let v_ptr = container.v.load_consume(&guard);
-        if let Some(v_ref) = unsafe { v_ptr.as_ref() } {
+        let v_raw = GuardedPointer::new(v_ptr, &guard);
+        if !v_raw.ptr.is_null() {
             thread::sleep(Duration::from_millis(rng.gen_range(0..50)));
             std::hint::black_box({
-                let _ = *v_ref;
+                let _ = unsafe { *v_raw.ptr };
             });
         }

Versions:

  • crossbeam-epoch v0.9.15
  • rustc 1.72.0-nightly (6bba06146 2023-06-16)
  • miri 0.1.0 (6bba0614 2023-06-16)
@taiki-e
Copy link
Member

taiki-e commented Jun 17, 2023

What platform did you run the test? (I'm thinking about #992)

@tatsuya6502
Copy link
Author

macOS arm64

$ rustc +nightly -Vv
rustc 1.72.0-nightly (6bba06146 2023-06-16)
binary: rustc
commit-hash: 6bba061467f7c2cab04b262b95eb67bf89265587
commit-date: 2023-06-16
host: aarch64-apple-darwin
release: 1.72.0-nightly
LLVM version: 16.0.5

@tatsuya6502
Copy link
Author

Thanks. Yeah. It seems #992 is related. I tried the following dependency and the error has changed.

crossbeam-epoch = { git = "https://github.com/crossbeam-rs/crossbeam.git", branch = "taiki-e/consume" }

So, Miri does not seem to understand load_consume impl.

Now it is detecting a stacked borrow violation...

error: Undefined Behavior: trying to retag from <221231> for SharedReadWrite permission at alloc96329[0x8], but that tag does not exist in the borrow stack for this location
   --> /Users/tatsuya/.cargo/git/checkouts/crossbeam-5d5b005504a37dac/39da3f8/crossbeam-epoch/src/internal.rs:552:9
    |
552 |         &*local_ptr
    |         ^^^^^^^^^^^
    |         |
    |         trying to retag from <221231> for SharedReadWrite permission at alloc96329[0x8], but that tag does not exist in the borrow stack for this location
    |         this error occurs as part of retag at alloc96329[0x0..0xb8]
    |
    = help: this indicates a potential bug in the program: it performed an invalid operation, but the Stacked Borrows rules it violated are still experimental
    = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
help: <221231> was created by a SharedReadWrite retag at offsets [0x0..0x8]
   --> src/bin/main1.rs:64:21
    |
64  |         let guard = pin();
    |                     ^^^^^
    = note: BACKTRACE (of the first span):
    = note: inside `<crossbeam_epoch::internal::Local as crossbeam_epoch::sync::list::IsElement<crossbeam_epoch::internal::Local>>::element_of` at /Users/tatsuya/.cargo/git/checkouts/crossbeam-5d5b005504a37dac/39da3f8/crossbeam-epoch/src/internal.rs:552:9: 552:20
    = note: inside `<crossbeam_epoch::sync::list::Iter<'_, crossbeam_epoch::internal::Local, crossbeam_epoch::internal::Local> as std::iter::Iterator>::next` at /Users/tatsuya/.cargo/git/checkouts/crossbeam-5d5b005504a37dac/39da3f8/crossbeam-epoch/src/sync/list.rs:290:37: 290:53
    = note: inside `crossbeam_epoch::internal::Global::try_advance` at /Users/tatsuya/.cargo/git/checkouts/crossbeam-5d5b005504a37dac/39da3f8/crossbeam-epoch/src/internal.rs:237:22: 237:45
    = note: inside `crossbeam_epoch::internal::Global::collect` at /Users/tatsuya/.cargo/git/checkouts/crossbeam-5d5b005504a37dac/39da3f8/crossbeam-epoch/src/internal.rs:202:28: 202:51
    = note: inside `crossbeam_epoch::internal::Local::pin` at /Users/tatsuya/.cargo/git/checkouts/crossbeam-5d5b005504a37dac/39da3f8/crossbeam-epoch/src/internal.rs:436:17: 436:46
    = note: inside `crossbeam_epoch::LocalHandle::pin` at /Users/tatsuya/.cargo/git/checkouts/crossbeam-5d5b005504a37dac/39da3f8/crossbeam-epoch/src/collector.rs:81:18: 81:37
    = note: inside closure at /Users/tatsuya/.cargo/git/checkouts/crossbeam-5d5b005504a37dac/39da3f8/crossbeam-epoch/src/default.rs:40:26: 40:38
    = note: inside closure at /Users/tatsuya/.cargo/git/checkouts/crossbeam-5d5b005504a37dac/39da3f8/crossbeam-epoch/src/default.rs:60:23: 60:27
    = note: inside `std::thread::LocalKey::<crossbeam_epoch::LocalHandle>::try_with::<[closure@crossbeam_epoch::default::with_handle<[closure@crossbeam_epoch::pin::{closure#0}], crossbeam_epoch::Guard>::{closure#0}], crossbeam_epoch::Guard>` at /Users/tatsuya/.rustup/toolchains/nightly-aarch64-apple-darwin/lib/rustlib/src/rust/library/std/src/thread/local.rs:270:16: 270:31
    = note: inside `crossbeam_epoch::default::with_handle::<[closure@crossbeam_epoch::pin::{closure#0}], crossbeam_epoch::Guard>` at /Users/tatsuya/.cargo/git/checkouts/crossbeam-5d5b005504a37dac/39da3f8/crossbeam-epoch/src/default.rs:59:5: 60:28
    = note: inside `crossbeam_epoch::pin` at /Users/tatsuya/.cargo/git/checkouts/crossbeam-5d5b005504a37dac/39da3f8/crossbeam-epoch/src/default.rs:40:5: 40:39
note: inside `tests::read_and_maybe_replace`
   --> src/bin/main1.rs:64:21
    |
64  |         let guard = pin();
    |                     ^^^^^

@taiki-e
Copy link
Member

taiki-e commented Jun 17, 2023

SB errors from epoch should be fixed in #871.

@tatsuya6502
Copy link
Author

Wow! That was super quick. Thank you so much!

I tried epoch-fix-sb-violations branch and found that Miri no longer reports any errors and warnings on macOS arm64 🎉

$ rustc +nightly -Vv
rustc 1.72.0-nightly (6bba06146 2023-06-16)
binary: rustc
commit-hash: 6bba061467f7c2cab04b262b95eb67bf89265587
commit-date: 2023-06-16
host: aarch64-apple-darwin
release: 1.72.0-nightly
LLVM version: 16.0.5

$ cargo +nightly miri test --bin main1
Preparing a sysroot for Miri (target: aarch64-apple-darwin)... done
WARNING: Ignoring `RUSTC_WRAPPER` environment variable, Miri does not support wrapping.
    Finished test [unoptimized + debuginfo] target(s) in 0.04s
     Running unittests src/bin/main1.rs (target/miri/aarch64-apple-darwin/debug/deps/main1-5aa366f684d43212)

running 1 test
test tests::concurrent_read_and_modify ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

@tatsuya6502
Copy link
Author

tatsuya6502 commented Jun 18, 2023

p.s.

I ran miri test on my library and confirmed Miri no longer emits errors and warnings.

Cargo.toml

crossbeam-epoch = { git = "https://github.com/crossbeam-rs/crossbeam.git", branch = "master", optional = true }

Terminal Log

$ cargo tree -i crossbeam-epoch
crossbeam-epoch v0.9.15 (https://github.com/crossbeam-rs/crossbeam.git?branch=master#ce31c186)
└── moka v0.11.2 (...)

$ MIRIFLAGS="-Zmiri-tree-borrows" cargo +nightly miri test --lib cht::segment::tests::concurrent_overlapped_removal
Preparing a sysroot for Miri (target: aarch64-apple-darwin)... done
WARNING: Ignoring `RUSTC_WRAPPER` environment variable, Miri does not support wrapping.
    Finished test [unoptimized + debuginfo] target(s) in 0.08s
     Running unittests src/lib.rs (target/miri/aarch64-apple-darwin/debug/deps/moka-451ea5dfd08c9c07)

running 1 test
test cht::segment::tests::concurrent_overlapped_removal ... ^C

## Pressed Ctrl + C after ~1 minute. Otherwise it can take few hours to complete.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging a pull request may close this issue.

2 participants