Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Miri false positive data race due to incorrect handling of fences #2192

Closed
saethlin opened this issue Jun 5, 2022 · 7 comments
Closed

Miri false positive data race due to incorrect handling of fences #2192

saethlin opened this issue Jun 5, 2022 · 7 comments
Labels
A-data-race Area: data race detector C-support Category: Not necessarily a bug, but someone asking for support

Comments

@saethlin
Copy link
Member

saethlin commented Jun 5, 2022

I don't actually understand any of what's going on in this case, just reporting this: crossbeam-rs/crossbeam#838 (comment)

cc @kprotty in case I've summarized this wrong or you have a simple test case for this on hand

@RalfJung
Copy link
Member

RalfJung commented Jun 5, 2022

Cc @JCTyblaidd who wrote our data race detector

@RalfJung RalfJung added the A-data-race Area: data race detector label Jun 5, 2022
@alygin
Copy link

alygin commented Jun 5, 2022

Here's a simple test based on the crossbeam-channel case:

use std::sync::Arc;
use std::sync::atomic::{AtomicUsize, Ordering, fence};
use std::time::Duration;
use std::thread;

#[test]
fn data_race() {
    static mut V: u32 = 0;
    let a = Arc::new(AtomicUsize::default());
    let b = a.clone();
    thread::spawn(move || {
        unsafe { V = 1 }
        b.store(1, Ordering::SeqCst);
    });
    thread::sleep(Duration::from_millis(100));
    fence(Ordering::SeqCst);
    a.load(Ordering::Relaxed);
    unsafe { V = 2 }
}

When executed with MIRIFLAGS="-Zmiri-disable-isolation" cargo miri test, it produces:

error: Undefined Behavior: Data race detected between Write on Thread(id = 0, name = "main") and Write on Thread(id = 1) at alloc1 (current vector clock = VClock([5]), conflicting timestamp = VClock([0, 8]))
  --> src/lib.rs:21:18
   |
21 |         unsafe { V = 2 }
   |                  ^^^^^ Data race detected between Write on Thread(id = 0, name = "main") and Write on Thread(id = 1) at alloc1 (current vector clock = VClock([5]), conflicting timestamp = VClock([0, 8]))
   |
   = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
   = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
           
   = note: inside `tests::data_race` at src/lib.rs:21:18
note: inside closure at src/lib.rs:10:5
  --> src/lib.rs:10:5
   |
9  |       #[test]
   |       ------- in this procedural macro expansion
10 | /     fn data_race() {
11 | |         static mut V: u32 = 0;
12 | |         let a = Arc::new(AtomicUsize::default());
13 | |         let b = a.clone();
...  |
21 | |         unsafe { V = 2 }
22 | |     }
   | |_____^
   = note: this error originates in the attribute macro `test` (in Nightly builds, run with -Z macro-backtrace for more info)

note: some details are omitted, run with `MIRIFLAGS=-Zmiri-backtrace=full` for a verbose backtrace

error: aborting due to previous error; 1 warning emitted

@alygin
Copy link

alygin commented Jun 5, 2022

But to tell the truth, I'm still in doubt that this is a false positive. As far as the documentation goes (both for the fence() and in more details in the Atomic-fence synchronization chapter of the C++ memory model description), the synchronization between the atomic store and the fence is only guaranteed if the fence is sequenced-after the load.

And if we change

    fence(Ordering::SeqCst);
    a.load(Ordering::Relaxed);

to

    a.load(Ordering::Relaxed);
    fence(Ordering::SeqCst); // It can be Acquire

the data race warning is gone.

But maybe I'm missing something here.

@RalfJung RalfJung added the C-bug Category: This is a bug. label Jun 5, 2022
@RalfJung
Copy link
Member

RalfJung commented Jun 5, 2022

I think there is a race here in both cases (i.e., also the one where the warning is gone)? You are completely ignoring the result of the load, so a possible execution is

  • Thread A does fence(Ordering::SeqCst)
  • Thread A does load(Ordering::Relaxed), reads 0
  • Thread B does V = 1
  • Thread A does V = 2 -- there i no synchronization with the thread B write that just happened, so this is a race

@RalfJung
Copy link
Member

RalfJung commented Jun 5, 2022

I guess you meant something more like this where we check that the load loads 1.

But in that case, we still have this valid execution:

  1. Thread A does fence(Ordering::SeqCst)
  2. Thread B does V = 1
  3. Thread B does store(1, Ordering::SeqCst)
  4. Thread A does load(Ordering::Relaxed), reads 1
  5. Thread A does V = 2

The load at (4) is relaxed so it doesn't pick up any synchronization. Therefore the two accesses to V are not synchronized and I would say this is a data race.

@RalfJung RalfJung added C-support Category: Not necessarily a bug, but someone asking for support and removed C-bug Category: This is a bug. labels Jun 5, 2022
@kprotty
Copy link

kprotty commented Jun 5, 2022

the synchronization between the atomic store and the fence is only guaranteed if the fence is sequenced-after the load.

@alygin You're right. The correct use of the fence passes miri while having it before the load doesn't.

fn main() {
    use std::{
        sync::atomic::{fence, AtomicUsize, Ordering},
        sync::Arc,
        thread,
    };

    static mut V: u32 = 0;
    let a = Arc::new(AtomicUsize::default());
    let b = a.clone();

    thread::spawn(move || {
        unsafe { V = 1 };
        b.store(1, Ordering::SeqCst);
    });

    loop {
        if a.load(Ordering::Relaxed) == 1 {
            fence(Ordering::SeqCst);
            return unsafe { V = 2 };
        } else {
            thread::yield_now(); // for miri
        }
    }
}

After some thought, the crossbeam code is fine to race as drop() isn't necessarily a synchronization point (which the test case was trying to use as such).

@RalfJung
Copy link
Member

RalfJung commented Jun 5, 2022

Closing as works-as-intended then. And I'll add this as a testcase. :)

@RalfJung RalfJung closed this as not planned Won't fix, can't repro, duplicate, stale Jun 5, 2022
bors added a commit that referenced this issue Jun 5, 2022
add interesting data race test

This interesting testcase came up in #2192.
bors added a commit that referenced this issue Jun 5, 2022
add interesting data race test

This interesting testcase came up in #2192.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-data-race Area: data race detector C-support Category: Not necessarily a bug, but someone asking for support
Projects
None yet
Development

No branches or pull requests

4 participants