Make Bytes & BytesMut compatible with ThreadSanitizer #405

Status: Closed · wants to merge 1 commit
6 changes: 4 additions & 2 deletions src/bytes.rs
@@ -7,7 +7,7 @@ use alloc::{borrow::Borrow, boxed::Box, string::String, vec::Vec};
use crate::buf::IntoIter;
#[allow(unused)]
use crate::loom::sync::atomic::AtomicMut;
-use crate::loom::sync::atomic::{self, AtomicPtr, AtomicUsize, Ordering};
+use crate::loom::sync::atomic::{AtomicPtr, AtomicUsize, Ordering};
use crate::Buf;

/// A reference counted contiguous slice of memory.
@@ -1046,7 +1046,9 @@ unsafe fn release_shared(ptr: *mut Shared) {
// > "acquire" operation before deleting the object.
//
// [1]: (www.boost.org/doc/libs/1_55_0/doc/html/atomic/usage_examples.html)
-atomic::fence(Ordering::Acquire);
+//
+// Use atomic load instead of fence for compatibility with ThreadSanitizer.
+(*ptr).ref_cnt.load(Ordering::Acquire);
Member:
This will force a second load of the ref_cnt, won't it?

Also, if we diverge from how std::sync::Arc does things, we should probably have a good reason and explain why.

Contributor Author:

Yes, there will be an additional load when the reference counter reaches zero.

I added a comment explaining why there is a load instead of a fence, so there is no need to go through git history to understand that.

To be clear, the only reason for doing this is compatibility with ThreadSanitizer. std::sync::Arc is currently implemented as proposed here, although there the implementation is selected conditionally. In the overall tokio ecosystem, this is the last remaining false positive I have seen reported by ThreadSanitizer.

Member: [comment not loaded]
Contributor Author (@tmiasko, Jul 3, 2020):
cfg(sanitize = "thread") is unstable so probably not. I don't think it is worth the complexity anyway.

On x86 the cost of this approach is a single mov from a location that was written a few instructions earlier, on a cold path that already has to do a bunch of work to deallocate the memory. Last time I looked, this was essentially unmeasurable for any real-world application. For weak memory models, where an acquire fence leads to actual codegen, the situation is even more ambiguous.
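For context, reproducing ThreadSanitizer reports like the one motivating this PR requires nightly Rust and rebuilding std with the sanitizer. A typical invocation looks roughly like this (the target triple and exact flags are illustrative, not taken from this PR's CI configuration):

```shell
# Sanitizers are nightly-only; -Zbuild-std needs the rust-src component.
rustup component add rust-src --toolchain nightly
RUSTFLAGS="-Zsanitizer=thread" \
    cargo +nightly test -Zbuild-std --target x86_64-unknown-linux-gnu
```

Without `-Zbuild-std`, the precompiled standard library is not instrumented, which itself can produce false positives.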

Member:

We could use cfg(bytes_ci_tsan) or something, setting the CARGO_CFG_BYTES_CI_TSAN=1 environment variable.

It seems it matters at least enough that the standard library does it. Are they wrong?
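The conditional approach being suggested could be sketched as follows. This is a hypothetical illustration using the proposed `bytes_ci_tsan` cfg name; std does the equivalent with the unstable `cfg(sanitize = "thread")` (rust-lang/rust#65097):

```rust
use std::sync::atomic::{fence, AtomicUsize, Ordering};

// Normal builds: a standalone acquire fence, which is free on x86.
#[cfg(not(bytes_ci_tsan))]
macro_rules! acquire {
    ($x:expr) => {{
        let _ = &$x; // evaluate the operand for parity with the load variant
        fence(Ordering::Acquire)
    }};
}

// TSan builds: an acquire load on the counter itself, which ThreadSanitizer
// can model (it does not understand standalone fences).
#[cfg(bytes_ci_tsan)]
macro_rules! acquire {
    ($x:expr) => {
        $x.load(Ordering::Acquire)
    };
}

fn main() {
    let ref_cnt = AtomicUsize::new(1);
    acquire!(ref_cnt);
    println!("ok");
}
```

The downside raised below is that a custom cfg only helps users who know to set it, unlike std's cfg, which is enabled automatically when compiling under the sanitizer.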

Contributor Author:

I don't think the approach from std is the best idea here. Requiring a custom cfg would be impractical.

Member:

After reading a bit about the change in libstd (rust-lang/rust#65097), I think we should care about the performance here, and only enable a load instead of a fence via a conditional config, to be used with tsan.

Contributor Author (@tmiasko, Jul 4, 2020):

I don't think this is measurable outside microbenchmarks. Note that most of that discussion is about a different implementation.

If doing this conditionally is the only acceptable implementation, then I suggest closing this. Doing it conditionally serves no purpose: if it doesn't work out of the box, it doesn't work, period. The situation in std is different, because there the cfg is enabled automatically when tsan is used during compilation.

Contributor Author:

To clarify one point: there is no measurable impact on any of the benchmarks here, and they do exercise this code. If something you noticed raised your concern, I can take a look again, but honestly, this change does not make any difference.


// Drop the data
Box::from_raw(ptr);
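The pattern this PR changes can be sketched in a standalone form. This is a minimal, simplified model of `release_shared` (the `bool` return is added here for demonstration only, and is not part of the PR):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

struct Shared {
    ref_cnt: AtomicUsize,
    #[allow(dead_code)]
    data: Vec<u8>,
}

// Returns true when this call dropped the last reference and freed the
// allocation (return value added for demonstration).
unsafe fn release_shared(ptr: *mut Shared) -> bool {
    if (*ptr).ref_cnt.fetch_sub(1, Ordering::Release) != 1 {
        return false;
    }
    // Before the PR: atomic::fence(Ordering::Acquire).
    // After the PR: an acquire load of the counter. Both establish a
    // happens-before edge with every earlier Release decrement, but
    // ThreadSanitizer does not model standalone fences, so only the load
    // keeps it from reporting a false-positive race on the drop below.
    (*ptr).ref_cnt.load(Ordering::Acquire);
    drop(Box::from_raw(ptr));
    true
}

fn main() {
    let ptr = Box::into_raw(Box::new(Shared {
        ref_cnt: AtomicUsize::new(2),
        data: vec![1, 2, 3],
    }));
    unsafe {
        assert!(!release_shared(ptr)); // 2 -> 1: still shared, no free
        assert!(release_shared(ptr)); // 1 -> 0: synchronize, then free
    }
    println!("ok");
}
```

The acquire step (fence or load) is what guarantees that all writes made by other threads before their final `Release` decrement are visible before the memory is deallocated.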
6 changes: 4 additions & 2 deletions src/bytes_mut.rs
@@ -15,7 +15,7 @@ use crate::buf::IntoIter;
use crate::bytes::Vtable;
#[allow(unused)]
use crate::loom::sync::atomic::AtomicMut;
-use crate::loom::sync::atomic::{self, AtomicPtr, AtomicUsize, Ordering};
+use crate::loom::sync::atomic::{AtomicPtr, AtomicUsize, Ordering};
use crate::{Buf, BufMut, Bytes};

/// A unique reference to a contiguous slice of memory.
@@ -1223,7 +1223,9 @@ unsafe fn release_shared(ptr: *mut Shared) {
// > "acquire" operation before deleting the object.
//
// [1]: (www.boost.org/doc/libs/1_55_0/doc/html/atomic/usage_examples.html)
-atomic::fence(Ordering::Acquire);
+//
+// Use atomic load instead of fence for compatibility with ThreadSanitizer.
+(*ptr).ref_count.load(Ordering::Acquire);

// Drop the data
Box::from_raw(ptr);
4 changes: 2 additions & 2 deletions src/loom.rs
@@ -1,7 +1,7 @@
#[cfg(not(all(test, loom)))]
pub(crate) mod sync {
pub(crate) mod atomic {
-pub(crate) use core::sync::atomic::{fence, AtomicPtr, AtomicUsize, Ordering};
+pub(crate) use core::sync::atomic::{AtomicPtr, AtomicUsize, Ordering};

pub(crate) trait AtomicMut<T> {
fn with_mut<F, R>(&mut self, f: F) -> R
@@ -23,7 +23,7 @@ pub(crate) mod sync {
#[cfg(all(test, loom))]
pub(crate) mod sync {
pub(crate) mod atomic {
-pub(crate) use loom::sync::atomic::{fence, AtomicPtr, AtomicUsize, Ordering};
+pub(crate) use loom::sync::atomic::{AtomicPtr, AtomicUsize, Ordering};

pub(crate) trait AtomicMut<T> {}
}