
Commit 5e4eab4

Rollup merge of #128778 - RalfJung:atomic-read-read-races, r=Mark-Simulacrum
atomics: allow atomic and non-atomic reads to race

We currently define our atomics in terms of C++ `atomic_ref`. That has the unfortunate side-effect of making it UB for an atomic and a non-atomic read to race (concretely, [this code](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=d1a743774e60923db33def7fe314d754) has UB). There's really no good reason for this; all the academic models of the C++ memory model I am aware of allow it -- C++ just disallows this because of their insistence on an "object model" with typed memory, where `atomic_ref` temporarily creates an "atomic object" that may not be accessed via regular non-atomic operations.

So instead of tying our operations to `atomic_ref`, let us tie them directly to the underlying C++ memory model. I am not sure what is the best way to phrase this, so here's a first attempt.

We also carve out an exception from the "no mixed-size atomic accesses" rule to permit mixed-size atomic reads -- given that we permit mixed-size non-atomic reads, it seems odd that this would be disallowed for atomic reads. However, when an atomic write races with any other atomic operation, they must use the same size.

With this change, it is finally the case that every non-atomic access can be replaced by an atomic access without introducing UB.

Cc `@rust-lang/opsem` `@chorman0773` `@m-ou-se` `@WaffleLapkin` `@Amanieu`

Fixes rust-lang/unsafe-code-guidelines#483
2 parents 612796c + 96be76b commit 5e4eab4

20 files changed: +318 -310 lines
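To make the core point of the commit message concrete, here is a minimal sketch (an illustrative reconstruction, not the linked playground snippet) of an atomic read racing with a non-atomic read of the same memory, which is the pattern this change makes well-defined:

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

fn main() {
    let atomic = AtomicU32::new(0);
    thread::scope(|s| {
        // Atomic read.
        s.spawn(|| atomic.load(Ordering::Relaxed));
        // Non-atomic read of the same memory. Under the old `atomic_ref`-based wording
        // this race was documented as UB; under the new rules it is fine, because both
        // accesses are reads and therefore do not conflict.
        s.spawn(|| unsafe { atomic.as_ptr().read() });
    });
}
```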

library/core/src/cell.rs

+6 -4

@@ -1895,11 +1895,17 @@ impl<T: ?Sized + fmt::Display> fmt::Display for RefMut<'_, T> {
 /// uniqueness guarantee for mutable references is unaffected. There is *no* legal way to obtain
 /// aliasing `&mut`, not even with `UnsafeCell<T>`.
 ///
+/// `UnsafeCell` does nothing to avoid data races; they are still undefined behavior. If multiple
+/// threads have access to the same `UnsafeCell`, they must follow the usual rules of the
+/// [concurrent memory model]: conflicting non-synchronized accesses must be done via the APIs in
+/// [`core::sync::atomic`].
+///
 /// The `UnsafeCell` API itself is technically very simple: [`.get()`] gives you a raw pointer
 /// `*mut T` to its contents. It is up to _you_ as the abstraction designer to use that raw pointer
 /// correctly.
 ///
 /// [`.get()`]: `UnsafeCell::get`
+/// [concurrent memory model]: ../sync/atomic/index.html#memory-model-for-atomic-accesses
 ///
 /// The precise Rust aliasing rules are somewhat in flux, but the main points are not contentious:
 ///
@@ -1922,10 +1928,6 @@ impl<T: ?Sized + fmt::Display> fmt::Display for RefMut<'_, T> {
 /// live memory and the compiler is allowed to insert spurious reads if it can prove that this
 /// memory has not yet been deallocated.
 ///
-/// - At all times, you must avoid data races. If multiple threads have access to
-///   the same `UnsafeCell`, then any writes must have a proper happens-before relation to all other
-///   accesses (or use atomics).
-///
 /// To assist with proper design, the following scenarios are explicitly declared legal
 /// for single-threaded code:
 ///
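To illustrate the rule the new `UnsafeCell` docs point at, here is a small sketch (not part of the diff; the `Racy` wrapper exists only to make the cell shareable, and, like the `no_run undefined_behavior` examples in the atomics docs, the racy half is for illustration and should not actually be executed): unsynchronized conflicting accesses through a shared `UnsafeCell` are a data race, while routing the conflicting accesses through `core::sync::atomic` is fine.

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

// Wrapper so the UnsafeCell can be shared across threads at all (UnsafeCell is !Sync).
struct Racy(UnsafeCell<u32>);
unsafe impl Sync for Racy {}

fn main() {
    let cell = Racy(UnsafeCell::new(0));
    thread::scope(|s| {
        // Data race (UB): conflicting non-synchronized accesses, neither of them atomic.
        s.spawn(|| unsafe { *cell.0.get() = 1 }); // non-atomic write
        s.spawn(|| unsafe { *cell.0.get() }); // non-atomic read
    });

    let atomic = AtomicU32::new(0);
    thread::scope(|s| {
        // Fine: the conflicting accesses go through the atomic API.
        s.spawn(|| atomic.store(1, Ordering::Relaxed));
        s.spawn(|| atomic.load(Ordering::Relaxed));
    });
}
```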

library/core/src/sync/atomic.rs

+50 -31

@@ -24,25 +24,42 @@
 //!
 //! ## Memory model for atomic accesses
 //!
-//! Rust atomics currently follow the same rules as [C++20 atomics][cpp], specifically `atomic_ref`.
-//! Basically, creating a *shared reference* to one of the Rust atomic types corresponds to creating
-//! an `atomic_ref` in C++; the `atomic_ref` is destroyed when the lifetime of the shared reference
-//! ends. A Rust atomic type that is exclusively owned or behind a mutable reference does *not*
-//! correspond to an “atomic object” in C++, since the underlying primitive can be mutably accessed,
-//! for example with `get_mut`, to perform non-atomic operations.
+//! Rust atomics currently follow the same rules as [C++20 atomics][cpp], specifically the rules
+//! from the [`intro.races`][cpp-intro.races] section, without the "consume" memory ordering. Since
+//! C++ uses an object-based memory model whereas Rust is access-based, a bit of translation work
+//! has to be done to apply the C++ rules to Rust: whenever C++ talks about "the value of an
+//! object", we understand that to mean the resulting bytes obtained when doing a read. When the C++
+//! standard talks about "the value of an atomic object", this refers to the result of doing an
+//! atomic load (via the operations provided in this module). A "modification of an atomic object"
+//! refers to an atomic store.
 //!
-//! [cpp]: https://en.cppreference.com/w/cpp/atomic
+//! The end result is *almost* equivalent to saying that creating a *shared reference* to one of the
+//! Rust atomic types corresponds to creating an `atomic_ref` in C++, with the `atomic_ref` being
+//! destroyed when the lifetime of the shared reference ends. The main difference is that Rust
+//! permits concurrent atomic and non-atomic reads to the same memory as those cause no issue in the
+//! C++ memory model, they are just forbidden in C++ because memory is partitioned into "atomic
+//! objects" and "non-atomic objects" (with `atomic_ref` temporarily converting a non-atomic object
+//! into an atomic object).
+//!
+//! The most important aspect of this model is that *data races* are undefined behavior. A data race
+//! is defined as conflicting non-synchronized accesses where at least one of the accesses is
+//! non-atomic. Here, accesses are *conflicting* if they affect overlapping regions of memory and at
+//! least one of them is a write. They are *non-synchronized* if neither of them *happens-before*
+//! the other, according to the happens-before order of the memory model.
 //!
-//! Each method takes an [`Ordering`] which represents the strength of
-//! the memory barrier for that operation. These orderings are the
-//! same as the [C++20 atomic orderings][1]. For more information see the [nomicon][2].
+//! The other possible cause of undefined behavior in the memory model are mixed-size accesses: Rust
+//! inherits the C++ limitation that non-synchronized conflicting atomic accesses may not partially
+//! overlap. In other words, every pair of non-synchronized atomic accesses must be either disjoint,
+//! access the exact same memory (including using the same access size), or both be reads.
 //!
-//! [1]: https://en.cppreference.com/w/cpp/atomic/memory_order
-//! [2]: ../../../nomicon/atomics.html
+//! Each atomic access takes an [`Ordering`] which defines how the operation interacts with the
+//! happens-before order. These orderings behave the same as the corresponding [C++20 atomic
+//! orderings][cpp_memory_order]. For more information, see the [nomicon].
 //!
-//! Since C++ does not support mixing atomic and non-atomic accesses, or non-synchronized
-//! different-sized accesses to the same data, Rust does not support those operations either.
-//! Note that both of those restrictions only apply if the accesses are non-synchronized.
+//! [cpp]: https://en.cppreference.com/w/cpp/atomic
+//! [cpp-intro.races]: https://timsong-cpp.github.io/cppwp/n4868/intro.multithread#intro.races
+//! [cpp_memory_order]: https://en.cppreference.com/w/cpp/atomic/memory_order
+//! [nomicon]: ../../../nomicon/atomics.html
 //!
 //! ```rust,no_run undefined_behavior
 //! use std::sync::atomic::{AtomicU16, AtomicU8, Ordering};
@@ -52,27 +69,29 @@
 //! let atomic = AtomicU16::new(0);
 //!
 //! thread::scope(|s| {
-//!     // This is UB: mixing atomic and non-atomic accesses
-//!     s.spawn(|| atomic.store(1, Ordering::Relaxed));
-//!     s.spawn(|| unsafe { atomic.as_ptr().write(2) });
+//!     // This is UB: conflicting non-synchronized accesses, at least one of which is non-atomic.
+//!     s.spawn(|| atomic.store(1, Ordering::Relaxed)); // atomic store
+//!     s.spawn(|| unsafe { atomic.as_ptr().write(2) }); // non-atomic write
 //! });
 //!
 //! thread::scope(|s| {
-//!     // This is UB: even reads are not allowed to be mixed
-//!     s.spawn(|| atomic.load(Ordering::Relaxed));
-//!     s.spawn(|| unsafe { atomic.as_ptr().read() });
+//!     // This is fine: the accesses do not conflict (as none of them performs any modification).
+//!     // In C++ this would be disallowed since creating an `atomic_ref` precludes
+//!     // further non-atomic accesses, but Rust does not have that limitation.
+//!     s.spawn(|| atomic.load(Ordering::Relaxed)); // atomic load
+//!     s.spawn(|| unsafe { atomic.as_ptr().read() }); // non-atomic read
 //! });
 //!
 //! thread::scope(|s| {
-//!     // This is fine, `join` synchronizes the code in a way such that atomic
-//!     // and non-atomic accesses can't happen "at the same time"
-//!     let handle = s.spawn(|| atomic.store(1, Ordering::Relaxed));
-//!     handle.join().unwrap();
-//!     s.spawn(|| unsafe { atomic.as_ptr().write(2) });
+//!     // This is fine: `join` synchronizes the code in a way such that the atomic
+//!     // store happens-before the non-atomic write.
+//!     let handle = s.spawn(|| atomic.store(1, Ordering::Relaxed)); // atomic store
+//!     handle.join().unwrap(); // synchronize
+//!     s.spawn(|| unsafe { atomic.as_ptr().write(2) }); // non-atomic write
 //! });
 //!
 //! thread::scope(|s| {
-//!     // This is UB: using different-sized atomic accesses to the same data
+//!     // This is UB: non-synchronized conflicting differently-sized atomic accesses.
 //!     s.spawn(|| atomic.store(1, Ordering::Relaxed));
 //!     s.spawn(|| unsafe {
 //!         let differently_sized = transmute::<&AtomicU16, &AtomicU8>(&atomic);
@@ -81,8 +100,8 @@
 //! });
 //!
 //! thread::scope(|s| {
-//!     // This is fine, `join` synchronizes the code in a way such that
-//!     // differently-sized accesses can't happen "at the same time"
+//!     // This is fine: `join` synchronizes the code in a way such that
+//!     // the 1-byte store happens-before the 2-byte store.
 //!     let handle = s.spawn(|| atomic.store(1, Ordering::Relaxed));
 //!     handle.join().unwrap();
 //!     s.spawn(|| unsafe {
@@ -137,7 +156,7 @@
 //!
 //! # Atomic accesses to read-only memory
 //!
-//! In general, *all* atomic accesses on read-only memory are Undefined Behavior. For instance, attempting
+//! In general, *all* atomic accesses on read-only memory are undefined behavior. For instance, attempting
 //! to do a `compare_exchange` that will definitely fail (making it conceptually a read-only
 //! operation) can still cause a segmentation fault if the underlying memory page is mapped read-only. Since
 //! atomic `load`s might be implemented using compare-exchange operations, even a `load` can fault
@@ -153,7 +172,7 @@
 //!
 //! As an exception from the general rule stated above, "sufficiently small" atomic loads with
 //! `Ordering::Relaxed` are implemented in a way that works on read-only memory, and are hence not
-//! Undefined Behavior. The exact size limit for what makes a load "sufficiently small" varies
+//! undefined behavior. The exact size limit for what makes a load "sufficiently small" varies
 //! depending on the target:
 //!
 //! | `target_arch` | Size limit |
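As a complement to the examples above, here is a sketch in the same spirit as the module's `no_run` examples (it is not part of the diff): the mixed-size carve-out means that two racing atomic *reads* of different sizes are allowed, since differently-sized atomic accesses only conflict when one of them is a write.

```rust
use std::mem::transmute;
use std::sync::atomic::{AtomicU16, AtomicU8, Ordering};
use std::thread;

fn main() {
    let atomic = AtomicU16::new(0);
    thread::scope(|s| {
        // Fine under the new rules: both racing accesses are atomic reads, so they may
        // use different sizes ("... or both be reads").
        s.spawn(|| atomic.load(Ordering::Relaxed)); // 2-byte atomic load
        s.spawn(|| unsafe {
            let differently_sized = transmute::<&AtomicU16, &AtomicU8>(&atomic);
            differently_sized.load(Ordering::Relaxed) // 1-byte atomic load
        });
    });
}
```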

src/tools/miri/src/concurrency/data_race.rs

+33 -30

@@ -191,7 +191,8 @@ struct AtomicMemoryCellClocks {
     /// The size of accesses to this atomic location.
     /// We use this to detect non-synchronized mixed-size accesses. Since all accesses must be
     /// aligned to their size, this is sufficient to detect imperfectly overlapping accesses.
-    size: Size,
+    /// `None` indicates that we saw multiple different sizes, which is okay as long as all accesses are reads.
+    size: Option<Size>,
 }

 #[derive(Copy, Clone, PartialEq, Eq, Debug)]
@@ -265,6 +266,14 @@ impl AccessType {
         let mut msg = String::new();

         if let Some(size) = size {
+            if size == Size::ZERO {
+                // In this case there were multiple read accesss with different sizes and then a write.
+                // We will be reporting *one* of the other reads, but we don't have enough information
+                // to determine which one had which size.
+                assert!(self == AccessType::AtomicLoad);
+                assert!(ty.is_none());
+                return format!("multiple differently-sized atomic loads, including one load");
+            }
             msg.push_str(&format!("{}-byte {}", size.bytes(), msg))
         }

@@ -305,8 +314,7 @@
     }
 }

-/// Memory Cell vector clock metadata
-/// for data-race detection.
+/// Per-byte vector clock metadata for data-race detection.
 #[derive(Clone, PartialEq, Eq, Debug)]
 struct MemoryCellClocks {
     /// The vector-clock timestamp and the thread that did the last non-atomic write. We don't need
@@ -325,8 +333,8 @@ struct MemoryCellClocks {
     read: VClock,

     /// Atomic access, acquire, release sequence tracking clocks.
-    /// For non-atomic memory in the common case this
-    /// value is set to None.
+    /// For non-atomic memory this value is set to None.
+    /// For atomic memory, each byte carries this information.
     atomic_ops: Option<Box<AtomicMemoryCellClocks>>,
 }

@@ -336,7 +344,7 @@ impl AtomicMemoryCellClocks {
             read_vector: Default::default(),
             write_vector: Default::default(),
             sync_vector: Default::default(),
-            size,
+            size: Some(size),
         }
     }
 }
@@ -383,17 +391,23 @@ impl MemoryCellClocks {
         &mut self,
         thread_clocks: &ThreadClockSet,
         size: Size,
+        write: bool,
     ) -> Result<&mut AtomicMemoryCellClocks, DataRace> {
         match self.atomic_ops {
             Some(ref mut atomic) => {
                 // We are good if the size is the same or all atomic accesses are before our current time.
-                if atomic.size == size {
+                if atomic.size == Some(size) {
                     Ok(atomic)
                 } else if atomic.read_vector <= thread_clocks.clock
                     && atomic.write_vector <= thread_clocks.clock
                 {
-                    // This is now the new size that must be used for accesses here.
-                    atomic.size = size;
+                    // We are fully ordered after all previous accesses, so we can change the size.
+                    atomic.size = Some(size);
+                    Ok(atomic)
+                } else if !write && atomic.write_vector <= thread_clocks.clock {
+                    // This is a read, and it is ordered after the last write. It's okay for the
+                    // sizes to mismatch, as long as no writes with a different size occur later.
+                    atomic.size = None;
                     Ok(atomic)
                 } else {
                     Err(DataRace)
@@ -499,7 +513,7 @@ impl MemoryCellClocks {
         Ok(())
     }

-    /// Detect data-races with an atomic read, caused by a non-atomic access that does
+    /// Detect data-races with an atomic read, caused by a non-atomic write that does
     /// not happen-before the atomic-read.
     fn atomic_read_detect(
         &mut self,
@@ -508,14 +522,10 @@
         access_size: Size,
     ) -> Result<(), DataRace> {
         trace!("Atomic read with vectors: {:#?} :: {:#?}", self, thread_clocks);
-        let atomic = self.atomic_access(thread_clocks, access_size)?;
+        let atomic = self.atomic_access(thread_clocks, access_size, /*write*/ false)?;
         atomic.read_vector.set_at_index(&thread_clocks.clock, index);
-        // Make sure the last non-atomic write and all non-atomic reads were before this access.
-        if self.write_was_before(&thread_clocks.clock) && self.read <= thread_clocks.clock {
-            Ok(())
-        } else {
-            Err(DataRace)
-        }
+        // Make sure the last non-atomic write was before this access.
+        if self.write_was_before(&thread_clocks.clock) { Ok(()) } else { Err(DataRace) }
     }

     /// Detect data-races with an atomic write, either with a non-atomic read or with
@@ -527,7 +537,7 @@
         access_size: Size,
     ) -> Result<(), DataRace> {
         trace!("Atomic write with vectors: {:#?} :: {:#?}", self, thread_clocks);
-        let atomic = self.atomic_access(thread_clocks, access_size)?;
+        let atomic = self.atomic_access(thread_clocks, access_size, /*write*/ true)?;
         atomic.write_vector.set_at_index(&thread_clocks.clock, index);
         // Make sure the last non-atomic write and all non-atomic reads were before this access.
         if self.write_was_before(&thread_clocks.clock) && self.read <= thread_clocks.clock {
@@ -552,11 +562,9 @@
         }
         thread_clocks.clock.index_mut(index).set_read_type(read_type);
         if self.write_was_before(&thread_clocks.clock) {
+            // We must be ordered-after all atomic writes.
            let race_free = if let Some(atomic) = self.atomic() {
-                // We must be ordered-after all atomic accesses, reads and writes.
-                // This ensures we don't mix atomic and non-atomic accesses.
                atomic.write_vector <= thread_clocks.clock
-                    && atomic.read_vector <= thread_clocks.clock
            } else {
                true
            };
@@ -957,9 +965,7 @@
        let mut other_size = None; // if `Some`, this was a size-mismatch race
        let write_clock;
        let (other_access, other_thread, other_clock) =
-            // First check the atomic-nonatomic cases. If it looks like multiple
-            // cases apply, this one should take precedence, else it might look like
-            // we are reporting races between two non-atomic reads.
+            // First check the atomic-nonatomic cases.
            if !access.is_atomic() &&
                let Some(atomic) = mem_clocks.atomic() &&
                let Some(idx) = Self::find_gt_index(&atomic.write_vector, &active_clocks.clock)
@@ -977,10 +983,10 @@
            } else if let Some(idx) = Self::find_gt_index(&mem_clocks.read, &active_clocks.clock) {
                (AccessType::NaRead(mem_clocks.read[idx].read_type()), idx, &mem_clocks.read)
            // Finally, mixed-size races.
-            } else if access.is_atomic() && let Some(atomic) = mem_clocks.atomic() && atomic.size != access_size {
+            } else if access.is_atomic() && let Some(atomic) = mem_clocks.atomic() && atomic.size != Some(access_size) {
                // This is only a race if we are not synchronized with all atomic accesses, so find
                // the one we are not synchronized with.
-                other_size = Some(atomic.size);
+                other_size = Some(atomic.size.unwrap_or(Size::ZERO));
                if let Some(idx) = Self::find_gt_index(&atomic.write_vector, &active_clocks.clock)
                {
                    (AccessType::AtomicStore, idx, &atomic.write_vector)
@@ -1007,10 +1013,7 @@
            assert!(!involves_non_atomic);
            Some("overlapping unsynchronized atomic accesses must use the same access size")
        } else if access.is_read() && other_access.is_read() {
-            assert!(involves_non_atomic);
-            Some(
-                "overlapping atomic and non-atomic accesses must be synchronized, even if both are read-only",
-            )
+            panic!("there should be no same-size read-read races")
        } else {
            None
        };
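Putting the detector changes together, here is a hedged sketch (not a test from this commit) of the situation the new `Option<Size>` tracking and the `Size::ZERO` reporting path address: racing atomic loads of different sizes are accepted on their own, but an unsynchronized atomic store then makes the mixed-size check fire, and once loads of several sizes have been seen the "other" access can only be described as multiple differently-sized atomic loads.

```rust
use std::mem::transmute;
use std::sync::atomic::{AtomicU16, AtomicU8, Ordering};
use std::thread;

fn main() {
    let atomic = AtomicU16::new(0);
    thread::scope(|s| {
        // Racing atomic loads of different sizes: no longer an error by themselves,
        // but they leave the tracked access size for this location as `None`.
        s.spawn(|| atomic.load(Ordering::Relaxed)); // 2-byte load
        s.spawn(|| unsafe {
            transmute::<&AtomicU16, &AtomicU8>(&atomic).load(Ordering::Relaxed) // 1-byte load
        });
        // An unsynchronized atomic store still triggers the mixed-size check; depending on
        // the interleaving Miri observes, the reported "other" access may be described as
        // "multiple differently-sized atomic loads".
        s.spawn(|| atomic.store(1, Ordering::Relaxed)); // 2-byte store
    });
}
```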

src/tools/miri/tests/fail/data_race/mixed_size_read.rs

-28
This file was deleted.
