Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MaybeUninit: Document UnsafeCell byte ranges #121215

Closed
wants to merge 1 commit into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 9 additions & 1 deletion library/core/src/mem/maybe_uninit.rs
Original file line number Diff line number Diff line change
Expand Up @@ -209,7 +209,8 @@ use crate::slice;
///
/// # Layout
///
/// `MaybeUninit<T>` is guaranteed to have the same size, alignment, and ABI as `T`:
/// `MaybeUninit<T>` is guaranteed to have the same layout (size, alignment, and field offsets and layouts)
/// and ABI as `T`:
Comment on lines +212 to +213
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "field offsets and layouts" part is redundant because layouts are compositional. If the size and alignment is the same as T then by necessity the offset of T within MaybeUninit is zero. And any specific T is a T, so its field offsets are its field offsets.

This is no different than #[repr(transparent)] struct Wrapper(MyStruct)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think this is overly complicating things and feels more confusing to me than just saying that the size and alignment are the same. MaybeUninit is usually treated as having a single field itself (T) so it's confusing to me to talk about field offsets and layouts, especially after size/alignment (which is all a layout is).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem as I see it is twofold:

  • While we technically define "layout" to refer to size, alignment, and field offset/layout [1], it's colloquially often used to refer only to size and alignment, and so I think being verbose may help here

  • While const generics aren't stable today, you could define MaybeUninit as follows, which would be consistent with the size and alignment properties, but would not be consistent with the field offset/layout properties:

    union MaybeUninit<T> {
        uninit: (),
        value: [u8; size_of::<T>()],
    }

[1] I know this was officially documented within the past year, but I can't figure out where we did that...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • While const generics aren't stable today, you could define MaybeUninit as follows, which would be consistent with the size and alignment properties, but would not be consistent with the field offset/layout properties:
    union MaybeUninit<T> {
        uninit: (),
        value: [u8; size_of::<T>()],
    }

This would be a breaking change because the alignment of this type would be the same as u8 and not the same as T. playground

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That type also differs from the true MaybeUninit in terms of preserving all padding bytes (which true MaybeUninit does not preserve).

///
/// ```rust
/// use std::mem::{MaybeUninit, size_of, align_of};
Expand Down Expand Up @@ -241,6 +242,13 @@ use crate::slice;
/// remain `#[repr(transparent)]`. That said, `MaybeUninit<T>` will *always* guarantee that it has
/// the same size, alignment, and ABI as `T`; it's just that the way `MaybeUninit` implements that
/// guarantee may evolve.
///
/// ## `UnsafeCell`
///
/// As a result of having the same layout as `T`, `MaybeUninit<T>` has [`UnsafeCell`]s covering
/// the same byte ranges as `T`.
///
/// [`UnsafeCell`]: crate::cell::UnsafeCell
Comment on lines +246 to +251
Copy link
Member

@workingjubilee workingjubilee Feb 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eh? what does this mean?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's another way of expressing the same thing: for every byte offset, o, it is either the case that the byte at offset o in T is nested in an UnsafeCell and the byte at offset o in MaybeUninit<T> is nested in an UnsafeCell, or neither is. I'd definitely welcome suggestions on how to better word this.

The reason this is an important property is that if you operate on a given memory location using an UnsafeCell-typed reference at one point, and using a non-UnsafeCell-typed reference at another point, it's UB. In unsafe code, converting from &T to &MaybeUninit<T> (or vice versa) requires this property in order to prove that it's a sound conversion.

For example, consider this code:

#[repr(C)]
struct A {
    one: u8,
    two: UnsafeCell<[u8; 2]>,
}

#[repr(C)]
struct B {
    one: UnsafeCell<[u8; 2]>,
    two: u8,
}

fn main() {
    let a = A { one: 1, two: UnsafeCell::new([2, 3]) };
    let a_ptr: *const A = &a;
    let b = unsafe { &*(a_ptr as *const B) };
    println!("{:?}", unsafe { *b.one.get() });
}

If we run this under Miri, it detects UB:

error: Undefined Behavior: trying to retag from <1723> for SharedReadWrite permission at alloc816[0x0], but that tag only grants SharedReadOnly permission for this location
  --> src/main.rs:18:22
   |
18 |     let b = unsafe { &*(a_ptr as *const B) };
   |                      ^^^^^^^^^^^^^^^^^^^^^
   |                      |
   |                      trying to retag from <1723> for SharedReadWrite permission at alloc816[0x0], but that tag only grants SharedReadOnly permission for this location
   |                      this error occurs as part of retag at alloc816[0x0..0x3]
   |
   = help: this indicates a potential bug in the program: it performed an invalid operation, but the Stacked Borrows rules it violated are still experimental
   = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
help: <1723> was created by a SharedReadOnly retag at offsets [0x0..0x1]
  --> src/main.rs:17:27
   |
17 |     let a_ptr: *const A = &a;
   |                           ^^
   = note: BACKTRACE (of the first span):
   = note: inside `main` at src/main.rs:18:22: 18:43

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's another way of expressing the same thing: for every byte offset, o, it is either the case that the byte at offset o in T is nested in an UnsafeCell and the byte at offset o in MaybeUninit is nested in an UnsafeCell, or neither is. I'd definitely welcome suggestions on how to better word this.

OH! I see. I had to read that a few times but I think I get what you mean.

Perhaps something like:

Suggested change
/// ## `UnsafeCell`
///
/// As a result of having the same layout as `T`, `MaybeUninit<T>` has [`UnsafeCell`]s covering
/// the same byte ranges as `T`.
///
/// [`UnsafeCell`]: crate::cell::UnsafeCell
/// ## `MaybeUninit<UnsafeCell<Type>>`
///
/// For any `T` that has interior mutability due to having [`UnsafeCell`] wrapping some
/// or all parts of its layout, then `MaybeUninit<T>` also has interior mutability for
/// the same parts of its layout.
///
/// [`UnsafeCell`]: crate::cell::UnsafeCell

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@workingjubilee's suggestion looks good to me.

#[stable(feature = "maybe_uninit", since = "1.36.0")]
// Lang item so we can wrap other types in it. This is useful for coroutines.
#[lang = "maybe_uninit"]
Expand Down
Loading