-
Notifications
You must be signed in to change notification settings - Fork 12.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MaybeUninit: Document UnsafeCell byte ranges #121215
Conversation
rustbot has assigned @Mark-Simulacrum. Use r? to explicitly pick a reviewer |
/// ## `UnsafeCell` | ||
/// | ||
/// As a result of having the same layout as `T`, `MaybeUninit<T>` has [`UnsafeCell`]s covering | ||
/// the same byte ranges as `T`. | ||
/// | ||
/// [`UnsafeCell`]: crate::cell::UnsafeCell |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
eh? what does this mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's another way of expressing the same thing: for every byte offset, o
, it is either the case that the byte at offset o
in T
is nested in an UnsafeCell
and the byte at offset o
in MaybeUninit<T>
is nested in an UnsafeCell
, or neither is. I'd definitely welcome suggestions on how to better word this.
The reason this is an important property is that if you operate on a given memory location using an UnsafeCell
-typed reference at one point, and using a non-UnsafeCell
-typed reference at another point, it's UB. In unsafe code, converting from &T
to &MaybeUninit<T>
(or vice versa) requires this property in order to prove that it's a sound conversion.
For example, consider this code:
#[repr(C)]
struct A {
one: u8,
two: UnsafeCell<[u8; 2]>,
}
#[repr(C)]
struct B {
one: UnsafeCell<[u8; 2]>,
two: u8,
}
fn main() {
let a = A { one: 1, two: UnsafeCell::new([2, 3]) };
let a_ptr: *const A = &a;
let b = unsafe { &*(a_ptr as *const B) };
println!("{:?}", unsafe { *b.one.get() });
}
If we run this under Miri, it detects UB:
error: Undefined Behavior: trying to retag from <1723> for SharedReadWrite permission at alloc816[0x0], but that tag only grants SharedReadOnly permission for this location
--> src/main.rs:18:22
|
18 | let b = unsafe { &*(a_ptr as *const B) };
| ^^^^^^^^^^^^^^^^^^^^^
| |
| trying to retag from <1723> for SharedReadWrite permission at alloc816[0x0], but that tag only grants SharedReadOnly permission for this location
| this error occurs as part of retag at alloc816[0x0..0x3]
|
= help: this indicates a potential bug in the program: it performed an invalid operation, but the Stacked Borrows rules it violated are still experimental
= help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
help: <1723> was created by a SharedReadOnly retag at offsets [0x0..0x1]
--> src/main.rs:17:27
|
17 | let a_ptr: *const A = &a;
| ^^
= note: BACKTRACE (of the first span):
= note: inside `main` at src/main.rs:18:22: 18:43
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's another way of expressing the same thing: for every byte offset, o, it is either the case that the byte at offset o in T is nested in an UnsafeCell and the byte at offset o in MaybeUninit is nested in an UnsafeCell, or neither is. I'd definitely welcome suggestions on how to better word this.
OH! I see. I had to read that a few times but I think I get what you mean.
Perhaps something like:
/// ## `UnsafeCell` | |
/// | |
/// As a result of having the same layout as `T`, `MaybeUninit<T>` has [`UnsafeCell`]s covering | |
/// the same byte ranges as `T`. | |
/// | |
/// [`UnsafeCell`]: crate::cell::UnsafeCell | |
/// ## `MaybeUninit<UnsafeCell<Type>>` | |
/// | |
/// For any `T` that has interior mutability due to having [`UnsafeCell`] wrapping some | |
/// or all parts of its layout, then `MaybeUninit<T>` also has interior mutability for | |
/// the same parts of its layout. | |
/// | |
/// [`UnsafeCell`]: crate::cell::UnsafeCell |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@workingjubilee's suggestion looks good to me.
/// `MaybeUninit<T>` is guaranteed to have the same layout (size, alignment, and field offsets and layouts) | ||
/// and ABI as `T`: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "field offsets and layouts" part is redundant because layouts are compositional. If the size and alignment is the same as T then by necessity the offset of T
within MaybeUninit
is zero. And any specific T
is a T
, so its field offsets are its field offsets.
This is no different than #[repr(transparent)] struct Wrapper(MyStruct)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think this is overly complicating things and feels more confusing to me than just saying that the size and alignment are the same. MaybeUninit is usually treated as having a single field itself (T) so it's confusing to me to talk about field offsets and layouts, especially after size/alignment (which is all a layout is).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem as I see it is twofold:
-
While we technically define "layout" to refer to size, alignment, and field offset/layout [1], it's colloquially often used to refer only to size and alignment, and so I think being verbose may help here
-
While const generics aren't stable today, you could define
MaybeUninit
as follows, which would be consistent with the size and alignment properties, but would not be consistent with the field offset/layout properties:union MaybeUninit<T> { uninit: (), value: [u8; size_of::<T>()], }
[1] I know this was officially documented within the past year, but I can't figure out where we did that...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- While const generics aren't stable today, you could define
MaybeUninit
as follows, which would be consistent with the size and alignment properties, but would not be consistent with the field offset/layout properties:union MaybeUninit<T> { uninit: (), value: [u8; size_of::<T>()], }
This would be a breaking change because the alignment of this type would be the same as u8 and not the same as T. playground
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That type also differs from the true MaybeUninit
in terms of preserving all padding bytes (which true MaybeUninit
does not preserve).
I'm going to hold off on working on this/trying to land it until rust-lang/unsafe-code-guidelines#495 is resolved. There are a number of questions there which will affect what wording is useful here. |
@joshlf this is now unblocked as the blocked pr is now merged, so we can proceed with this |
@joshlf |
No description provided.