Is it sound to check whether the bytes of an `Option<&T>` are zero?
#488
Comments
If it's definitely either None or Some then it's definitely fully initialized because that is the point of Rust enums. I'm unclear why you'd do this transmute instead of matching on the value directly or using |
Our starting point is a glorified
...but we don't know the referent is a validly initialized If you're curious, here's our current stab at a proof, which involves several stages of somewhat-justified hoop-jumping in order to call |
I agree, except that IIUC there's one extra step required: all bytes of a |
Alignment seems to be the key point here? I am not sure which other part of the validity invariant of You are right to be cautious with loading this as a

```rust
unsafe fn is_none<T>(ptr: *const Option<&T>) -> bool {
    // Read the referent as a thin raw pointer instead of as an `Option<&T>`.
    ptr.cast::<*const ()>().read().is_null()
}
```
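To make the suggestion above concrete, here is a minimal usage sketch. It only exercises the raw-pointer read on referents that are already valid `Option<&u32>` values, so it does not settle the question for referents that are merely "as initialized". The name `is_none_via_raw_read` and the safety contract in its doc comment are assumptions made for illustration, not something stated in the thread.

```rust
/// Same shape as the snippet above, under a hypothetical name.
///
/// # Safety (assumed contract, for illustration only)
/// `ptr` must be non-null, aligned for a pointer, and its referent must be
/// readable as a `*const ()`.
unsafe fn is_none_via_raw_read<T>(ptr: *const Option<&T>) -> bool {
    // Load the referent as a raw pointer and compare it against null.
    // A raw pointer value has no non-null or alignment requirement of its own.
    unsafe { ptr.cast::<*const ()>().read().is_null() }
}

fn main() {
    let value = 5u32;
    let some: Option<&u32> = Some(&value);
    let none: Option<&u32> = None;

    // `&some` coerces from `&Option<&u32>` to `*const Option<&u32>`.
    assert!(!unsafe { is_none_via_raw_read(&some) });
    assert!(unsafe { is_none_via_raw_read(&none) });
}
```

Because the value is loaded at a raw-pointer type, the read never has to produce an integer from pointer bytes.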
Ah that hadn't occurred to me! That seems much more clearly reasonable on its surface. I'm pretty sure this is sound today, but is it guaranteed to always be sound? IIUC, this relies on:
While this isn't a soundness concern, the correctness of this function relies on the fact that there is only one bit representation for |
So, are you thinking that, perhaps similar to |
Given that all bytes of |
Yeah, exactly. Obviously I don't actually think that'd ever happen, but technically I don't think the docs currently rule it out.
Is your thinking that that's guaranteed by this? (Edit: as far as I can tell, that section only guarantees size and alignment, but nothing about which bytes are initialized.) |
My thinking is just that this is "obviously" the case, but I don't know what exactly is stably documented where. It is guaranteed by the MiniRust representation relation, but that doesn't help you. It's hard to be precise in a spec without fully committing to all the details. |
Yeah, that makes sense. While we're on the subject, maybe you can clear something up for me. It seems inconsistent to say that we can view the bytes of a pointer (ie, |
My own understanding is that ptr2int transmutes are sound, but they strip provenance and so you can't transmute back to a pointer later and get a usable pointer. For simply comparing the int to 0 it should be sound to transmute (again, if my understanding is still up to date). |
Correct. Viewing the bytes of a pointer also does ptr2int transmute and is hence on equally uncharted ground. The t-opsem working consensus is what @Lokathor said, but so far we haven't felt ready to stably commit to that, and the lang team hasn't blessed this. |
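For contrast with the undecided transmute case, the following sketch shows two ways of comparing a raw pointer against zero that do not involve a ptr2int transmute at all: an `as usize` cast and `is_null`. The function names are hypothetical, and the sketch assumes you already hold a raw pointer value rather than untyped bytes.

```rust
/// Hypothetical helper: compare the address with zero using an `as` cast.
/// This is a cast, not a transmute, so it is outside the question discussed above.
fn is_zero_via_cast<T>(ptr: *const T) -> bool {
    ptr as usize == 0
}

/// Hypothetical helper: the same check without producing an integer at all.
fn is_zero_via_is_null<T>(ptr: *const T) -> bool {
    ptr.is_null()
}

fn main() {
    let x = 1u8;
    let valid: *const u8 = &x;
    let null: *const u8 = std::ptr::null();

    assert!(!is_zero_via_cast(valid));
    assert!(!is_zero_via_is_null(valid));
    assert!(is_zero_via_cast(null));
    assert!(is_zero_via_is_null(null));
}
```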
Would it be easy to articulate what degrees of freedom there are in the design space that make this a not-yet-decided question? In other words, what could cause us to decide that ptr2int is UB in itself (rather than merely producing a pointer which is not particularly useful, and on which further operations are likely to be UB)? In zerocopy, we have a lot of consumers who want to be able to look at the bytes of a pointer, so being able to make progress on this would be great. I'd be happy to do some of the work to move it forward if the gaps are well-known. |
Well, |
Is there a possible world in which |
Is it guaranteed that |
Currently, pointers don't contain uninit bytes. I guess there is some possible future (eg: a new arch becomes popular many years from now) where pointers somehow contain an uninit byte. That seems unlikely, but if we want to worry about the absolute limits of possibility, I suppose it's possible. |
Yeah, that's exactly our concern. Our goal with zerocopy is to only rely on properties that we know won't be walked back in the future so we can credibly claim that "if your code is sound under Rust version X, it will be sound under all Rust versions Y > X." It means we end up being very pedantic about what is actually guaranteed 😛 |
Yes, that is very possible. It is, in my eyes, extremely unlikely that we will consider this transmute a ptr2int cast. ptr2int casts cannot be dead-code eliminated, and every pointer load is a potential transmutation site, and I am sure that we want to be able to remove dead loads. We might end up special-casing transmute, which would make transmute not equivalent to "just load through a differently-typed raw pointer", but I'd prefer to not do that. Currently I consider "ptr2int transmute is the same as |
No, that's not decided yet. If T contains bytes with provenance, we may also say that such a transmute is not allowed. |
That would be an unfortunate breaking change to a lot of existing code, but I suppose it's possible, true. |
Do we have a collection of such code? |
Not at hand. And "a lot" is probably overstating it. I've definitely seen people doing it before to inspect the bytes of an object, probably for the same reasons that the |
My guess is that this shows up primarily in places where you're communicating with another piece of code that shares access to a particular memory space. Think FFI, kernel/userland boundary, IPC with shared memory maps, etc. The most notable use case for zerocopy's users (that I'm aware of) is a userland process which emulates the Linux kernel. |
Even just a debug info display might read a pointer as bytes and show it. |
Co-authored with @jswrenn.
In zerocopy, we have a situation where we have a `*const Option<&T>`. We know that the referent bytes are "as initialized" as the bytes of an `Option<&T>`, but not necessarily that they are a bit-valid `Option<&T>`. By "as initialized", we mean that one of the two is true:

- `Option::<&T>::None`
- `Some`, and its bytes are initialized wherever `&T`'s bytes are initialized

What we need to do is check whether the referent contains all zeroed bytes. If it does, we can soundly treat those bytes as containing an `Option::<&T>::None` thanks to the null pointer optimization (NPO), which guarantees the layout of this specific value.

Our problem is this: We're not sure whether it's sound to look at all of the bytes (in other words, to transmute from `Option<&T>` to `[u8; size_of::<Option<&T>>()]`) in order to check that they're all zero. Another option we considered was round-tripping via `Option<NonNull<T>>` and then using `NonNull::addr` to extract the address.

Any guidance on whether transmuting `Option<&T>` to either `[u8; size_of::<Option<&T>>()]` or to `Option<NonNull<T>>` is sound?
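As a rough illustration of the layout facts this question leans on, the sketch below checks the NPO size and alignment equalities and then extracts an address through `Option<NonNull<T>>`. It works on values that are already valid and converts with safe code, so it is not the transmute being asked about. The helper `addr_or_zero` is hypothetical, and it uses an `as usize` cast where the post mentions `NonNull::addr`.

```rust
use std::mem::{align_of, size_of};
use std::ptr::NonNull;

/// Hypothetical helper: the address held by an `Option<NonNull<T>>`,
/// or 0 for `None`.
fn addr_or_zero<T>(ptr: Option<NonNull<T>>) -> usize {
    ptr.map_or(0, |nn| nn.as_ptr() as usize)
}

fn main() {
    // NPO: wrapping these pointer types in `Option` changes neither size nor alignment.
    assert_eq!(size_of::<Option<&u64>>(), size_of::<&u64>());
    assert_eq!(align_of::<Option<&u64>>(), align_of::<&u64>());
    assert_eq!(size_of::<Option<NonNull<u64>>>(), size_of::<*const u64>());

    let x = 7u64;
    let some: Option<&u64> = Some(&x);
    let none: Option<&u64> = None;

    // Safe conversion `Option<&T>` -> `Option<NonNull<T>>`, then extract the address.
    assert_ne!(addr_or_zero(some.map(NonNull::from)), 0);
    assert_eq!(addr_or_zero(none.map(NonNull::from)), 0);
}
```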