A possible path forward for padding bytes #32
Comments
As an alternative to introducing an explicit |
Careful: "manipulating" is and likely will be UB, in the sense that any operation other than copying, like The only thing that looks like it might change is that a copy of data at integer type (such as what happens when returning an integer from a function or assigning it) preserves uninitialized bytes rather than triggering UB when encountering one. (To be more precise, if any input byte is uninitialized, then the entire resulting integer is.) |
Mmm... that's weaker than I'd like. If I send a [u8] to a safe API like Write, I want that to be safe no matter what said API's implementation does. Otherwise it's not really okay. So maybe we do need freeze() for UB-free abomonation. |
That's impossible. It is trivial to cause UB with LLVM by e.g. accessing an array with an uninitialized integer as an offset. (When you check that integer to be in-bounds and when you later use it to access the array, the value doesn't have to be consistent both times!) Exposing uninit data to safe untrusted code at integer type is not sound under any proposal I am aware of (and that even libstd recognized ;). TBH |
So, what else would you propose as a way forward for abomonation's implementation? |
If you need to expose uninit data to untrusted code, I propose you do a semver-breaking change to use a type (like |
Cannot be done if the data is to be sent somewhere via |
IIRC, @frankmcsherry uses Abomonation for maximally efficient network data transfers between multiple copies of a Rust program, MPI-style. In this scenario, replacing |
Oh I see, you want to write uninit data to a file? Interesting. Somehow that use-case didn't occur to me yet. Makes one wonder if C allows uninit data in a buffer used for I do not recall writes being brought up in the rust-lang/rust#42788 discussion. And it's not really the same thing. So, seems worth opening a new issue? I am not aware of one. This is a T-libs issue: right now, it's not possible to write generic code (abstracting over The interesting part is that for |
Hm okay, there is an alternative: add an |
In C, Note that I can personally think of other uses for freeze, including seqlocks and reads from untrusted memory, though both of these use cases would additionally require a spec amendment so that racy reads are "only" defined to return mem::uninitialized instead of being UB... |
In C the status of padding is basically not defined. Uninitialized memory turns into "indeterminate values" which are also awfully under-specified. So, I'd say in C the rule is whatever the compiler implements and the standard is of no help (as usual).
Seqlocks don't need freeze AFAIK, they just need "read-write races are not UB but return uninit". A seqlock never looks at the uninit data. Interaction with untrusted memory is specifically what I mean when I said "hack to work around other limitations"; but we are digressing. ;) |
More discussion on this topic is ongoing at https://internals.rust-lang.org/t/writing-down-binary-data-with-padding-bytes/11197/ . Currently, the consensus (or at least the subset of it which is useful for the abomonation use case) seems to be moving towards some kind of |
So, I've had a quick chat with @RalfJung about our padding bytes problem, and I think I now have a decent grasp of what we need in order to resolve that particular UB in abomonation.
Padding bytes are uninitialized memory, and we now have a safe way to model that in Rust, in the form of `MaybeUninit`. So we can take a first step towards handling them correctly today by casting `&[T]` into `&[MaybeUninit<u8>]` instead of `&[u8]`.
This is enough to memcpy the bytes into another `&mut [MaybeUninit<u8>]` slice. But it's not yet enough to expose our uninitialized bytes to the outside world, e.g. for the purpose of sending them to `Write` in `encode()` and `entomb()`, because `Write` wants initialized bytes, not possibly uninitialized ones.
To resolve this, we need another language functionality, which is not available yet but frequently requested from the UCG: the `freeze()` operation, a tool which can turn `MaybeUninit<u8>` into a nondeterministic valid `u8` value. You can think of it as a way to opt out of the UB of reading bad data and defer to hardware "whatever was lying around at that memory address" behavior.
IIUC, something like that was proposed a long time ago, but it was initially rejected by security-conscious people on the grounds that it could be used to observe the value of uninitialized memory coming from `malloc()`, which may leak sensitive information like cryptographic secrets which a process forgot to volatile-erase before calling `free()`.
That precaution is commendable, but on the other hand, given the growing body of evidence that a UB-free way to access specific regions of memory is needed for many use cases (from IPC with untrusted processes to implementation of certain low-overhead thread synchronization algorithms like seqlock), I'm hopeful that we'll get something like that in Rust eventually (and I will in fact take steps to make this discussion move forward once I'm done with my current UCG effort).
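To make the first step concrete, here is a minimal sketch of the `&[T]` to `&[MaybeUninit<u8>]` cast and the memcpy it enables. The `Record` type and the `as_maybe_uninit_bytes` and `copy_out` helpers are hypothetical names chosen for illustration; they are not part of abomonation's API, and the `T: Copy` bound is an assumption added here to rule out interior mutability.

```rust
use std::mem::MaybeUninit;

/// Hypothetical example type. With `repr(C)` on common targets, there are
/// 3 padding bytes between `tag` and `value`, and those bytes are uninitialized.
#[repr(C)]
#[derive(Clone, Copy)]
struct Record {
    tag: u8,
    value: u32,
}

/// View a slice of `T` as possibly-uninitialized bytes (hypothetical helper).
/// The `Copy` bound is an assumption of this sketch: it excludes types with
/// interior mutability such as `Cell`.
fn as_maybe_uninit_bytes<T: Copy>(data: &[T]) -> &[MaybeUninit<u8>] {
    let len = std::mem::size_of_val(data);
    // SAFETY: `MaybeUninit<u8>` has alignment 1 and accepts any byte,
    // including uninitialized padding, so viewing `len` readable bytes
    // through it is fine. The returned slice borrows from `data`.
    unsafe { std::slice::from_raw_parts(data.as_ptr().cast::<MaybeUninit<u8>>(), len) }
}

/// Copy the raw bytes of `records` into a fresh buffer (hypothetical helper).
/// The bytes stay typed as `MaybeUninit<u8>`, so padding is never read as `u8`.
fn copy_out(records: &[Record]) -> Vec<MaybeUninit<u8>> {
    let src = as_maybe_uninit_bytes(records);
    let mut dst = vec![MaybeUninit::<u8>::uninit(); src.len()];
    dst.copy_from_slice(src);
    dst
}
```

Note what the sketch cannot do: the resulting buffer cannot be handed to `Write::write_all`, which takes `&[u8]`. Calling `assume_init()` on a byte that came from a field (such as `tag`) is fine, but doing so on a padding byte would be UB, which is exactly the gap that a `freeze()`-like operation would have to close.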
TL;DR: For now, this is blocked on a missing Rust feature, but the issue seems understood and is likely to be eventually resolved.