examples/export: Replace unsound `to_padded_byte_vector()` implementation with `bytemuck`. #444
The CI failure seems to be in an unrelated import test.
I'm surprised this is unsound but I trust your intuition. I'd like to understand better:
Why is this undefined behaviour? Padding bytes between
This is an interesting point. In C, I am used to expecting
…tion with `bytemuck`.

The function `to_padded_byte_vector()` is unsound because:

* It accepts an arbitrary `Vec<T>` without checking that `T` contains no padding, which is UB to read in any way including by reinterpreting as `u8`s.
* It produces a `Vec` which thinks it has a different alignment than the allocation was actually created with.

To fix these problems, this change:

* Uses `bytemuck` to check the no-padding condition.
* Creates a new `Vec` instead of trying to reuse the existing one. (Conditional reuse would be possible, but more complex.)

An alternative to `bytemuck` would be to make `to_padded_byte_vector()` an `unsafe fn` (or to accept `Vertex` only instead of `T`). However, I think it is valuable to demonstrate how to do this conversion using safe tools, to encourage people to use safe code instead of writing unsafe code without fully understanding the requirements.
Excerpted from https://doc.rust-lang.org/reference/behavior-considered-undefined.html:
Reasons why the design is such that this is undefined behavior include that the compiler is permitted to do things like
Both of these transformations are critical to being able to produce efficient code, and just these two applied together could mean that a value that was read from uninitialized memory appears to have an unstable, changing value, because it ended up being read multiple times from different pieces of uninitialized memory. Of course, it’s unlikely that any of these specific bad effects would apply to a simple bulk copy as this code is doing, but there isn’t and can’t be an exception in the rules of UB for “that would be silly”. Rigor is necessary to be able to have sound, highly optimizing compilers. It’s also permitted for the hardware to trap on reads of uninitialized memory (though in most cases today, this only happens at page granularity).
Separately from UB considerations, I disagree that they are acceptable. They may not affect the behavior of a glTF loader, but there are many other reasons not to write them:
It’s fine to include guaranteed zero bytes, of course, but not uninitialized bytes.
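To make the padding concern concrete, here is a small sketch that counts the bytes of a type not covered by any field. The types `Padded` and `Packed` and both helper functions are hypothetical names for illustration, not taken from the PR, and the numbers assume a typical target where `u32` has 4-byte alignment.

```rust
use std::mem::size_of;

/// Hypothetical type: `tag` is followed by padding so that `value` is aligned.
#[allow(dead_code)]
#[repr(C)]
#[derive(Clone, Copy)]
struct Padded {
    tag: u8,
    value: u32,
}

/// Hypothetical vertex-like type: every byte belongs to some field.
#[allow(dead_code)]
#[repr(C)]
#[derive(Clone, Copy)]
struct Packed {
    position: [f32; 3],
    color: [f32; 3],
}

/// Bytes in `Padded` that are not part of any field (padding).
fn padding_in_padded() -> usize {
    size_of::<Padded>() - (size_of::<u8>() + size_of::<u32>())
}

/// Bytes in `Packed` that are not part of any field (padding).
fn padding_in_packed() -> usize {
    size_of::<Packed>() - 2 * size_of::<[f32; 3]>()
}
```

Reinterpreting a `[Packed]` as bytes touches only initialized field data, while doing the same for `[Padded]` would read padding bytes whose contents are not guaranteed to be initialized.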
Indeed that is a difference, but the key difference here is that Rust deallocation has to be given the layout information and the allocator may rely on it being accurate, whereas
This is true, but not a good way to think about writing unsafe code. When you call an unsafe function, you must obey all of the safety requirements from that specific function’s documentation. Those in
This rule was violated, so no further analysis is required to conclude that the call is unsound.
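The layout mismatch can be seen directly with `std::alloc::Layout`. The helper `vec_layout` below is a hypothetical name for illustration, and it assumes a `Vec<T>` allocation is described by `Layout::array::<T>(capacity)`. It is a sketch of the reasoning, not code from the PR.

```rust
use std::alloc::Layout;

/// Layout describing an array of `capacity` elements of type `T`,
/// assumed here to match what a `Vec<T>` of that capacity allocates with.
fn vec_layout<T>(capacity: usize) -> Layout {
    Layout::array::<T>(capacity).expect("capacity overflow")
}

/// Returns true when two layouts cover the same number of bytes
/// but disagree on alignment.
fn same_size_different_align(a: Layout, b: Layout) -> bool {
    a.size() == b.size() && a.align() != b.align()
}
```

For example, `vec_layout::<u32>(4)` and `vec_layout::<u8>(16)` both describe 16 bytes, but with different alignments, so a `Vec<u8>` built from the raw parts of a `Vec<u32>` would hand the allocator a layout that differs from the one used at allocation time.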
The function `to_padded_byte_vector()` is unsound because:

* It accepts an arbitrary `Vec<T>` without checking that `T` contains no padding, which is UB to read in any way including by reinterpreting as `u8`s.
* It produces a `Vec` which thinks it has a different alignment than the allocation was actually created with.

To fix these problems, this change:

* Uses `bytemuck` to check the no-padding condition.
* Creates a new `Vec` instead of trying to reuse the existing one. (Conditional reuse would be possible, but more complex.)

An alternative to using `bytemuck` would be to make `to_padded_byte_vector()` an `unsafe fn` (or to accept `Vertex` only instead of `T`). However, I think it is valuable to demonstrate how to do this conversion using safe tools, to encourage people to use safe code instead of writing unsafe code without fully understanding the requirements.
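For readers who want to see the "create a new `Vec`" idea without any dependency, the following is a std-only sketch. It is not the PR's code, which uses `bytemuck`. Instead of reinterpreting memory, it serializes each field explicitly as little-endian bytes (the byte order glTF buffers use) and then zero-pads to a multiple of four. The `Vertex` fields and all function names here are assumptions for illustration.

```rust
/// Hypothetical vertex type for illustration; the real example's fields may differ.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Vertex {
    position: [f32; 3],
    color: [f32; 3],
}

/// Appends zero bytes until the length is a multiple of four.
fn pad_to_four(bytes: &mut Vec<u8>) {
    while bytes.len() % 4 != 0 {
        bytes.push(0);
    }
}

/// Serializes vertices field by field into a freshly allocated byte vector.
/// No memory is reinterpreted, so padding and alignment never come into play.
fn vertices_to_padded_bytes(vertices: &[Vertex]) -> Vec<u8> {
    let mut bytes = Vec::with_capacity(vertices.len() * 24);
    for v in vertices {
        for component in v.position.iter().chain(v.color.iter()) {
            bytes.extend_from_slice(&component.to_le_bytes());
        }
    }
    pad_to_four(&mut bytes);
    bytes
}

/// Same idea for 16-bit indices, where the zero padding can actually be needed.
fn indices_to_padded_bytes(indices: &[u16]) -> Vec<u8> {
    let mut bytes = Vec::with_capacity(indices.len() * 2 + 2);
    for index in indices {
        bytes.extend_from_slice(&index.to_le_bytes());
    }
    pad_to_four(&mut bytes);
    bytes
}
```

The trade-off against the `bytemuck` approach is that this version must be written per type, whereas a checked cast stays generic over any `T` that is known to contain no padding.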