Improve ptr::read code in debug builds

In https://github.com/rust-lang/rust/pull/80290, some people raised concerns about the quality of the code that `ptr::write` compiles to in debug builds. Given that, to my knowledge, reads are much more common than writes, I would think that one should be much more concerned with the code that `ptr::read` compiles to -- and currently, there's going to be quite a few function calls in there, so without inlining, that code will be pretty slow.

`ptr::read` could be improved with techniques similar to what I did for `ptr::write` (call intrinsics directly, and inline everything else by hand). This would result in (something like) the following implementation: (EDIT see below for why this is wrong)
```rust
pub const unsafe fn read<T>(src: *const T) -> T {
    // We are calling the intrinsics directly to avoid function calls in the generated code
    // as `intrinsics::copy_nonoverlapping` is a wrapper function.
    extern "rust-intrinsic" {
        fn copy_nonoverlapping<T>(src: *const T, dst: *mut T, count: usize);
    }

    // For the same reason, we also side-step `mem::MaybeUninit` and use a custom `union` directly.
    #[repr(C)]
    union MaybeUninit<T> {
        init: mem::ManuallyDrop<T>,
        uninit: (),
    }

    let mut tmp: MaybeUninit<T> = MaybeUninit { uninit: () };
    // SAFETY: the caller must guarantee that `src` is valid for reads.
    // `src` cannot overlap `tmp` because `tmp` was just allocated on
    // the stack as a separate allocated object.
    //
    // `MaybeUninit` is repr(C), so we can assume `init` is at offset 0, justifying the pointer
    // casts.
    //
    // Finally, since we just wrote a valid value into `tmp`, it is guaranteed
    // to be properly initialized.
    unsafe {
        copy_nonoverlapping(src, &mut tmp as *mut _ as *mut T, 1);
        mem::transmute_copy(&tmp.init)
    }
}
```
However, here we have the extra difficulty that `read` is (unstably) a `const fn`, so the above implementation is rejected. `&tmp.init` can be replaced by `&mut tmp.init` and that works (or we wait for a bootstrap bump so we can make use of https://github.com/rust-lang/rust/pull/80418), but `transmute_copy` is non-`const`, so there's still more work to be done. (`transmute` does not work since the compiler does not recognize that `T` and `ManuallyDrop<T>` have the same size.)

I will stop here, but if someone else strongly cares about `ptr::read` performance/codesize in debug builds, feel free to pick this up and drive it to completion.

Cc @bjorn3 @therealprof @usbalbin

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Improve ptr::read code in debug builds #81163

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Improve ptr::read code in debug builds #81163

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions