Disarm mem::uninitialized by having it initialize to an arbitrary valid value for each type

For a while it has been understood that the `mem::uninitialized` API is broken. Originally the intuitive understanding of this API was that it produced a fixed, arbitrary value. However (as extensively discussed elsewhere), uninitialized memory is not a “fixed, arbitrary value”, and for nearly all types in Rust it is instantaneous undefined behavior for a value to be uninitialized.

What’s worse, even initialized values can be insta-UB. Rust uses its understanding of valid bit patterns to perform layout optimizations whereby invalid values can be repurposed as enum tags, which is how `Option<&T>` can be only a single word. Thus even `mem::uninitialized`’s sibling `mem::zeroed` is insta-UB when used with types like `&T`.
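The niche optimization described above can be observed directly by comparing sizes. The following is a minimal sketch; the helper name `option_is_free` is made up for illustration and is not part of the standard library.

```rust
use std::mem::size_of;

/// Returns true when wrapping `T` in `Option` costs no extra space,
/// i.e. the compiler stored the `None` tag in a bit pattern that is
/// invalid for `T` (a niche). Hypothetical helper, for illustration only.
pub fn option_is_free<T>() -> bool {
    size_of::<Option<T>>() == size_of::<T>()
}
```

For `&u8` and `std::num::NonZeroU32` this returns `true`, because the all-zero bit pattern is invalid for both and can stand in for `None`. For `u32` it returns `false`, because every bit pattern is a valid `u32` and a separate tag is needed. This is also why `mem::zeroed::<&T>()` is insta-UB: zero is exactly the pattern reserved for `None`.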

As a result, `mem::uninitialized` was deprecated and replaced with `mem::MaybeUninit`, which avoids the problems of the former. In addition, both `mem::zeroed` and `mem::uninitialized` were altered such that they will attempt to detect (and panic) when used on certain types: the former on types that must not be zero, and the latter on any types with invalid (defined) values.
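For code that does migrate, the `MaybeUninit` pattern looks roughly like the sketch below. The function `squares` and its contents are invented for illustration. The key point is that the memory stays wrapped in `MaybeUninit<u32>` until every element has actually been written, so no uninitialized `u32` ever exists.

```rust
use std::mem::MaybeUninit;

/// Fills a fixed-size array without zeroing it first.
/// Illustrative example only; the name and contents are arbitrary.
pub fn squares() -> [u32; 4] {
    // An array of uninitialized slots. This is fine: `MaybeUninit<u32>`
    // has no validity requirements on its contents.
    let mut buf: [MaybeUninit<u32>; 4] = [MaybeUninit::uninit(); 4];

    // Initialize every slot through the safe `write` method.
    for (i, slot) in buf.iter_mut().enumerate() {
        slot.write((i * i) as u32);
    }

    // SAFETY: the loop above wrote to all four slots.
    buf.map(|slot| unsafe { slot.assume_init() })
}
```

By contrast, `let buf: [u32; 4] = mem::uninitialized();` claims to hold four valid `u32` values from the moment it is created, which is the source of the problem.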

However, implementing these panic checks caused a great deal of breakage (which arguably is desirable for safety, although still extremely disruptive), and to reduce disruption the check is conservative instead of exhaustive (https://github.com/rust-lang/rust/issues/66151). Unfortunately, while improving the coverage of these checks would still leave `mem::zeroed` perfectly usable, it would render `mem::uninitialized` all but unusable, as essentially no type can ever be in an uninitialized state.

This is a problem for legacy crates that were never migrated away from `mem::uninitialized`. However, there is a solution that both allows these legacy crates to compile and avoids the problem of invalid uninitialized values: **`mem::uninitialized` can initialize with a valid value**. This may seem contrary to the original intent of the API, but consider that the only reason to avoid initialization is performance, and that the choice is now between “my code doesn’t compile”, “my code contains undefined behavior”, and “my code is slower”; the last is the most desirable outcome of the three.

This raises the question: what value to initialize with? PR https://github.com/rust-lang/rust/pull/87032 proposed the simplest option, which was to replace the innards of `mem::uninitialized` with `mem::zeroed`. However, zero is the value most often used for niche optimizations, so this would still reject a lot of code.

But there is a more desirable alternative. Because Rust understands what values are *invalid* for a type—it must, in order to perform niche optimizations—it therefore should also understand which values are valid for a type. An intrinsic could be added to the compiler which, given a type, produces an arbitrary valid value of that type. This intrinsic could be used within `mem::uninitialized`, and the existing panic check could be removed. This would allow all code in the wild still using `mem::uninitialized` to compile, and would also avoid all insta-UB related to validity invariants.
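To make the idea concrete, here is a rough user-space approximation. Everything in it is hypothetical: the trait `ArbitraryValid`, the function `disarmed_uninitialized`, and the specific values chosen are stand-ins for what a compiler intrinsic would derive automatically from a type's layout. A real intrinsic would need no trait bound and would cover every type; this sketch covers only three hand-picked ones and does not attempt references.

```rust
use std::num::NonZeroU8;

/// Hypothetical stand-in for the proposed intrinsic: produces some
/// valid value of the implementing type. Not a real std API.
pub trait ArbitraryValid {
    fn arbitrary_valid() -> Self;
}

impl ArbitraryValid for u32 {
    // Every bit pattern is valid for `u32`, so any value works.
    fn arbitrary_valid() -> Self {
        0
    }
}

impl ArbitraryValid for bool {
    // Only the bit patterns 0 and 1 are valid for `bool`.
    fn arbitrary_valid() -> Self {
        false
    }
}

impl ArbitraryValid for NonZeroU8 {
    // Zero is the niche here, so pick any nonzero value instead.
    fn arbitrary_valid() -> Self {
        NonZeroU8::new(1).expect("1 is nonzero")
    }
}

/// What a disarmed `mem::uninitialized` might reduce to under this
/// sketch (assumed name; the real function has no trait bound).
pub fn disarmed_uninitialized<T: ArbitraryValid>() -> T {
    T::arbitrary_valid()
}
```

Unlike the `mem::zeroed` replacement, this approach does not trip over `NonZeroU8`: the returned value is `1`, which respects the type's validity invariant.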

Disarm mem::uninitialized by having it initialize to an arbitrary valid value for each type #87675
