What about: Pointer-to-integer transmutes?

Transmuting pointers to integers (i.e., not going through the regular cast) is a problem. This is demonstrated by the following silly example:
```rust
fn example(ptr: *const i32, cmp: usize) -> usize { unsafe {
  let mut storage: usize = 0;
  *(&mut storage as *mut _ as *mut *const i32) = ptr; // write at ptr type
  let val = storage; // read at int type (0)
  storage = val; // redundant write back (1)
  external_function(&storage); // just making sure the value in `storage` can be observed
  if val == cmp {
    return cmp; // could exploit integer equivalence (2)
  }
  return 0;
} }
```
Imagine executing this code on the Abstract Machine, taking into account that pointers have provenance, i.e., a ptr-to-int conversion loses information. Now what happens at point (0)? Here we read the data stored in `storage` at type `usize`. That data however is the ptr `ptr`, i.e., it has provenance. What should happen with that provenance at (0)?
1. We could drop the provenance. That would basically mean that the load of `storage` acts like an implicit ptr-to-int cast. The problem with this approach is that we cannot remove the redundant write at (1): the value in `val` is *different* from what is stored in `storage`, since `val` has no provenance but the `ptr` stored in `storage` does! This is basically another version of https://bugs.llvm.org/show_bug.cgi?id=34548: ptr-to-int casts are *not* NOPs, and a ptr-int-ptr roundtrip cannot be optimized away. If a load, like the one at (0), can perform a ptr-to-int cast, then the same concerns apply to loads as well.
2. We could preserve the provenance. Then, however, we end up with `val` having type `usize` and *also* having provenance, which is a big problem: the compiler might decide, at program point (2), to `return val` instead of `return cmp` (based on the fact that `val == cmp`), but if `val` could have provenance then this transformation is wrong! This is basically the issue at the heart of [my blog post on provenance](https://www.ralfj.de/blog/2020/12/14/provenance.html): `==` ignores provenance, so just because two values are equal according to `==` does not mean they can be used interchangeably in all circumstances.
3. What other option is there? Well, we might make the load return `poison` -- effectively declaring ptr-to-int transmutes as UB.
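
To make the "`==` ignores provenance" point from option 2 concrete, here is a minimal sketch. The helper names `same_address_pair` and `addresses_equal` are made up for illustration and are not part of the original example. It builds two pointers that compare equal as integers even though they were derived in different ways; under the aliasing models currently being explored for Rust, only `second` is clearly allowed to read `pair[1]`.

```rust
/// Returns two pointers that have the same address but were derived differently:
/// the first is "one past" a pointer to `pair[0]`, the second points to `pair[1]`.
/// (Hypothetical helper, for illustration only.)
fn same_address_pair(pair: &[i32; 2]) -> (*const i32, *const i32) {
    // `wrapping_add` only computes the address; it does not access memory.
    let one_past_first = (&pair[0] as *const i32).wrapping_add(1);
    let second = &pair[1] as *const i32;
    (one_past_first, second)
}

/// Compares two pointers by address only, using the regular (explicit) cast.
/// Whatever provenance the pointers carry plays no role in the result.
fn addresses_equal(a: *const i32, b: *const i32) -> bool {
    a as usize == b as usize
}
```

Calling `same_address_pair(&[10, 20])` and passing both results to `addresses_equal` yields `true`, yet that result alone says nothing about whether the two pointers may be used interchangeably. That is exactly why replacing `cmp` with `val` at (2) is only justified if `val` carries no provenance.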

The last option is what is being [proposed to LLVM](https://lists.llvm.org/pipermail/llvm-dev/2021-June/150883.html), along with a new "byte" type such that loading at type `bN` would preserve provenance, but loading at type `iN` would turn bytes with provenance into `poison`. On the flip side, no arithmetic or logical operations are possible on `bN`; that type represents "opaque bytes" with the only possible operations being load and store (and explicit casts to remove any provenance that might exist). This leads to a consistent model in which both redundant store elimination and GVN substitution on integer types (the optimizations mentioned above) are possible. I don't know any other way to resolve the [contradiction that otherwise arises from doing both of these optimizations](https://github.com/rust-lang/unsafe-code-guidelines/issues/286#issuecomment-860189806). However, the LLVM discussion is still in its early stages, and there were already a lot of responses that I have not read in detail yet. *If* this ends up being accepted, we on the Rust side will have to figure out if and how we can make use of the new "byte" type and its explicit casts (to pointers or integers).
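
For a rough feel of what "opaque bytes plus explicit casts" could look like from the Rust side, here is a sketch that uses `MaybeUninit<usize>` as a stand-in for a pointer-sized `bN`. That choice is my assumption for illustration, not something the LLVM proposal or this issue prescribes, and the function names `roundtrip_opaque` and `to_integer` are hypothetical. The data is written and read back at pointer type and only ever copied as opaque storage; the one place where an integer is wanted uses the regular cast.

```rust
use std::mem::{size_of, MaybeUninit};

/// Copies a pointer through pointer-sized opaque storage without ever
/// loading it at integer type. (Hypothetical helper, for illustration only.)
fn roundtrip_opaque(ptr: *const i32) -> *const i32 {
    // Assumption of this sketch: thin pointers and `usize` have the same size.
    assert_eq!(size_of::<usize>(), size_of::<*const i32>());
    let mut storage: MaybeUninit<usize> = MaybeUninit::uninit();
    unsafe {
        // Write at pointer type, as in the original example.
        storage.as_mut_ptr().cast::<*const i32>().write(ptr);
        // Copy the storage as opaque data; no `usize` value is ever produced.
        let copy: MaybeUninit<usize> = storage;
        // Read back at pointer type.
        copy.as_ptr().cast::<*const i32>().read()
    }
}

/// The explicit way to obtain an integer: the regular ptr-to-int cast.
fn to_integer(ptr: *const i32) -> usize {
    ptr as usize
}
```

The design point is that the conversion to an integer is a visible operation (`to_integer`), whereas plain copies never have to decide what happens to provenance.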

This thread is about discussing how we need to restrict ptr-to-int transmutes when pointers have provenance but integers do not. See https://github.com/rust-lang/unsafe-code-guidelines/issues/287 for a discussion with the goal of avoiding provenance in the first place.
