CHERI-valid pointer stuffing produces worse codegen than an implementation which does huge wrapping offsets

I came across this in https://github.com/tokio-rs/bytes/pull/542 and am raising it more broadly because it seems like a portability hazard. [godbolt demo](https://godbolt.org/z/Kn3Mea4fv)

In a world where we can forget about provenance, one could set the lowest bit in a pointer with this:
```rust
pub fn old_style(a: *mut u8) -> *mut u8 {
    (a as usize | 1) as *mut u8
}
```
But of course we want to have a provenance model, not least because we want to support architectures where pointer provenance is checked at runtime. So one might want to implement this function like so to be compatible with CHERI:
```rust
pub fn cheri_compat(a: *mut u8) -> *mut u8 {
    let old = a as usize;
    let new = old | 1;
    let diff = new.wrapping_sub(old);
    a.wrapping_add(diff)
}
```
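The point of this version is that `diff` can only ever be 0 or 1, so the result stays within one byte of `a`. Here is a minimal sketch of that arithmetic. The helpers `tag_offset` and `tag_offset_bitwise` are hypothetical names used for illustration only, not part of the original code:

```rust
/// Hypothetical helper: the byte offset that `cheri_compat` adds to a
/// pointer whose address is `addr`.
pub fn tag_offset(addr: usize) -> usize {
    // Same computation as `new.wrapping_sub(old)` in `cheri_compat`.
    (addr | 1).wrapping_sub(addr)
}

/// The same offset written as a bit operation: 1 if the low bit of
/// `addr` is clear, 0 if it is already set.
pub fn tag_offset_bitwise(addr: usize) -> usize {
    !addr & 1
}
```

The two helpers agree for every address, which is presumably where the extra `not` and `and` in the generated code come from.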
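But that version is slower. Instead of just `mov + or`, it gets compiled to `mov + not + and + and`, which is very silly. We can get the original codegen back by writing this in a style which is almost certainly invalid on CHERI, since the intermediate pointer `a.wrapping_sub(old)` has address 0 and so is offset far outside the original allocation: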
```rust
pub fn fast(a: *mut u8) -> *mut u8 {
    let old = a as usize;
    let new = old | 1;
    a.wrapping_sub(old).wrapping_add(new)
}
```
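As a sanity check, the sketch below repeats the three versions so it compiles on its own and confirms that they agree on the resulting address on a conventional flat-address target. It also shows that the intermediate pointer in `fast` always has address 0. The helpers `fast_intermediate_addr` and `all_agree` are hypothetical names added for illustration. No pointer is ever dereferenced, and this says nothing about what a CHERI target would do with the tag:

```rust
// Copies of the three versions above, so this sketch is self-contained.
pub fn old_style(a: *mut u8) -> *mut u8 {
    (a as usize | 1) as *mut u8
}

pub fn cheri_compat(a: *mut u8) -> *mut u8 {
    let old = a as usize;
    let new = old | 1;
    let diff = new.wrapping_sub(old);
    a.wrapping_add(diff)
}

pub fn fast(a: *mut u8) -> *mut u8 {
    let old = a as usize;
    let new = old | 1;
    a.wrapping_sub(old).wrapping_add(new)
}

/// Hypothetical helper: address of the intermediate pointer that `fast`
/// creates before it adds `new` back on.
pub fn fast_intermediate_addr(a: *mut u8) -> usize {
    a.wrapping_sub(a as usize) as usize
}

/// Hypothetical check: for each byte of a small buffer, all three
/// versions yield the same address, and `fast` passes through address 0.
pub fn all_agree() -> bool {
    let mut buf = [0u8; 4];
    let base = buf.as_mut_ptr();
    (0..buf.len()).all(|i| {
        let p = base.wrapping_add(i);
        let expected = p as usize | 1;
        old_style(p) as usize == expected
            && cheri_compat(p) as usize == expected
            && fast(p) as usize == expected
            && fast_intermediate_addr(p) == 0
    })
}
```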
It doesn't make sense to me that users should have to choose between compatibility with CHERI and avoiding ptr-int-ptr casts while keeping a careful eye out for codegen regressions.

CHERI-valid pointer stuffing produces worse codegen than an implementation which does huge wrapping offsets #96152
