Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

What about: aliasing requirements for nested references? #532

Open
oxalica opened this issue Sep 28, 2024 · 5 comments
Open

What about: aliasing requirements for nested references? #532

oxalica opened this issue Sep 28, 2024 · 5 comments
Labels
A-aliasing-model Topic: Related to the aliasing model (e.g. Stacked/Tree Borrows)

Comments

@oxalica
Copy link

oxalica commented Sep 28, 2024

Ref:

This is already a fact under {Stacked,Tree}Borrow, though I think it's unfortunate to forbids optimizations on structs containing reference fields. Specifically, optimizing out duplicated-read through double-reference after a function call and caching an (known to be unchanged) pointer indirection are not possible.

fn main() {
    let mut s = 0u8;
    let borrow = &raw mut s;
    unsafe {
        let const_borrow: *const *mut u8 = &raw const borrow;
        let x: &&u8 = &*const_borrow.cast::<&u8>();

        // This can be in another innocent fn, but inline to avoid `static mut`.
        {
            let y1 = **x;
            *borrow = 1; // modify, maybe through some user-provided fn.
            let y2 = **x;
            assert!(y1 != y2);
        }

        // An alternative with caching. But compiler is not allowed to
        // transform the code above into this.
        {
            let y: &u8 = *x;
            let y1 = *y;
            *borrow = 2; // modify
            let y2 = *y; // <- Undefined Behavior
            assert!(y1 != y2);
        }
    }
}

Here we can notice that even through x and *x is both unchanged and u8: Freeze, **x can change via a foreign pointer with write permission. Thus caching one level of indirection *x is not possible because it would be Disabled by that foreign write.

Could we do better on this? For this example, maybe we could leave protectors on the access trace of a Frozen reference argument (when without interior mutability, or course)?

@RalfJung RalfJung changed the title {Stacked,Tree}Borrow does not allow caching immutable references thus forbids some GVN optimization What about: aliasing requirements for nested references? Sep 29, 2024
@RalfJung
Copy link
Member

RalfJung commented Sep 29, 2024

This post is very hard to understand and quite vague about the actual problem. Lucky enough I saw rust-lang/miri#3921 (comment) before otherwise I would have been very confused...

though I think it's unfortunate to forbids optimizations on structs containing reference fields.

We don't forbid optimizations on structs containing reference fields. If you have a function that takes (&u8, &u8) as argument (a tuple/struct with reference fields), those can be fully optimized.

I think what you are talking about is: SB and TB decide is to not impose any restrictions on nested references. That does mean we lose some optimizations around nested references. (Please update the issue description to explain what exactly you mean.)

So far, there has been no proposal for how a memory model that enforces aliasing on nested references could work. It can't be anything like Stacked Borrows or Tree Borrows, as those models are based on "retagging" where when we want to distinguish pointers to impose different aliasing rules on them, we generate a fresh unique "tag" so that we can tell them apart. If a function takes &u32, it generates a fresh tag for this reference and imposes restrictions on this tag to enforce the aliasing requirements. However, if we are given x: &&u32, then we cannot retag the inner reference (the one stored at *x), since that would mean writing that tag to *x, which would be UB as we can't write through a shared reference.

We'd need a completely different approach. It would likely have to involve making it so that when you load from *x, we establish some sort of relation of the pointer you loaded with the pointer through which it got loaded, or so? I have no idea if anything like this could work. But one thing is sure, this is on the scale of "throw out everything we have and start from scratch"; I am fairly sure that no small change to Stacked Borrows or Tree Borrows can provide such optimizations.

@workingjubilee
Copy link
Member

idiomatic usage of Rust's existing standard library API surface, in particular most of the implementation of Iterator and functions like slice::iter to produce such iterators over references, very commonly generate nested references. and "why doesn't a double reference optimize as well as a single reference" is a question repeatedly raised on the Rust issue tracker.

@workingjubilee
Copy link
Member

though mostly, I'm somewhat perplexed why "retags" are mutation, exactly? my impression was that they are de jure creation of a new reference with the tags in question.

@RalfJung
Copy link
Member

Retags change the provenance of a pointer. In the Abstract Machine, provenance is real data that affects whether code has UB, so changing it must count as mutation. After all, we want to do optimizations like

fn foo(x: &*const T) {
  let ptr = *x;
  // ...
  let ptr2 = *x;
}

and know that the two pointers are the same, in the sense that we can replace one by the other -- which is only the case if they are fully equal, including their provenance.

@RalfJung RalfJung added the A-aliasing-model Topic: Related to the aliasing model (e.g. Stacked/Tree Borrows) label Oct 18, 2024
@workingjubilee
Copy link
Member

workingjubilee commented Oct 18, 2024

To motivate this a bit more and expand on what I said, I will lift the example I mistakenly posted in #412.

Code appears that would resemble the pattern that is, if I understand everything correctly, subject to a miscompiled like rust-lang/rust#130853 if it was written directly:

let iter = vec.iter().filter(predicate);
call();
for item in iter.filter(overlapping_predicate) {
    // code
}

This is because the initial code that is written is actually a library exposing something like this:

impl SomeType<T> {
    pub fn iter(&self) -> impl Iterator<Item = &T> {
         self.inner_slice.iter().filter(valid_elements_predicate)
    }
}

Because Self::Item may be by-value, filter takes &Self::Item, which, for an iterator over &T, means that the predicate function is called on &&T. If we filter multiple times... likely in the case that one is done without knowledge of the other!... then some predicates may overlap and the optimization becomes desirable, no? I certainly do not think this code is particularly unthinkable, though we might have to flex our imagination to justify the opt.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-aliasing-model Topic: Related to the aliasing model (e.g. Stacked/Tree Borrows)
Projects
None yet
Development

No branches or pull requests

3 participants