Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable noalias annotations #54878

Closed
bstrie opened this issue Oct 6, 2018 · 57 comments
Closed

Enable noalias annotations #54878

bstrie opened this issue Oct 6, 2018 · 57 comments
Assignees
Labels
A-codegen Area: Code generation A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@bstrie
Copy link
Contributor

bstrie commented Oct 6, 2018

This issue tracks the undoing of the -Zmutable-noalias=no default introduced in #54639 on account of a bug in LLVM. cc @nagisa

(Deja vu?)

@bstrie bstrie added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. I-slow Issue: Problems and improvements with respect to performance of generated code. C-enhancement Category: An issue proposing an enhancement or a PR with one. A-codegen Area: Code generation labels Oct 6, 2018
@nagisa
Copy link
Member

nagisa commented Oct 7, 2018

I’m still working on figuring out the underlying issue. The interesting ticket is #54462.

@nagisa nagisa self-assigned this Oct 7, 2018
@Aaron1011
Copy link
Member

Aaron1011 commented Oct 13, 2018

Using @nagisa's minimal reproduction:

Minimised test case with no unsafe code (make sure to compile with 1 codegen unit!):
fn linidx(row: usize, col: usize) -> usize {
    row * 1 + col * 3
}

fn swappy() -> [f32; 12] {
    let mut mat = [1.0f32, 5.0, 9.0, 2.0, 6.0, 10.0, 3.0, 7.0, 11.0, 4.0, 8.0, 12.0];

    for i in 0..2 {
        for j in i+1..3 {
            if mat[linidx(j, 3)] > mat[linidx(i, 3)] {
                    for k in 0..4 {
                            let (x, rest) = mat.split_at_mut(linidx(i, k) + 1);
                            let a = x.last_mut().unwrap();
                            let b = rest.get_mut(linidx(j, k) - linidx(i, k) - 1).unwrap();
                            ::std::mem::swap(a, b);
                    }
            }
        }
    }

    mat
}

fn main() {
    let mat = swappy();
    assert_eq!([9.0, 5.0, 1.0, 10.0, 6.0, 2.0, 11.0, 7.0, 3.0, 12.0, 8.0, 4.0], mat);
}

I was able to bisect LLVM's optimization passes to find the one causing the error.

Running this command results in a working executeable (replace bug.rs with the name of the file you saved the reproduction in).

rustc -Z no-parallel-llvm -C codegen-units=1 -O -Z mutable-noalias=yes -C llvm-args=-opt-bisect-limit=2260 bug.rs

While running this command results in a broken executable (the `assert_eq`` fails):

rustc -Z no-parallel-llvm -C codegen-units=1 -O -Z mutable-noalias=yes -C llvm-args=-opt-bisect-limit=2261 bug.rs

LLVM bisect output

For this file, optimization 2261 corresponds to Global Value Numbering on function (_ZN3bug6swappy17hdcc51d0e284ea38bE)

@comex
Copy link
Contributor

comex commented Oct 13, 2018

Bisecting LLVM revisions (using llvmlab bisect) narrows it down to r305936-r305938, presumably r305938:

[BasicAA] Use MayAlias instead of PartialAlias for fallback.

Note that this is a pretty old change, from June 2017.

Edit: Looking at the commit description, it seems likely that the bug existed prior to that, but was masked by BasicAA preventing later alias passes from running, which is what the commit fixed. The case involves checking aliasing between a pair of getelementptr instructions where the compiler knows they have the same base address but doesn't know the offsets.

Edit2: Also, passing -enable-scoped-noalias=false as an LLVM option prevents the miscompilation. (This is not surprising since that disables noalias handling altogether, but just in case it helps…)

@nikic
Copy link
Contributor

nikic commented Oct 13, 2018

From a look at the pre-GVN IR, I feel like the root cause here might be in loop unrolling, depending on whether my understanding of how LLVM aliasing annotations work is correct.

Consider a code like

int *a, *b;
for (int i = 0; i < 4; i++) {
    a[i & 1] = b[i & 1];
}

where a[i & 1] and b[i & 1] do not alias within a single iteration, but a and b in general may alias.

In LLVM IR this would go something like:

define void @test(i32* %addr1, i32* %addr2) {
start:
    br label %body

body:
    %i = phi i32 [ 0, %start ], [ %i2, %body ]
    %j = and i32 %i, 1
    %addr1i = getelementptr inbounds i32, i32* %addr1, i32 %j
    %addr2i = getelementptr inbounds i32, i32* %addr2, i32 %j

    %x = load i32, i32* %addr1i, !alias.scope !2
    store i32 %x, i32* %addr2i, !noalias !2

    %i2 = add i32 %i, 1
    %cmp = icmp slt i32 %i2, 4
    br i1 %cmp, label %body, label %end

end:
    ret void
}

!0 = !{!0}
!1 = !{!1, !0}
!2 = !{!1}

If we run this through -loop-unroll we get:

define void @test(i32* %addr1, i32* %addr2) {
start:
  br label %body

body:                                             ; preds = %start
  %x = load i32, i32* %addr1, !alias.scope !0
  store i32 %x, i32* %addr2, !noalias !0
  %addr1i.1 = getelementptr inbounds i32, i32* %addr1, i32 1
  %addr2i.1 = getelementptr inbounds i32, i32* %addr2, i32 1
  %x.1 = load i32, i32* %addr1i.1, !alias.scope !0
  store i32 %x.1, i32* %addr2i.1, !noalias !0
  %x.2 = load i32, i32* %addr1, !alias.scope !0
  store i32 %x.2, i32* %addr2, !noalias !0
  %addr1i.3 = getelementptr inbounds i32, i32* %addr1, i32 1
  %addr2i.3 = getelementptr inbounds i32, i32* %addr2, i32 1
  %x.3 = load i32, i32* %addr1i.3, !alias.scope !0
  store i32 %x.3, i32* %addr2i.3, !noalias !0
  ret void
}

!0 = !{!1}
!1 = distinct !{!1, !2}
!2 = distinct !{!2}

Note how all four copies of the loop use aliasing metadata over the same aliasing domain. Instead of being noalias within a single iteration, it's noalias across the whole function.

Finally, -scoped-noalias -gvn gives us:

define void @test(i32* %addr1, i32* %addr2) {
start:
  %x = load i32, i32* %addr1, !alias.scope !0
  store i32 %x, i32* %addr2, !noalias !0
  %addr1i.1 = getelementptr inbounds i32, i32* %addr1, i32 1
  %addr2i.1 = getelementptr inbounds i32, i32* %addr2, i32 1
  %x.1 = load i32, i32* %addr1i.1, !alias.scope !0
  store i32 %x.1, i32* %addr2i.1, !noalias !0
  store i32 %x, i32* %addr2, !noalias !0
  store i32 %x.1, i32* %addr2i.1, !noalias !0
  ret void
}

!0 = !{!1}
!1 = distinct !{!1, !2}
!2 = distinct !{!2}

And this will result in incorrect results if a = b + 1.

It's possible to reproduce this issue from C with the following code:

#include "stdio.h"

void copy(int * restrict to, int * restrict from) {
	*to = *from;
}

void test(int *a, int *b) {
	for (int i = 0; i < 4; i++) {
		copy(&b[i & 1], &a[i & 1]);
	}
}

int main() {
	int ary[] = {0, 1, 2};
	test(&ary[1], &ary[0]);
	printf("%d %d %d\n", ary[0], ary[1], ary[2]);
	return 1;
}

With Clang 6.0 this prints 2 2 2 at -O0 and 1 2 2 at -O3. I'm not sure if this code is legal under restrict semantics in C, but I think it should be legal under the stricter noalias semantics of LLVM.

@comex
Copy link
Contributor

comex commented Oct 13, 2018

I reduced it to a simple C test case (compile at -O3 and -O0 and compare output):

#include <stdlib.h>
#include <stdio.h>
#include <assert.h>

__attribute__((always_inline))
static inline void copy(int *restrict a, int *restrict b) {
    assert(a != b);
    *b = *a;
    *a = 7;
}

__attribute__((noinline))
void floppy(int mat[static 2], size_t idxs[static 3]) {
    for (int i = 0; i < 3; i++) {
        copy(&mat[i%2], &mat[idxs[i]]);
    }
}

int main() {
    int mat[3] = {10, 20};
    size_t idxs[3] = {1, 0, 1};
    floppy(mat, idxs);
    printf("%d %d\n", mat[0], mat[1]);
}

Note that if you remove restrict, the C equivalent of noalias, behavior is correct. Yet even then, the assert(a != b) passes, proving that no UB can occur due to calling it with restrict.

What's happening is:

  1. copy() gets inlined, resulting in something like:
for (int i = 0; i < 3; i++) {
    mat[idxs[i]] = mat[i%2]; mat[i%2] = 7;
}
  1. LLVM unrolls the loop:
mat[idxs[0]] = mat[0]; mat[0] = 7; /* from copy(&mat[0%2], &mat[idxs[0]]) */
mat[idxs[1]] = mat[1]; mat[1] = 7; /* from copy(&mat[1%2], &mat[idxs[1]]) */
mat[idxs[2]] = mat[0]; mat[0] = 7; /* from copy(&mat[2%2], &mat[idxs[2]]) */
  1. LLVM thinks mat[0] cannot alias with mat[idxs[1]] or mat[1], ergo it cannot have been changed between mat[0] = 7; and mat[idxs[2]] = mat[0];, ergo it's safe for global value numbering to optimize the latter to mat[idxs[2]] = 7;.

But mat[0] does alias with mat[idxs[1]], because idxs[1] == 0. And we did not promise it wouldn't, because on the second iteration when &mat[idxs[1]] is passed to copy, the other argument is &mat[1]. So why does LLVM think it can't?

Well, it has to do with the way copy is inlined. The noalias function attribute is turned into !alias.scope and !noalias metadata on the load and store instructions, like:

  %8 = load i32, i32* %0, align 4, !tbaa !8, !alias.scope !10, !noalias !13
  store i32 %8, i32* %7, align 4, !tbaa !8, !alias.scope !13, !noalias !10
  store i32 7, i32* %0, align 4, !tbaa !8, !alias.scope !10, !noalias !13

Normally, if a function is inlined multiple times, each copy gets its own unique IDs for alias.scope and noalias, indicating that each call represents its own 'inequality' relationship* between the pair of arguments marked noalias (restrict at C level), which may have different values for each call.

However, in this case, first the function is inlined into the loop, then the inlined code is duplicated when the loop is unrolled – and this duplication does not change the IDs. Because of this, LLVM thinks none of the a's can alias with any of the b's, which is false, because a from the first and third calls aliases with b from the second call (all pointing to &mat[0]).

Amazingly, GCC also miscompiles this, with different output. (clang and GCC at -O0 both output 7 10; clang at -O3 outputs 7 7; GCC at -O3 outputs 10 7.) Uh, I really hope I didn't screw something up and add UB after all, but I don't see how...

* It's a bit more complicated than that, but in this case, since copy does not use any pointer arithmetic and writes to both pointers, the inequality a != b is necessary and sufficient for a call not to be UB.

@comex
Copy link
Contributor

comex commented Oct 13, 2018

Heh, looks like I raced with @nikic to find the same explanation. Their test case is slightly nicer :)

@nikic
Copy link
Contributor

nikic commented Oct 13, 2018

That's some really great timing ^^ We reached the same conclusion with nearly the same reduced test case at the same time :)

To fix this, probably something along the lines of https://github.com/llvm-mirror/llvm/blob/54d4881c352796b18bfe7314662a294754e3a752/lib/Transforms/Utils/InlineFunction.cpp#L801 needs to be also be done in LoopUnrollPass.

@nikic
Copy link
Contributor

nikic commented Oct 13, 2018

I've submitted an LLVM bug report for this issue at https://bugs.llvm.org/show_bug.cgi?id=39282.

@comex
Copy link
Contributor

comex commented Oct 14, 2018

And – just mentioning this for completeness – I submitted a bug report to GCC since it also miscompiled my C test case: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87609

@Phlosioneer
Copy link
Contributor

Triage: If I'm reading this correctly, the LLVM fix was accepted (https://reviews.llvm.org/D9375). I'm unsure what that means for actually merging into LLVM, or when that happened; someone more knowledgeable about LLVM's revision process should check if the issue is fixed now (and for what versions).

@mati865
Copy link
Contributor

mati865 commented Jul 31, 2019

It's not merged, the review process was a bit odd and the person who OK'ed it don't review patches anymore.

@Mark-Simulacrum Mark-Simulacrum added C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC and removed C-enhancement Category: An issue proposing an enhancement or a PR with one. labels Jul 31, 2019
@cynecx
Copy link
Contributor

cynecx commented Oct 7, 2019

https://reviews.llvm.org/D68484

@jrmuizel
Copy link
Contributor

There's been a call for testing with the "full restrict" patch set. It would probably be valuable if someone tried out re-enabling noalias in Rust on top of an llvm that has this patch set applied.

@jrmuizel
Copy link
Contributor

A fix for the LLVM bug has now landed.

@lunacookies
Copy link
Contributor

Do we know yet what kind of performance improvement is likely to result from this?

@bstrie
Copy link
Contributor Author

bstrie commented Jan 25, 2021

@arzg When noalias annotations were first disabled in 2015 it resulted in between 0-5% increased runtime in various benchmarks. A lot can change in five years, but that seems like a decent ballpark estimate of what sort of performance might result from turning this back on. (Note that because of the nature of this brand of optimization, certain hot loops could actually benefit much more than this.)

@matu3ba
Copy link

matu3ba commented Jan 26, 2021

@jrmuizel Could you link the announcement? I dont see anything on the review board of the LLVM patch. I would also expect miscompilations due to the nasty pointer provenance.

@jrmuizel
Copy link
Contributor

@matu3ba the patch that fixed it is here https://reviews.llvm.org/D92887

@Muximize
Copy link

Muximize commented Jan 27, 2021

Now that LLVM 12 has branched, can we test and benchmark this? Or do we have to wait for its release?

@bstrie
Copy link
Contributor Author

bstrie commented Jan 27, 2021

@Muximize I tried compiling rustc with the LLVM master branch yesterday and ran into compilation errors. rustc will need to add support for the newer version of LLVM before we can benchmark anything. The fact that LLVM 12 has branched doesn't really mean all that much; rustc routinely pins its dependency to arbitrary points on the LLVM master branch whenever it desires some new LLVM feature: see https://github.com/rust-lang/llvm-project

@nikic
Copy link
Contributor

nikic commented Jan 27, 2021

The upgrade to LLVM 12 is currently in progress, but it will take a while until all regressions are sorted out. Currently it's not possible to build stage2 due to https://bugs.llvm.org/show_bug.cgi?id=48888.

@mzji
Copy link

mzji commented Jan 28, 2021

The upgrade to LLVM 12 is currently in progress, but it will take a while until all regressions are sorted out. Currently it's not possible to build stage2 due to https://bugs.llvm.org/show_bug.cgi?id=48888.

Seems like the issue has already been fixed.

@VaibhavDS19
Copy link

Is there any option to turn on the mtuable-noalias using cargo? I tried cargo +nightly build --release -Z mutable-noalias=yes and it said no such option.

@bjorn3
Copy link
Member

bjorn3 commented Feb 16, 2021

Try RUSTFLAGS="-Z mutable-noalias=yes" cargo +nightly build --release.

@VaibhavDS19
Copy link

Now it worked, thank you for the quick response.

@jrmuizel
Copy link
Contributor

jrmuizel commented Mar 1, 2021

The LLVM update is happening in #81451

@bryanhitc
Copy link

It seems like #63818 needs to be resolved before this can be enabled by default.

@bstrie bstrie changed the title Re-enable noalias annotations by default once LLVM no longer miscompiles them Enable noalias annotations Mar 6, 2021
@cbeuw
Copy link
Contributor

cbeuw commented Mar 6, 2021

As #63818 is unlikely to be solved soon, could we at least re-enable noalias on structures that cannot possibly be self-referential? That is, structs which do not hold potentially mutable references (&mut and variants of Cell with interior mutability) to itself in the transitive closure of its "type dependency" graph, after monomorphisation. This should be beneficial to a lot of code. From my reading, the unsoundness of self-referentials do not affect Boxed fields so that should extend the applicability quite a bit - I may be wrong though.

Not knowing much about the compiler internals, I'm not sure either about how complicated this (hopefully temporary) solution would be. Finding cycles in a graph will have compile time impacts, but we could restrict it to --release builds.

@felix91gr
Copy link
Contributor

@cbeuw I think you can do the transitive closure computation if you're restricted to safe Rust, but I have the feeling it might be undecidable in unsafe Rust.

My thinking is: since you could have an arbitrary pointer somewhere, you could have a pointer that's part of the transitive closure and which happens to sometiemes point to the struct itself. And to answer whether or not that is the case if given arbitrary unsafe code as the base seems to me like it should be undecidable.
On the other hand, would that code be considered to be sound? I don't remember if having a self-referential loop through unsafe code was undefined behavior or not.

@RalfJung
Copy link
Member

RalfJung commented Mar 6, 2021

Sound self-referential types must be !Unpin, right? So for &mut impl Unpin, we could still safely emit that attribute?

@bjorn3
Copy link
Member

bjorn3 commented Mar 6, 2021

Self-referential types may be hidden. Instead a crate could export an opaque type containing a box of the self-referential type. In this case it is not necessary for soundness for the self-referential type to be !Unpin.

@RalfJung
Copy link
Member

RalfJung commented Mar 6, 2021

Fair. But all this is a hack anyway, so it might still be a reasonable trade-off.

bors added a commit to rust-lang-ci/rust that referenced this issue Mar 21, 2021
Enable mutable noalias for LLVM >= 12

Enable mutable noalias by default on LLVM 12, as previously known miscompiles have been resolved. Now it's time to find the next one ;)

 * The `-Z mutable-noalias` option no longer has an explicit default and accepts `-Z mutable-noalias=yes` and `-Z mutable-noalias=no` to override the LLVM version based default behavior.
 * The decision on whether to apply the noalias attribute is moved into rustc_codegen_llvm. rustc_middle only provides us with the necessary information to make the decision.
 * `noalias` is not emitted for types that are `!Unpin`, as a heuristic for self-referential structures (see rust-lang#54878 and rust-lang#63818).
@nikic
Copy link
Contributor

nikic commented Mar 22, 2021

Noalias annotations have been re-enabled in #82834. Place your bets on when it will get reverted ;)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-codegen Area: Code generation A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests