Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missed opportunities to eliminate bounds checks #35981

Closed
StefanoD opened this issue Aug 25, 2016 · 11 comments
Closed

Missed opportunities to eliminate bounds checks #35981

StefanoD opened this issue Aug 25, 2016 · 11 comments
Labels
A-codegen Area: Code generation C-enhancement Category: An issue proposing an enhancement or a PR with one. E-hard Call for participation: Hard difficulty. Experience needed to fix: A lot. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@StefanoD
Copy link

A user reported a code example that shows that in Go 1.7 there is bounds check elimination and in rust not.

https://users.rust-lang.org/t/bounds-check-elimination-in-go-1-7/7008

@jrmuizel
Copy link
Contributor

A reduced test case would help here.

@StefanoD
Copy link
Author

The thread has a simple test case. Is this not sufficient? Maybe the author of the thread could write something here. I linked this bug report in the thread.

@jrmuizel
Copy link
Contributor

It's not clear from the thread exactly which bounds checks are not being eliminated that are eliminated by Go. A reduced test case would ideally point out a single array access that has a bounds check in Rust but not in Go.

@brson
Copy link
Contributor

brson commented Aug 26, 2016

There are many examples in the linked thread, and a quick read doesn't make it obvious where there are bounds checks to eliminate. Clearly there are optimization opportunities here, but this thread needs reduced examples with a clear description of which bounds checks are not being removed.

@brson brson added I-slow Issue: Problems and improvements with respect to performance of generated code. A-codegen Area: Code generation T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. E-hard Call for participation: Hard difficulty. Experience needed to fix: A lot. labels Aug 26, 2016
@brson brson changed the title Bounds Check Elimination Missed opportunities to eliminate bounds checks Aug 26, 2016
@Deewiant
Copy link
Contributor

Deewiant commented Sep 16, 2016

Disclaimer: I know very little Rust and even less Go. I'm somewhat familiar with LLVM, though, so I took a look, comparing the original Go results at http://www.tapirgames.com/blog/golang-1.7-bce to the Rust as given in the forum post.

I'm using rustc 1.11.0 and LLVM 3.8.1.

Note that the Rust translations do not seem to be quite equivalent: the Go code seems to perform no-op accesses on array slices whereas the Rust versions increment the array values. This complicates the generated code heavily, so to simplify the analysis I replaced increment ops like s[i] += 1 with no-op accesses like s[i]. This did not seem to have an effect on the amount of bounds checks in the generated code but made my life easier.

I'll go through the functions in order.

f1, f2, f3, f4

rustc -C opt-level=3 eliminates the same bounds checks as Go does, as is to be expected for these simple cases.

f5

Here's the discrepancy. The Go code:

func f5(s []int) {
    for i := range s {
        _ = s[i]
        _ = s[i:len(s)]
        _ = s[:i+1]
    }
}

The Rust translation from the forum post, with increment operations removed and usize changed to isize to be pedantic:

fn f5(s: &mut [isize]) {
    for i in 0 .. s.len() {
        s[i];
        for j in i .. s.len() { s[j]; }
        for j in 0 .. i + 1 { s[j]; }
    }
}

Looking at the output of rustc -C opt-level=3 --emit llvm-ir, the first access and the first inner loop are completely optimized out here, but the second inner loop remains, and there is a bounds check on the s[j] access therein.

Constructing what I believe to be equivalent C we can see whether this is a rustc or LLVM problem:

#include <stdlib.h>
void f5(long* s, size_t len) {
    for (size_t i = 0; i < len; i++) {
        for (size_t j = 0; j <= i + 1; j++) {
            if (j >= len) {
                abort();
            }
        }
    }
}

The LLVM IR output by clang -O3 -S -emit-llvm is practically identical, so we can't blame rustc here: LLVM doesn't realize that j >= len will never be true here.

Going back to the original Go, we see that it's constructing array slices instead of doing individual element access like the Rust translation. So we could write a Rust version of the function that loops over slices instead:

fn f5(s: &mut [isize]) {
    for i in 0 .. s.len() {
        s[i];
        let len = s.len();
        for x in s[i .. len].iter_mut() { *x; }
        for x in s[0 .. i + 1].iter_mut() { *x; }
    }
}

No bounds checks are emitted for the above code, but there's still a no-op loop that is not eliminated. I'm not sure why that happens but I'm sure it's tied with the bounds check issues.

Going even closer to the original Go:

fn f5(s: &mut [isize]) {
    for i in 0 .. s.len() {
        s[i];
        let len = s.len();
        &s[i .. len];
        &s[0 .. i + 1];
    }
}

The bounds check is back: LLVM doesn't realize that i + 1 > len cannot be true.

f6

Codegen is practically identical to f5, no further insights available here.

f7, f8, f9

Same results as Go, no problems here.

Conclusions

I made some incorrect conclusions here, please see my comment below for better ones. LLVM bug 810 is not strongly related to this and my examples aren't entirely relevant.

In the end I think this boils down to the now over ten years old LLVM bug 810. (There may be more specific bugs for these kinds of use cases related to array bounds checks but I didn't find any.) LLVM simply doesn't have the smarts required to perform reasoning like "if i < n, then we know that !(i + 1 > n)". The following C function, which LLVM 3.8.1 can't optimize to a no-op, demonstrates this fairly succintly (doing the subtraction with a signed int is required so that LLVM can assume that it does not overflow — I believe that's how Rust's integers work as well?):

#include <stdlib.h>
void check(unsigned int i, unsigned int len) {
    if (i < len) {
        if (i > (int)len - 1) {
            abort();
        }
    }
}

Note that if the inner condition is changed to the equivalent i >= len, LLVM does know that it's false, presumably because it recognizes it as the negation of i < len. This suggests that if LLVM were to canonicalize i > x - 1 and i + 1 > x as i >= x this particular case would be solved. But this more obtuse example would still have a problem:

fn obtuse(s: &mut [isize]) {
    if s.len() >= 2 {
        for i in 0 .. s.len() {
            if i < s.len() - 2 {
                &s[0 .. i + 2];
            }
        }
    }
}

Just like this C code does:

#include <stdlib.h>
void check(unsigned int i, unsigned int len) {
    if (len >= 2) {
        if (i < len - 2) {
            if ((int)i + 2 > len) {
                abort();
            }
        }
    }
}

For what it's worth, GCC 6.2.1 can't eliminate the checks in any of the above C snippets either, so this doesn't seem like a major deficiency in LLVM. In the end, kudos to the Go folks for beating industry standard C compilers, but this seems like LLVM's problem, not Rust's.

@StefanoD
Copy link
Author

StefanoD commented Sep 16, 2016

So, does it make any sense to extend the old LLVM bug report?
@Deewiant Thank you, very much!!!

@Deewiant
Copy link
Contributor

No, I don't think it does. I thought about this some more and did my research more thoroughly and came to the conclusion that while a fix for the old bug 810 might be helpful it's not really pertinent to this case. I'll edit my comment.

Instead, this is a combo of two things:

  1. LLVM does in fact canonicalize i + 1 > x as i >= x (under the assumption that i + 1 does not overflow), but only for signed comparisons! I submitted a patch for the unsigned case, so hopefully this will be taken care of soon: https://reviews.llvm.org/D24700. In terms of the Rust examples this would allow recognizing that i + 1 > s.len() is equivalent to i >= s.len(), which is then recognized as the negation of i < s.len() and so can be assumed to be false inside the loop.
  2. LLVM can't recognize that i + 1 does not overflow even though we know that i < s.len(). Previously I assumed that Rust integers had undefined behaviour on overflow, but I noticed that the emitted LLVM didn't match that assumption and Myths and Legends about Integer Overflow in Rust confirmed that this is not the case. Therefore we need LLVM to infer that overflow is not possible here, so that it can proceed with the canonicalization of i + 1 > x to i >= x. I filed LLVM#30428 for this because I'm not sure how it should be implemented.

For the general case such as my more obtuse example where I used len - 2 and i + 2, I submitted LLVM#30429, but that one's not important for this issue.

@StefanoD
Copy link
Author

StefanoD commented Sep 17, 2016

I appreciate your efforts! – Thank you, very much!

@Mark-Simulacrum Mark-Simulacrum added the C-enhancement Category: An issue proposing an enhancement or a PR with one. label Jul 26, 2017
@nox
Copy link
Contributor

nox commented Apr 2, 2018

@eddyb I wonder if the root cause of this issue is related to the one of #13018.

Real world code sometimes cannot just use the slice iterator etc and must instead rely on using ranges of indices, it would be nice to be able to fix this, though I suspect this is a pretty difficult task.

Godbolt playground with f5 and obtuse, where we can see the bound checks still exist in current nightly: https://godbolt.org/g/e4NpSz

Cc @rust-lang/wg-codegen

@Restioson
Copy link

Restioson commented May 1, 2018

Perhaps miri could help here to more generally find where bounds checks can be removed? I don't know much about how, but perhaps if it could figure out the way to push up the index as much as it can then it can interpret it and figure out whether it's still out of bounds in that situation?

@nikic
Copy link
Contributor

nikic commented Feb 27, 2021

Looks like there's no bounds checks on f5 either anymore (since 1.45): https://rust.godbolt.org/z/f3aja5

With that, I think we can call this resolved, and other bounds check elimination failures should be reported separately.

@nikic nikic closed this as completed Feb 27, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-codegen Area: Code generation C-enhancement Category: An issue proposing an enhancement or a PR with one. E-hard Call for participation: Hard difficulty. Experience needed to fix: A lot. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

8 participants