Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integer range checks should lift outside of lossless truncations #60683

Open
scottmcm opened this issue Feb 12, 2023 · 0 comments
Open

Integer range checks should lift outside of lossless truncations #60683

scottmcm opened this issue Feb 12, 2023 · 0 comments

Comments

@scottmcm
Copy link

I was experimenting in Rust with implementing the ASCII versions of certain unicode predicates in terms of an implementation on ASCII bytes, something like this https://rust.godbolt.org/z/c495YKTsM

pub fn demo_is_ascii_digit(c: char) -> bool {
    if let Some(c) = AsciiChar::new(c) {
        c.is_digit()
    } else {
        false
    }
}

Unfortunately, that optimizes much worse than checking against the 32-bit type directly:

define noundef zeroext i1 @_ZN10playground19demo_is_ascii_digit17h37e7c57418c2459fE(i32 noundef %c) unnamed_addr #0 {
start:
  %_4.i = icmp ult i32 %c, 128
  %_7.i = trunc i32 %c to i8
  %0 = add i8 %_7.i, -48
  %1 = icmp ult i8 %0, 10
  %.0 = and i1 %_4.i, %1
  ret i1 %.0
}

define noundef zeroext i1 @_ZN10playground18std_is_ascii_digit17he953dd4cafd53a1eE(i32 noundef %0) unnamed_addr #0 {
start:
  %1 = add i32 %0, -48
  %2 = icmp ult i32 %1, 10
  ret i1 %2
}

Alive2 confirms that converting the former there into the latter is legal: https://alive2.llvm.org/ce/z/_vhQJE

But opt seemingly can't do it today: https://llvm.godbolt.org/z/zdsEsvrsP

Perhaps the way to get there would be to do something like this:

 start:
   %_4.i = icmp ult i32 %c, 128
-  %_7.i = trunc i32 %c to i8
-  %0 = add i8 %_7.i, -48
+  %0 = add i32 %c, -48
-  %1 = icmp ult i8 %0, 10
+  %1 = icmp ult i32 %0, 10
   %.0 = and i1 %_4.i, %1
   ret i1 %.0

Since alive2 says that's legal https://alive2.llvm.org/ce/z/bETvXP and after that other existing optimizations can get the rest of the way: https://llvm.godbolt.org/z/aEhThW96M

I tried to make a C++ repro as well, which came out with a slightly different, but still suboptimal, output: https://cpp.godbolt.org/z/r5xGxa8bc

bool is_ascii_char(uint32_t c) {
    if (c < 128) {
        auto b = (uint8_t)c;
        return (b >= (uint8_t)'0') && (b <= (uint8_t)'9');
    } else {
        return false;
    }
}
define dso_local noundef zeroext i1 @_Z13is_ascii_charj(i32 noundef %0) local_unnamed_addr #0 {
  %2 = icmp ult i32 %0, 128
  %3 = and i32 %0, 254    ; <-- the comparisons got widened, but the `and` didn't disappear
  %4 = add nsw i32 %3, -48
  %5 = icmp ult i32 %4, 10
  %6 = select i1 %2, i1 %5, i1 false
  ret i1 %6
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants