core: add likely and unlikely intrinsics #36181

seanmonstar · 2016-08-31T23:12:16Z

I'm no good at reading assembly, but I have tried a stage1 compiler with this patch, and it does cause different asm output. Additionally, testing this compiler on my httparse crate with some likely usage added in to the branches does affect benchmarks. However, I'm sure a codegen test should be included, if anyone knows what it should look like.

There isn't an entry in librustc_trans/context.rs in this diff, because it already exists (llvm.expect.i1 is used for array indices).

Even though this does affect httparse benchmarks, it doesn't seem to affect it the same way GCC's __builtin_expect affects picohttpparser. I was confused that the deviation on the benchmarks grew hugely when testing this, especially since I'm absolutely certain that the branchs where I added likely were always true. I chalk that up to GCC and LLVM handle branch prediction differently.

cc #26179

rust-highfive · 2016-08-31T23:12:28Z

r? @nikomatsakis

(rust_highfive has picked a reviewer for you, use r? to override)

seanmonstar · 2016-08-31T23:13:18Z

src/libcore/intrinsics.rs

@@ -194,6 +194,14 @@ extern "rust-intrinsic" {
    /// own, or if it does not enable any significant optimizations.
    pub fn assume(b: bool);

+    #[cfg(not(stage0))]
+    /// dox


I know the docs need filling in.

Aatch · 2016-09-01T00:36:32Z

The main thing to check is that this:

pub fn test(a: u32, b: u32) -> Option<u32> {
    if likely(a == b) {
        None
    } else {
        Some(a + b)
    }
}

Produces something similar to this (without optimisations on):

define i64 @test(i32, i32) unnamed_addr #0 {
entry-block:
  %sret_slot = alloca %"2.std::option::Option<u32>"
  %return = alloca %"2.std::option::Option<u32>"
  br label %start

start:                                            ; preds = %entry-block
  %2 = icmp eq i32 %0, %1
  %3 = call zeroext i1 @llvm.expect.i1(i1 zeroext %2, true)
  br label %bb1

bb1:                                              ; preds = %start
  br i1 %3, label %bb2, label %bb3

bb2:                                              ; preds = %bb1
  %4 = getelementptr inbounds %"2.std::option::Option<u32>", %"2.std::option::Option<u32>"* %return, i32 0, i32 0
  store i32 0, i32* %4
  br label %bb5

bb3:                                              ; preds = %bb1
  %5 = call { i32, i1 } @llvm.uadd.with.overflow.i32(i32 %0, i32 %1)
  %6 = extractvalue { i32, i1 } %5, 0
  %7 = extractvalue { i32, i1 } %5, 1
  %8 = call i1 @llvm.expect.i1(i1 %7, i1 false)
  br i1 %8, label %panic, label %bb4

bb4:                                              ; preds = %bb3
  %9 = getelementptr inbounds %"2.std::option::Option<u32>", %"2.std::option::Option<u32>"* %return, i32 0, i32 0
  store i32 1, i32* %9
  %10 = bitcast %"2.std::option::Option<u32>"* %return to { i32, i32 }*
  %11 = getelementptr inbounds { i32, i32 }, { i32, i32 }* %10, i32 0, i32 1
  store i32 %6, i32* %11
  br label %bb5

bb5:                                              ; preds = %bb2, %bb4
  %12 = bitcast %"2.std::option::Option<u32>"* %return to i64*
  %13 = load i64, i64* %12, align 4
  ret i64 %13

panic:                                            ; preds = %bb3
  call void @_ZN4core9panicking5panic17h44f94ad2f4e3e170E({ %str_slice, %str_slice, i32 }* @panic_loc7544)
  unreachable
}

The point of interest being this:

  %2 = icmp eq i32 %0, %1
  %3 = call zeroext i1 @llvm.expect.i1(i1 zeroext %2, true)
  br label %bb1

As far as I know, this pattern is pretty much what LLVM is looking for when lowering the expect intrinsic, and won't detect much else. The main thing (and what we didn't have before) is the immediate use of the result of the intrinsic in the branch condition.

Note that the IR here was generated with a close-to-master build of the compiler, and tweaked slightly by hand so I'm confident that this patch is producing the correct IR.

Aatch · 2016-09-01T00:46:06Z

For the documentation, it should definitely be noted that any use other than if likely(cond) { ... } will likely not have an effect.

nagisa · 2016-09-01T20:13:02Z

Previous PR: #26429

nagisa · 2016-09-01T20:14:33Z

Closes #26179

Should we be closing tracking issues before we expose the stable interfaces to the intrinsics?

nagisa · 2016-09-01T20:36:53Z

I can’t seem to coerce clang > 3.6 into generating this intrinsic OR branch weight metadata. I have a feeling that this, combined with your observed increase in variance (I observed a similar one a while ago), may be a good indicator that we ought not to implement any such intrinsic at all, or at least find a number of cases which show a very clear benefit.

nagisa · 2016-09-01T20:49:29Z

#llvm told me expect intrinsic is more harmful than it is helpful and that the branch weight metadata should be emitted instead. Might be worth to go back and reconsider the RFC, since the branch weight metadata is more powerful and expressive and would be better expressed as attributes in source too.

why is clang > 3.6 not emiting expect intrinsic for __builtin_expect?
nagisa: it turns out that the expect intrinsic overall harmed the optimizer by inhibiting certain optimizations
so we now directly lower __builtin_expect to branch probabilities in the frontend when generating IR

seanmonstar · 2016-09-02T00:10:56Z

@nagisa I had similar feelings. I am also left wondering if the current llvm.expect.i1 usage internally (bounds checks, overflow checks, ...) might actually likewise cause slowness instead of optimizations.

seanmonstar · 2016-09-02T01:06:40Z

Pushed documentation and a codegen test.

Aatch · 2016-09-02T07:01:48Z

@nagisa that was something I thought about. I think we should merge this PR for now, then look at improving the implementation. The advantage of using a function (i.e. intrinsic) is that they just automatically handle stuff like short-circuiting, it's also not clear where the best place for attributes would be in an if-else(-if) expression.

nagisa · 2016-09-02T13:07:14Z

I think we should merge this PR for now, then look at improving the implementation.

My concern here that there might be a nicer way to take advantage of branch metadata and expose more power. For example this compiles (with some features enabled):

match Ok::<_, ()>(0) {
    Ok(0) => { 
        #![most_likely] 
        ::std::process::exit(0) 
    },
    Ok(_) => { 
        #![not_that_likely] 
        ::std::process::exit(1) 
    },
    Err(_) => { #![unlikely] ::std::process::exit(42) }
}

I feel like something similar could be made possible by allowing attributes for branch bodies in if-else.

I also think likely/unlikely aren’t very easy to work with and would be quite hard to implement them to lower to branch metadata.

I am also left wondering if the current llvm.expect.i1 usage internally (bounds checks, overflow checks, ...) might actually likewise cause slowness instead of optimizations.

Its certainly a possibility, especially for non-x86 targets.

nikomatsakis · 2016-09-02T18:31:20Z

Hmm. I don't really know what to do here. =)

@rust-lang/compiler -- thoughts? The summary is that @seanmonstar did awesome work implementing likely/unlikely intrinsics, but now there is some second guessing as to whether this is the right approach. Should we land this work? Back off and redesign a bit more?

Right now I'm leaning towards: land the PR, carry on this discussion in the tracking issue -- but this kind of feedback is definitely a negative strike for stabilization.

nikomatsakis · 2016-09-02T18:36:43Z

The implementation at least looks good to me. Simple.

nagisa · 2016-09-02T18:55:08Z

src/test/codegen/likely.rs

+#[no_mangle]
+pub fn check_likely(x: i32, y: i32) -> Option<i32> {
+    unsafe {
+        // CHECK: call i1 @llvm.expect.i1(i1 %{{.*}}, i1 true)


Could you check some surroundings around this intrinsic too? I.e. that icmp or something goes before the intrinsic and that branch goes right after the intrinsic?

It would look something along the lines of

CHECK: [[cond:%[0-9]+]] = icmp ... CHECK-NEXT: call i1 @llvm.expect.i1(i1 [[cond]], i1 true) CHECK-NEXT: br [[cond]] ...

It seems there's a couple of casts and things in between. This is the relevant part of the likely.ll file:

start: %2 = icmp eq i32 %0, %1 %3 = call i1 @llvm.expect.i1(i1 %2, i1 true) %4 = zext i1 %3 to i8 store i8 %4, i8* %tmp_ret %5 = load i8, i8* %tmp_ret, !range !0 %6 = trunc i8 %5 to i1 br label %bb1 bb1: br i1 %6, label %bb2, label %bb3

nagisa · 2016-09-03T01:23:18Z

Okay, I guess I’m fine with landing this in its current state as an unstable intrinsic, but I’ll have my reservations when we get to stabilisation effort.

seanmonstar · 2016-09-03T02:39:14Z

I'm not sure why travis is failing. Looking through the raw log, I don't find an instance of a test failing... It just claims to "exit with 2"...

durka · 2016-09-03T17:52:08Z

@seanmonstar it's just #36138.

seanmonstar · 2016-09-08T18:29:28Z

Good to go?

nikomatsakis · 2016-09-13T14:13:52Z

@bors r+

bors · 2016-09-13T14:13:53Z

📌 Commit b778f7f has been approved by nikomatsakis

nikomatsakis · 2016-09-13T14:14:04Z

@nagisa

I’ll have my reservations when we get to stabilisation effort.

agreed

bors · 2016-09-13T15:16:47Z

⌛ Testing commit b778f7f with merge e4ee8e8...

bors · 2016-09-13T15:24:06Z

💔 Test failed - auto-win-gnu-32-opt

alexcrichton · 2016-09-13T15:26:18Z

@bors: retry

On Tue, Sep 13, 2016 at 8:24 AM, bors notifications@github.com wrote:

💔 Test failed - auto-win-gnu-32-opt
https://buildbot.rust-lang.org/builders/auto-win-gnu-32-opt/builds/5621

—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
#36181 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/AAD95KGY5i525JGsV1eEQ81_CkzN8MlBks5qpsAZgaJpZM4JyKM4
.

bors · 2016-09-13T15:26:43Z

⌛ Testing commit b778f7f with merge bd254b8...

alexcrichton · 2016-09-13T15:35:16Z

@bors: retry force clean

bors · 2016-09-13T17:54:55Z

⌛ Testing commit b778f7f with merge 2fd0608...

core: add likely and unlikely intrinsics I'm no good at reading assembly, but I have tried a stage1 compiler with this patch, and it does cause different asm output. Additionally, testing this compiler on my httparse crate with some `likely` usage added in to the branches does affect benchmarks. However, I'm sure a codegen test should be included, if anyone knows what it should look like. There isn't an entry in `librustc_trans/context.rs` in this diff, because it already exists (`llvm.expect.i1` is used for array indices). ---- Even though this does affect httparse benchmarks, it doesn't seem to affect it the same way GCC's `__builtin_expect` affects picohttpparser. I was confused that the deviation on the benchmarks grew hugely when testing this, especially since I'm absolutely certain that the branchs where I added `likely` were always `true`. I chalk that up to GCC and LLVM handle branch prediction differently. cc #26179

bors · 2016-09-13T22:08:10Z

rust-highfive assigned nikomatsakis Aug 31, 2016

seanmonstar reviewed Aug 31, 2016
View reviewed changes

seanmonstar force-pushed the likely branch from b9e66a5 to 068dec8 Compare September 2, 2016 01:06

core: add likely and unlikely intrinsics

b778f7f

seanmonstar force-pushed the likely branch from 068dec8 to b778f7f Compare September 2, 2016 17:29

nagisa mentioned this pull request Sep 2, 2016

Tracking issue for stmt_expr_attributes: Add attributes to expressions, etc. #15701

Open

5 tasks

nagisa reviewed Sep 2, 2016
View reviewed changes

Aatch mentioned this pull request Sep 2, 2016

Meta tracking issue for branch hints (RFC 1131) #26179

Open

bors merged commit b778f7f into rust-lang:master Sep 13, 2016

pmarcelll mentioned this pull request Sep 18, 2016

Implement attributes for branch weight #11092

Closed

core: add likely and unlikely intrinsics #36181

core: add likely and unlikely intrinsics #36181

Uh oh!

Conversation

seanmonstar commented Aug 31, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rust-highfive commented Aug 31, 2016

Uh oh!

seanmonstar Aug 31, 2016

Choose a reason for hiding this comment

Uh oh!

Aatch commented Sep 1, 2016

Uh oh!

Aatch commented Sep 1, 2016

Uh oh!

nagisa commented Sep 1, 2016

Uh oh!

nagisa commented Sep 1, 2016

Uh oh!

nagisa commented Sep 1, 2016

Uh oh!

nagisa commented Sep 1, 2016

Uh oh!

seanmonstar commented Sep 2, 2016

Uh oh!

seanmonstar commented Sep 2, 2016

Uh oh!

Aatch commented Sep 2, 2016

Uh oh!

nagisa commented Sep 2, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

nikomatsakis commented Sep 2, 2016

Uh oh!

nikomatsakis commented Sep 2, 2016

Uh oh!

nagisa Sep 2, 2016

Choose a reason for hiding this comment

Uh oh!

seanmonstar Sep 3, 2016

Choose a reason for hiding this comment

Uh oh!

nagisa commented Sep 3, 2016

Uh oh!

seanmonstar commented Sep 3, 2016

Uh oh!

durka commented Sep 3, 2016

Uh oh!

seanmonstar commented Sep 8, 2016

Uh oh!

nikomatsakis commented Sep 13, 2016

Uh oh!

bors commented Sep 13, 2016

Uh oh!

nikomatsakis commented Sep 13, 2016

Uh oh!

bors commented Sep 13, 2016

Uh oh!

bors commented Sep 13, 2016

Uh oh!

alexcrichton commented Sep 13, 2016

Uh oh!

bors commented Sep 13, 2016

Uh oh!

alexcrichton commented Sep 13, 2016

Uh oh!

bors commented Sep 13, 2016

Uh oh!

bors commented Sep 13, 2016

Uh oh!

Uh oh!

seanmonstar commented Aug 31, 2016 •

edited

Loading

nagisa commented Sep 2, 2016 •

edited

Loading