core: add likely and unlikely intrinsics #36181
(rust_highfive has picked a reviewer for you, use r? to override) |
@@ -194,6 +194,14 @@ extern "rust-intrinsic" {
    /// own, or if it does not enable any significant optimizations.
    pub fn assume(b: bool);

    #[cfg(not(stage0))]
    /// dox
I know the docs need filling in.
The main thing to check is that this:
pub fn test(a: u32, b: u32) -> Option<u32> {
if likely(a == b) {
None
} else {
Some(a + b)
}
}
produces something similar to this (without optimisations on):
define i64 @test(i32, i32) unnamed_addr #0 {
entry-block:
%sret_slot = alloca %"2.std::option::Option<u32>"
%return = alloca %"2.std::option::Option<u32>"
br label %start
start: ; preds = %entry-block
%2 = icmp eq i32 %0, %1
%3 = call zeroext i1 @llvm.expect.i1(i1 zeroext %2, true)
br label %bb1
bb1: ; preds = %start
br i1 %3, label %bb2, label %bb3
bb2: ; preds = %bb1
%4 = getelementptr inbounds %"2.std::option::Option<u32>", %"2.std::option::Option<u32>"* %return, i32 0, i32 0
store i32 0, i32* %4
br label %bb5
bb3: ; preds = %bb1
%5 = call { i32, i1 } @llvm.uadd.with.overflow.i32(i32 %0, i32 %1)
%6 = extractvalue { i32, i1 } %5, 0
%7 = extractvalue { i32, i1 } %5, 1
%8 = call i1 @llvm.expect.i1(i1 %7, i1 false)
br i1 %8, label %panic, label %bb4
bb4: ; preds = %bb3
%9 = getelementptr inbounds %"2.std::option::Option<u32>", %"2.std::option::Option<u32>"* %return, i32 0, i32 0
store i32 1, i32* %9
%10 = bitcast %"2.std::option::Option<u32>"* %return to { i32, i32 }*
%11 = getelementptr inbounds { i32, i32 }, { i32, i32 }* %10, i32 0, i32 1
store i32 %6, i32* %11
br label %bb5
bb5: ; preds = %bb2, %bb4
%12 = bitcast %"2.std::option::Option<u32>"* %return to i64*
%13 = load i64, i64* %12, align 4
ret i64 %13
panic: ; preds = %bb3
call void @_ZN4core9panicking5panic17h44f94ad2f4e3e170E({ %str_slice, %str_slice, i32 }* @panic_loc7544)
unreachable
}
The point of interest being this:
%2 = icmp eq i32 %0, %1
%3 = call zeroext i1 @llvm.expect.i1(i1 zeroext %2, true)
br label %bb1
As far as I know, this pattern is pretty much what LLVM is looking for when lowering the Note that the IR here was generated with a close-to-master build of the compiler, and tweaked slightly by hand, so I'm confident that this patch is producing the correct IR.
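For anyone who wants to poke at the shape of `test` without a compiler built from this patch, here is a minimal sketch for stable Rust. It assumes a stand-in `likely` built from a `#[cold]` helper, which is a common workaround and not the intrinsic added in this PR, so it should not be expected to emit `llvm.expect.i1`. The helper names `cold` and `likely` are local to the sketch.

```rust
/// Hypothetical helper: a call to a `#[cold]` function hints that the
/// path containing the call is rarely taken.
#[cold]
#[inline(never)]
fn cold() {}

/// Stand-in for `likely` (assumption: NOT the intrinsic from this PR).
/// It returns `b` unchanged; only the optimizer's view of the branch differs.
#[inline(always)]
fn likely(b: bool) -> bool {
    if !b {
        cold();
    }
    b
}

/// Same body as the `test` function quoted above.
pub fn test(a: u32, b: u32) -> Option<u32> {
    if likely(a == b) {
        None
    } else {
        Some(a + b)
    }
}
```

The hint never changes the result of the condition, so `test` behaves exactly as it would without `likely`; any difference would only show up in generated code or benchmarks.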
For the documentation, it should definitely be noted that any use other than |
Previous PR: #26429 |
Should we be closing tracking issues before we expose the stable interfaces to the intrinsics? |
I can’t seem to coerce clang > 3.6 into generating this intrinsic OR branch weight metadata. I have a feeling that this, combined with your observed increase in variance (I observed a similar one a while ago), may be a good indicator that we ought not to implement any such intrinsic at all, or at least find a number of cases which show a very clear benefit. |
#llvm told me
@nagisa I had similar feelings. I am also left wondering if the current |
Pushed documentation and a codegen test. |
@nagisa that was something I thought about. I think we should merge this PR for now, then look at improving the implementation. The advantage of using a function (i.e. intrinsic) is that they just automatically handle stuff like short-circuiting, it's also not clear where the best place for attributes would be in an if-else(-if) expression. |
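The short-circuiting point can be sketched as follows. Because the hint is an ordinary function call wrapped around each operand, `&&` keeps its usual evaluation order with no extra rules. This assumes the same `#[cold]`-based stand-in for `likely` rather than the intrinsic itself, and `is_digit_counted` and `both_digits` are hypothetical names used only to make the evaluation count observable.

```rust
use std::cell::Cell;

/// Hypothetical helper marking a rarely taken path.
#[cold]
#[inline(never)]
fn cold() {}

/// Stand-in for `likely` (assumption: not the intrinsic from this PR).
#[inline(always)]
fn likely(b: bool) -> bool {
    if !b {
        cold();
    }
    b
}

/// Hypothetical check that records how many times it was evaluated.
fn is_digit_counted(byte: u8, calls: &Cell<u32>) -> bool {
    calls.set(calls.get() + 1);
    byte.is_ascii_digit()
}

/// Each operand carries its own hint; the right-hand side is evaluated
/// only when the left-hand side is true, exactly as with plain `&&`.
pub fn both_digits(a: u8, b: u8, calls: &Cell<u32>) -> bool {
    likely(is_digit_counted(a, calls)) && likely(is_digit_counted(b, calls))
}
```

With attributes on the branches instead, it is less obvious how a hint would attach to one operand of a compound condition, which is the trade-off raised above.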
My concern here is that there might be a nicer way to take advantage of branch metadata and expose more power. For example, this compiles (with some features enabled):
match Ok::<_, ()>(0) {
    Ok(0) => {
        #![most_likely]
        ::std::process::exit(0)
    },
    Ok(_) => {
        #![not_that_likely]
        ::std::process::exit(1)
    },
    Err(_) => { #![unlikely] ::std::process::exit(42) }
}
I feel like something similar could be made possible by allowing attributes for branch bodies in if-else. I also think
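As a rough point of comparison, a per-arm hint can be approximated on stable Rust today by routing the rare arm through a `#[cold]` function. This is only a sketch of that workaround, not the attribute syntax proposed above, and it gives no way to rank arms against each other the way `most_likely` / `not_that_likely` would. The names `rare_failure` and `exit_code` are hypothetical.

```rust
/// Hypothetical handler for the rare case. `#[cold]` marks calls to it as
/// unlikely, which optimizers may use to treat the calling arm as cold.
#[cold]
#[inline(never)]
fn rare_failure() -> i32 {
    42
}

/// Mirrors the three arms of the `match` above, returning the exit code
/// instead of calling `std::process::exit`.
pub fn exit_code(result: Result<i32, ()>) -> i32 {
    match result {
        Ok(0) => 0,
        Ok(_) => 1,
        Err(_) => rare_failure(),
    }
}
```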
It's certainly a possibility, especially for non-x86 targets. |
Hmm. I don't really know what to do here. =) @rust-lang/compiler -- thoughts? The summary is that @seanmonstar did awesome work implementing likely/unlikely intrinsics, but now there is some second guessing as to whether this is the right approach. Should we land this work? Back off and redesign a bit more? Right now I'm leaning towards: land the PR, carry on this discussion in the tracking issue -- but this kind of feedback is definitely a negative strike for stabilization. |
The implementation at least looks good to me. Simple. |
#[no_mangle]
pub fn check_likely(x: i32, y: i32) -> Option<i32> {
    unsafe {
        // CHECK: call i1 @llvm.expect.i1(i1 %{{.*}}, i1 true)
Could you check the surroundings of this intrinsic too? I.e. that an icmp or something similar goes before the intrinsic and that the branch goes right after it?
It would look something along the lines of:
CHECK: [[cond:%[0-9]+]] = icmp ...
CHECK-NEXT: call i1 @llvm.expect.i1(i1 [[cond]], i1 true)
CHECK-NEXT: br [[cond]] ...
It seems there are a couple of casts and things in between. This is the relevant part of the likely.ll file:
start:
%2 = icmp eq i32 %0, %1
%3 = call i1 @llvm.expect.i1(i1 %2, i1 true)
%4 = zext i1 %3 to i8
store i8 %4, i8* %tmp_ret
%5 = load i8, i8* %tmp_ret, !range !0
%6 = trunc i8 %5 to i1
br label %bb1
bb1:
br i1 %6, label %bb2, label %bb3
Okay, I guess I’m fine with landing this in its current state as an unstable intrinsic, but I’ll have my reservations when we get to stabilisation effort. |
I'm not sure why travis is failing. Looking through the raw log, I don't find an instance of a test failing... It just claims to "exit with 2"... |
@seanmonstar it's just #36138. |
Good to go? |
@bors r+ |
📌 Commit b778f7f has been approved by |
agreed |
⌛ Testing commit b778f7f with merge e4ee8e8... |
💔 Test failed - auto-win-gnu-32-opt |
@bors: retry On Tue, Sep 13, 2016 at 8:24 AM, bors notifications@github.com wrote:
⌛ Testing commit b778f7f with merge bd254b8... |
@bors: retry force clean |
⌛ Testing commit b778f7f with merge 2fd0608... |
core: add likely and unlikely intrinsics

I'm no good at reading assembly, but I have tried a stage1 compiler with this patch, and it does cause different asm output. Additionally, testing this compiler on my httparse crate with some `likely` usage added to the branches does affect benchmarks. However, I'm sure a codegen test should be included, if anyone knows what it should look like.

There isn't an entry in `librustc_trans/context.rs` in this diff, because it already exists (`llvm.expect.i1` is used for array indices).

----

Even though this does affect httparse benchmarks, it doesn't seem to affect them the same way GCC's `__builtin_expect` affects picohttpparser. I was confused that the deviation on the benchmarks grew hugely when testing this, especially since I'm absolutely certain that the branches where I added `likely` were always `true`. I chalk that up to GCC and LLVM handling branch prediction differently.

cc #26179
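To give a feel for the kind of parser branch the description is talking about, here is a small sketch of a byte-level loop with the error paths marked as unlikely. It is not taken from httparse; `parse_u32` is a hypothetical function, and `unlikely` is again a `#[cold]`-based stand-in for the intrinsic so that the sketch builds on stable Rust.

```rust
/// Hypothetical helper marking a rarely taken path.
#[cold]
#[inline(never)]
fn cold() {}

/// Stand-in for `unlikely` (assumption: not the intrinsic from this PR).
#[inline(always)]
fn unlikely(b: bool) -> bool {
    if b {
        cold();
    }
    b
}

/// Parses an all-digit ASCII byte string into a `u32`.
/// Returns `None` for empty input, any non-digit byte, or overflow.
pub fn parse_u32(bytes: &[u8]) -> Option<u32> {
    if unlikely(bytes.is_empty()) {
        return None;
    }
    let mut value: u32 = 0;
    for &byte in bytes {
        // Well-formed input is the hot path; a bad byte is the rare case.
        if unlikely(!byte.is_ascii_digit()) {
            return None;
        }
        let digit = u32::from(byte - b'0');
        value = value.checked_mul(10)?.checked_add(digit)?;
    }
    Some(value)
}
```

Whether hints like these help is exactly what the benchmark variance above calls into question, so any real use would want measuring with and without them.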