Commit 8abf920

Auto merge of #117722 - okaneco:binarysearch, r=thomcc
Refactor `binary_search_by` to use conditional moves

Refactor the if/else checking on `cmp::Ordering` variants to a "branchless" reassignment of left and right. This change results in fewer branches and instructions. https://rust.godbolt.org/z/698eYffTx

---

I saw consistent benchmark improvements locally. Performance of worst case seems about the same, maybe slightly faster for the L3 test.

Current
```
slice::binary_search_l1               43.00ns/iter +/- 3.00ns
slice::binary_search_l1_with_dups     25.00ns/iter +/- 0.00ns
slice::binary_search_l1_worst_case    10.00ns/iter +/- 0.00ns
slice::binary_search_l2               64.00ns/iter +/- 1.00ns
slice::binary_search_l2_with_dups     42.00ns/iter +/- 0.00ns
slice::binary_search_l2_worst_case    16.00ns/iter +/- 0.00ns
slice::binary_search_l3              132.00ns/iter +/- 2.00ns
slice::binary_search_l3_with_dups    108.00ns/iter +/- 2.00ns
slice::binary_search_l3_worst_case    33.00ns/iter +/- 3.00ns
```

This PR
```
slice::binary_search_l1               21.00ns/iter +/- 0.00ns
slice::binary_search_l1_with_dups     14.00ns/iter +/- 0.00ns
slice::binary_search_l1_worst_case     9.00ns/iter +/- 0.00ns
slice::binary_search_l2               34.00ns/iter +/- 0.00ns
slice::binary_search_l2_with_dups     23.00ns/iter +/- 0.00ns
slice::binary_search_l2_worst_case    16.00ns/iter +/- 0.00ns
slice::binary_search_l3               92.00ns/iter +/- 3.00ns
slice::binary_search_l3_with_dups     63.00ns/iter +/- 1.00ns
slice::binary_search_l3_worst_case    29.00ns/iter +/- 0.00ns
```
2 parents 41fe75e + d585eec commit 8abf920

1 file changed, +8 -9 lines changed
library/core/src/slice/mod.rs

@@ -6,7 +6,7 @@

 #![stable(feature = "rust1", since = "1.0.0")]

-use crate::cmp::Ordering::{self, Greater, Less};
+use crate::cmp::Ordering::{self, Equal, Greater, Less};
 use crate::fmt;
 use crate::intrinsics::{assert_unsafe_precondition, exact_div};
 use crate::marker::Copy;
@@ -2854,14 +2854,13 @@ impl<T> [T] {
             // we have `left + size/2 < self.len()`, and this is in-bounds.
             let cmp = f(unsafe { self.get_unchecked(mid) });

-            // The reason why we use if/else control flow rather than match
-            // is because match reorders comparison operations, which is perf sensitive.
-            // This is x86 asm for u8: https://rust.godbolt.org/z/8Y8Pra.
-            if cmp == Less {
-                left = mid + 1;
-            } else if cmp == Greater {
-                right = mid;
-            } else {
+            // This control flow produces conditional moves, which results in
+            // fewer branches and instructions than if/else or matching on
+            // cmp::Ordering.
+            // This is x86 asm for u8: https://rust.godbolt.org/z/698eYffTx.
+            left = if cmp == Less { mid + 1 } else { left };
+            right = if cmp == Greater { mid } else { right };
+            if cmp == Equal {
                 // SAFETY: same as the `get_unchecked` above
                 unsafe { crate::intrinsics::assume(mid < self.len()) };
                 return Ok(mid);
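
For readers who want to try the pattern outside of `core`, the changed lines can be placed in a small standalone function. This is only a sketch: `branchless_search_by` is a hypothetical free function, not the library method. The loop around the changed lines (the `size` bookkeeping and the final `Err(left)`) is an assumption about the surrounding code that the diff does not show. The sketch also uses safe indexing instead of `get_unchecked` and `assume`, so the generated assembly may differ from what the godbolt link shows.

```rust
use std::cmp::Ordering::{self, Equal, Greater, Less};

/// Hypothetical standalone helper illustrating the reassignment pattern
/// from the diff. Returns `Ok(index)` of a matching element, or
/// `Err(index)` where a matching element could be inserted.
pub fn branchless_search_by<T, F>(slice: &[T], mut f: F) -> Result<usize, usize>
where
    F: FnMut(&T) -> Ordering,
{
    // Assumed loop state: the search window is `left..right`, `size` is its length.
    let mut size = slice.len();
    let mut left = 0;
    let mut right = size;

    while left < right {
        let mid = left + size / 2;
        // Safe indexing here; the library code uses `get_unchecked`.
        let cmp = f(&slice[mid]);

        // Both bounds are reassigned unconditionally. Each `if` only selects
        // a value, which the compiler may lower to a conditional move.
        left = if cmp == Less { mid + 1 } else { left };
        right = if cmp == Greater { mid } else { right };
        if cmp == Equal {
            return Ok(mid);
        }

        size = right - left;
    }

    Err(left)
}
```

For a sorted slice `[1, 3, 5, 7, 9, 11]`, calling `branchless_search_by(&v, |x| x.cmp(&7))` gives `Ok(3)`, and searching for `4` gives `Err(2)`, the index at which `4` could be inserted.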
