Skip to content

ACP: add least_significant_one and most_significant_one to integer types and NonZero types #467

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
orlp opened this issue Oct 21, 2024 · 5 comments
Labels
ACP-accepted API Change Proposal is accepted (seconded with no objections) api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api

Comments

@orlp
Copy link

orlp commented Oct 21, 2024

Proposal

Problem statement

It is common in bit-twiddling code to want to isolate the most significant set bit or the least significant set bit on an integer, for a very wide variety of purposes. Currently in Rust it is quite error prone to do this, for the least significant bit you must remember the following incantation:

let lsb = x & x.wrapping_neg();

and for the most significant bit the following:

let msb = x & ((1 << (T::BITS - 1)).wrapping_shr(x.leading_zeros()));

These are just the most efficient implementations I've come up with, on x86-64 they compile to 1 and 4 branchless instructions respectively:

.lsb
        blsi    rax, rdi

.msb
        lzcnt   rax, rdi
        movabs  rcx, -9223372036854775808
        shrx    rax, rcx, rax
        and     rax, rdi

Most people will likely find or come up with worse implementations, or worse, buggy ones. Off-by-one errors in shift amounts, or accidental wrapping of shift amounts are common.

Motivating examples or use cases

As mentioned, these functions are widely used in all kinds of bit twiddling. Here are two concrete usage examples but these don't even begin to scratch the surface of the ways in which they could be used.

Summing elements from an array masked by a bitset:

fn masked_sum(x: &[f32; 32], mut m: u32) -> f32 {
    let mut s = 0.0;
    while m != 0 {
        s += x[m.trailing_zeros() as usize];
        m ^= m.least_significant_one();
    }
    s
}

Doing a bitwise binary search on an array:

fn first_greater_or_equal(haystack: &[u32], needle: u32) -> usize {
    let mut bit = haystack.len().most_significant_one();
    let mut idx = 0;
    while bit != 0 {
        if let Some(v) = haystack.get((idx | bit) - 1) {
            if *v < needle {
                idx |= bit;
            }
        }
    }
}

Solution sketch

I propose we add the following two functions to all integer types, and all NonZeroT types:

/// Returns the input with all bits except the least significant set bit masked out. Returns 0 if the input is 0.
///
/// # Examples
/// assert_eq!(u8::least_significant_one(0b01100100), 0b00000100);
/// assert_eq!(u8::least_significant_one(0), 0);
fn least_significant_one(self) -> Self;

/// Returns the input with all bits except the most significant set bit masked out. Returns 0 if the input is 0.
///
/// # Examples
/// assert_eq!(u8::most_significant_one(0b01100100), 0b01000000);
/// assert_eq!(u8::most_significant_one(0), 0);
fn most_significant_one(self) -> Self;

While one could potentially argue that for the normal integer types they should return Option<Self>, this is undesirable. There are efficient implementations available that do not check for the zero input, and checking for zero adds extra branches that the caller almost surely does not want. It is always trivial to add a zero check yourself if the behavior at zero is not what you want, but in my experience zero-in zero-out is always what you want for these functions.

The NonZero variants never have this concern and can always return Self in any design, as there is guaranteed to be a bit set.

Alternatives

For most_significant_one there is uN::next_power_of_two defined on unsigned integers which almost-but-not-quite does what you want, and a proposal for prev_power_of_two which does what you want but not for the zero input. However these methods:

  • are unavailable for signed integers,
  • have nasty edge cases around the maximum value or 0 (they always need to output a power of two, which is nasty with zero, and they can wrap for next), and
  • have a questionable name since they don't actually provide the next/prev power of two if the input is already a power of two, making their function ambiguous/confusing for the unaware.

The least_significant_one and most_significant_one are well-defined for both signed and unsigned integers on all inputs except zero, and can trivially be extended to zero without conflicting with the "must return a power of two" requirement.

Alternative names

To get some bikeshedding out of the way, here are some alternative names:

  • least_sig_one / most_sig_one
  • lowest_one / highest_one
  • lowest_bit_set / highest_bit_set

Links and related work

@orlp orlp added api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api labels Oct 21, 2024
@orlp orlp changed the title ACP: add least_significant_one and most_significant_one to integer types and NonZero types ACP: add least_significant_one and most_significant_one to integer types and NonZero types Oct 21, 2024
@Amanieu
Copy link
Member

Amanieu commented Oct 22, 2024

We discussed this in the libs-api meeting. We're happy to accept this, but prefer the names isolate_least_significant_one and isolate_most_significant_one instead since it clearly indicates that an integer with just one bit set is returned. The previously proposed name could be confused with returning a bit index.

@Amanieu Amanieu closed this as completed Oct 22, 2024
@Amanieu Amanieu added the ACP-accepted API Change Proposal is accepted (seconded with no objections) label Oct 22, 2024
@kennytm
Copy link
Member

kennytm commented Oct 22, 2024

in that case would isolate_{most,least}_significant_bit be clear and also fitting existing nomenclature

@the8472
Copy link
Member

the8472 commented Oct 22, 2024

we considered that but it could just refer to the MSB in an int and mask that. That would be a rather useless method, but still, we're not looking for any bit but for a bit set to one.

@orlp
Copy link
Author

orlp commented Oct 22, 2024

I think isolate_{most,least}_significant_one is getting a bit on the wordy side, it would be the longest method name on integers.

We already use the terminology "high-order" bits in the docs, so I don't think high/low are ambiguous, so perhaps we could use

  • isolate_highest_one
  • isolate_lowest_one

instead? Regardless, I'm happy to see it accepted.

@cuviper
Copy link
Member

cuviper commented Oct 22, 2024

We have precedents for "leading" and "trailing" methods too -- though maybe that doesn't help because MSB would mean "skip leading zeros and return the one", which isn't really a "leading" one.

compiler-errors added a commit to compiler-errors/rust that referenced this issue Feb 22, 2025
Implement feature `isolate_most_least_significant_one` for integer types

Accepted ACP - rust-lang/libs-team#467
Tracking issue - rust-lang#136909

Implement ACP for functions that isolate the most significant set bit and least significant set bit on unsigned, signed, and `NonZero` integers.

Add function `isolate_most_significant_one`
Add function `isolate_least_significant_one`

---

This PR adds the following impls
```rust
impl {u8, u16, u32, u64, u128, usize} {
    const fn isolate_most_significant_one(self) -> Self;
    const fn isolate_least_significant_one(self) -> Self;
}
impl {i8, i16, i32, i64, i128, isize} {
    const fn isolate_most_significant_one(self) -> Self;
    const fn isolate_least_significant_one(self) -> Self;
}
impl NonZeroT {
    const fn isolate_most_significant_one(self) -> Self;
    const fn isolate_least_significant_one(self) -> Self;
}
```
Example behavior
```rust
assert_eq!(u8::isolate_most_significant_one(0b01100100), 0b01000000);
assert_eq!(u8::isolate_least_significant_one(0b01100100), 0b00000100);
```
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Feb 22, 2025
Implement feature `isolate_most_least_significant_one` for integer types

Accepted ACP - rust-lang/libs-team#467
Tracking issue - rust-lang#136909

Implement ACP for functions that isolate the most significant set bit and least significant set bit on unsigned, signed, and `NonZero` integers.

Add function `isolate_most_significant_one`
Add function `isolate_least_significant_one`

---

This PR adds the following impls
```rust
impl {u8, u16, u32, u64, u128, usize} {
    const fn isolate_most_significant_one(self) -> Self;
    const fn isolate_least_significant_one(self) -> Self;
}
impl {i8, i16, i32, i64, i128, isize} {
    const fn isolate_most_significant_one(self) -> Self;
    const fn isolate_least_significant_one(self) -> Self;
}
impl NonZeroT {
    const fn isolate_most_significant_one(self) -> Self;
    const fn isolate_least_significant_one(self) -> Self;
}
```
Example behavior
```rust
assert_eq!(u8::isolate_most_significant_one(0b01100100), 0b01000000);
assert_eq!(u8::isolate_least_significant_one(0b01100100), 0b00000100);
```
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Feb 22, 2025
Rollup merge of rust-lang#136910 - okaneco:sig_ones, r=thomcc

Implement feature `isolate_most_least_significant_one` for integer types

Accepted ACP - rust-lang/libs-team#467
Tracking issue - rust-lang#136909

Implement ACP for functions that isolate the most significant set bit and least significant set bit on unsigned, signed, and `NonZero` integers.

Add function `isolate_most_significant_one`
Add function `isolate_least_significant_one`

---

This PR adds the following impls
```rust
impl {u8, u16, u32, u64, u128, usize} {
    const fn isolate_most_significant_one(self) -> Self;
    const fn isolate_least_significant_one(self) -> Self;
}
impl {i8, i16, i32, i64, i128, isize} {
    const fn isolate_most_significant_one(self) -> Self;
    const fn isolate_least_significant_one(self) -> Self;
}
impl NonZeroT {
    const fn isolate_most_significant_one(self) -> Self;
    const fn isolate_least_significant_one(self) -> Self;
}
```
Example behavior
```rust
assert_eq!(u8::isolate_most_significant_one(0b01100100), 0b01000000);
assert_eq!(u8::isolate_least_significant_one(0b01100100), 0b00000100);
```
github-actions bot pushed a commit to tautschnig/verify-rust-std that referenced this issue Mar 11, 2025
Implement feature `isolate_most_least_significant_one` for integer types

Accepted ACP - rust-lang/libs-team#467
Tracking issue - rust-lang#136909

Implement ACP for functions that isolate the most significant set bit and least significant set bit on unsigned, signed, and `NonZero` integers.

Add function `isolate_most_significant_one`
Add function `isolate_least_significant_one`

---

This PR adds the following impls
```rust
impl {u8, u16, u32, u64, u128, usize} {
    const fn isolate_most_significant_one(self) -> Self;
    const fn isolate_least_significant_one(self) -> Self;
}
impl {i8, i16, i32, i64, i128, isize} {
    const fn isolate_most_significant_one(self) -> Self;
    const fn isolate_least_significant_one(self) -> Self;
}
impl NonZeroT {
    const fn isolate_most_significant_one(self) -> Self;
    const fn isolate_least_significant_one(self) -> Self;
}
```
Example behavior
```rust
assert_eq!(u8::isolate_most_significant_one(0b01100100), 0b01000000);
assert_eq!(u8::isolate_least_significant_one(0b01100100), 0b00000100);
```
github-actions bot pushed a commit to tautschnig/verify-rust-std that referenced this issue Mar 11, 2025
Implement feature `isolate_most_least_significant_one` for integer types

Accepted ACP - rust-lang/libs-team#467
Tracking issue - rust-lang#136909

Implement ACP for functions that isolate the most significant set bit and least significant set bit on unsigned, signed, and `NonZero` integers.

Add function `isolate_most_significant_one`
Add function `isolate_least_significant_one`

---

This PR adds the following impls
```rust
impl {u8, u16, u32, u64, u128, usize} {
    const fn isolate_most_significant_one(self) -> Self;
    const fn isolate_least_significant_one(self) -> Self;
}
impl {i8, i16, i32, i64, i128, isize} {
    const fn isolate_most_significant_one(self) -> Self;
    const fn isolate_least_significant_one(self) -> Self;
}
impl NonZeroT {
    const fn isolate_most_significant_one(self) -> Self;
    const fn isolate_least_significant_one(self) -> Self;
}
```
Example behavior
```rust
assert_eq!(u8::isolate_most_significant_one(0b01100100), 0b01000000);
assert_eq!(u8::isolate_least_significant_one(0b01100100), 0b00000100);
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ACP-accepted API Change Proposal is accepted (seconded with no objections) api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api
Projects
None yet
Development

No branches or pull requests

5 participants