Tracking Issue for bigint helper methods #85532

Open
6 of 10 tasks
clarfonthey opened this issue May 21, 2021 · 90 comments
Labels
C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@clarfonthey
Contributor

clarfonthey commented May 21, 2021

Feature gate: #![feature(bigint_helper_methods)]

This is a tracking issue for the following methods on integers:

  • carrying_add
  • borrowing_sub
  • carrying_mul
  • carrying_mul_add
  • widening_mul

These methods are intended to help centralise the effort required for creating efficient big integer implementations, by offering a few methods which would otherwise require special compiler intrinsics or custom assembly code in order to do efficiently. On their own, they do not constitute big integer implementations, but they are necessary building blocks for a larger implementation.

Public API

// On unsigned integers:

/// `self + rhs + carry` (full adder)
const fn carrying_add(self, rhs: Self, carry: bool) -> (Self, bool);

/// `self - rhs - carry` (full "subtractor")
const fn borrowing_sub(self, rhs: Self, carry: bool) -> (Self, bool);

/// `self * rhs + carry` (multiply-accumulate)
const fn carrying_mul(self, rhs: Self, carry: Self) -> (Self, Self);

/// `self * rhs + addend + carry` (multiply-accumulate-carry)
const fn carrying_mul_add(self, rhs: Self, addend: Self, carry: Self) -> (Self, Self);

/// `self * rhs` (wide multiplication, same as `self.carrying_mul(rhs, 0)`)
const fn widening_mul(self, rhs: Self) -> (Self, Self);


// On signed integers:

/// `self + rhs + carry` (full adder)
const fn carrying_add(self, rhs: Self, carry: bool) -> (Self, bool);

/// `self - rhs - carry` (full "subtractor")
const fn borrowing_sub(self, rhs: Self, carry: bool) -> (Self, bool);
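A minimal sketch of how the carry output is meant to chain across limbs. It assumes little-endian `u64` limbs, and the free function `carrying_add` is a hypothetical stand-in written with stable `overflowing_add`, not the feature-gated method itself; `add_limbs` is likewise an illustrative name.

```rust
/// Stand-in for the proposed `u64::carrying_add`: computes `a + b + carry`
/// and reports whether the sum overflowed 64 bits.
fn carrying_add(a: u64, b: u64, carry: bool) -> (u64, bool) {
    let (sum, c1) = a.overflowing_add(b);
    let (sum, c2) = sum.overflowing_add(carry as u64);
    // At most one of the two additions can overflow.
    (sum, c1 | c2)
}

/// Adds two little-endian limb slices of equal length, returning the
/// limb-wise sum and the final carry out.
fn add_limbs(a: &[u64], b: &[u64]) -> (Vec<u64>, bool) {
    assert_eq!(a.len(), b.len());
    let mut out = Vec::with_capacity(a.len());
    let mut carry = false;
    for (&x, &y) in a.iter().zip(b) {
        // The carry out of one limb feeds the carry in of the next.
        let (sum, c) = carrying_add(x, y, carry);
        out.push(sum);
        carry = c;
    }
    (out, carry)
}
```

The caller decides what to do with the final carry, for example pushing an extra limb or treating it as overflow.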

Steps / History

Unresolved Questions

  • Should these be implemented using compiler intrinsics? LLVM currently has no equivalents, so, we'd have to custom-build some.
  • Should an alternative API be provided for widening_mul that simply returns the next-larger type? What would we do for u128/i128?
  • What should the behaviour be for signed integers? Should there be implementations for signed integers at all?
  • Is the "borrowing" terminology worth it for subtraction, or should we simply call that "carrying" as well for consistency?
  • Are there other methods that should be added in addition to the existing ones?
@clarfonthey clarfonthey added C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels May 21, 2021
@leonardo-m

Are there other methods that should be added in addition to the existing ones?

I'd like a mul_mod, as shown in #85017, because I think you can't implement it efficiently without asm and it's a basic building block for power_mod and other things.
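To pin down what such a method would compute, here is a rough sketch for `u64` only. The name and signature are assumptions, and it simply widens to `u128` and uses a 128-bit remainder, which is the kind of slow path the request wants to avoid; it says nothing about how an efficient version would be built.

```rust
/// Hypothetical `mul_mod` for `u64`: computes `(a * b) % m` without
/// losing the high bits of the product. Panics if `m == 0`.
fn mul_mod(a: u64, b: u64, m: u64) -> u64 {
    // The full product of two u64 values always fits in a u128.
    let wide = (a as u128) * (b as u128);
    (wide % (m as u128)) as u64
}
```

The same trick is not available for `u128`, since there is no wider primitive type to widen into.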

@clarfonthey
Contributor Author

clarfonthey commented May 21, 2021

Another set of methods that could be useful that I'll probably offer implementations for at some point:

/// `(self << rhs) | carry`
fn carrying_shl(self, rhs: u32, carry: Self) -> (Self, Self); 

/// `(self >> rhs) | carry`
fn borrowing_shr(self, rhs: u32, carry: Self) -> (Self, Self);

/// `self << rhs`
fn widening_shl(self, rhs: u32) -> (Self, Self);

/// `self >> rhs`
fn widening_shr(self, rhs: u32) -> (Self, Self);

Essentially, return the two halves of a rotation, i.e. x.widening_shl(y) is the same as (x << y, x >> (BITS - y)) and similarly for widening_shr. Not sure whether they should allow rhs == BITS or not, but presumably they wouldn't for consistency with existing shift methods.
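A small sketch of the "two halves of a rotation" idea for `u32`. The function name follows the proposal above but the body is only an assumption about the intended semantics; it rejects `rhs == 0` and `rhs == BITS` rather than deciding what they should mean.

```rust
/// Hypothetical `widening_shl` for `u32`: returns `(low, high)`, where `low`
/// is the usual shifted value and `high` holds the bits that were shifted out.
/// Requires `0 < rhs < u32::BITS`, because `x >> 32` is not a valid shift.
fn widening_shl(x: u32, rhs: u32) -> (u32, u32) {
    assert!(rhs > 0 && rhs < u32::BITS);
    (x << rhs, x >> (u32::BITS - rhs))
}
```

OR-ing the two halves together gives the same value as `x.rotate_left(rhs)`.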

@clarfonthey
Contributor Author

From @scottmcm in the original PR:

Some prior art I happened upon: https://docs.rs/cranelift-codegen/0.74.0/cranelift_codegen/ir/trait.InstBuilder.html#method.isub_bin

Same as isub with an additional borrow flag input. Computes:

   a = x - (y + b_{in}) \pmod{2^B}

@photino

photino commented Sep 7, 2021

Why don't we add carrying_mul and widening_mul for i128/u128 as well?

@clarfonthey
Contributor Author

Mostly effort implementing them efficiently. In the meantime, you can do it with four calls to the u64 version. Or three if you want to be fancy.

@RalfJung
Member

RalfJung commented Sep 9, 2021

fn borrowing_sub(self, rhs: Self, carry: bool) -> (Self, bool);

I was very confused by this function name at first, since borrowing in Rust usually refers to references. I am not a native speaker, but I do formal mathematical work in English professionally, and yet I never before heard the term "borrowing" in the context of subtraction. So I think this, at least, needs some explanation in the docs. (I would have expected something like carrying_sub, but maybe that is nonsense for a native speaker.)

The current docs for some of the other methods could probably also be improved: they talk about not having the "ability to overflow", which makes it sound like not overflowing is a bad thing.

@AaronKutch
Contributor

The word borrow here comes from the terminology for a full subtractor. I am thinking that maybe the borrowing_sub function could be removed altogether. The same effect that borrowing_sub has can be obtained from carrying_add by making the first carrying_add in the chain have a set carry bit, and then bitnot every rhs. This fact could be put in the documentation of carrying_add.
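The identity described here can be sketched as follows, assuming little-endian `u64` limbs. Both function names are hypothetical, and `carrying_add` is a stable stand-in for the feature-gated method.

```rust
/// Stand-in for `u64::carrying_add`: `a + b + carry` with carry out.
fn carrying_add(a: u64, b: u64, carry: bool) -> (u64, bool) {
    let (sum, c1) = a.overflowing_add(b);
    let (sum, c2) = sum.overflowing_add(carry as u64);
    (sum, c1 | c2)
}

/// Multi-limb `a - b` built only from additions: start the chain with the
/// carry set and add the bitwise NOT of each `b` limb.
/// Returns the difference and `true` if the subtraction borrowed overall.
fn sub_limbs_via_add(a: &[u64], b: &[u64]) -> (Vec<u64>, bool) {
    assert_eq!(a.len(), b.len());
    let mut out = Vec::with_capacity(a.len());
    // a + !b + 1 == a - b (mod 2^64), so the first carry in is 1.
    let mut carry = true;
    for (&x, &y) in a.iter().zip(b) {
        let (diff, c) = carrying_add(x, !y, carry);
        out.push(diff);
        carry = c;
    }
    // A final carry of 1 means "no borrow", so the borrow is its negation.
    (out, !carry)
}
```

Note that the chained flag has the opposite sense of a borrow until the very end, which is one reason a dedicated subtraction method may still be easier to use correctly.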

@clarfonthey
Contributor Author

The word borrow here comes from the terminology for a full subtractor. I am thinking that maybe the borrowing_sub function could be removed altogether. The same effect that borrowing_sub has can be obtained from carrying_add by making the first carrying_add in the chain have a set carry bit, and then bitnot every rhs. This fact could be put in the documentation of carrying_add.

Considering how the primary goal of these methods is to be as efficient as possible, usually optimising down to a single instruction, I don't think it'd be reasonable to just get rid of subtraction in favour of telling everyone to use addition instead. Definitely open to changing the name, though.

@AaronKutch
Contributor

AaronKutch commented Sep 9, 2021

These helper methods will not be very useful to me unless they are implemented for every kind of integer. Here is an implementation for a widening multiplication-addition for u128:

/// Extended multiply-addition of `(lhs * rhs) + add`. The result is returned as a tuple of the wrapping part and the
/// overflow part. No numerical overflow is possible even if all three arguments are set to their max values.
pub const fn widen_mul_add(lhs: u128, rhs: u128, add: u128) -> (u128, u128) {
    //                       [rhs_hi]  [rhs_lo]
    //                       [lhs_hi]  [lhs_lo]
    //                     X___________________
    //                       [------tmp0------]
    //             [------tmp1------]
    //             [------tmp2------]
    //     [------tmp3------]
    //                       [-------add------]
    // +_______________________________________
    //                       [------sum0------]
    //     [------sum1------]

    let lhs_lo = lhs as u64;
    let rhs_lo = rhs as u64;
    let lhs_hi = (lhs.wrapping_shr(64)) as u64;
    let rhs_hi = (rhs.wrapping_shr(64)) as u64;
    let tmp0 = (lhs_lo as u128).wrapping_mul(rhs_lo as u128);
    let tmp1 = (lhs_lo as u128).wrapping_mul(rhs_hi as u128);
    let tmp2 = (lhs_hi as u128).wrapping_mul(rhs_lo as u128);
    let tmp3 = (lhs_hi as u128).wrapping_mul(rhs_hi as u128);
    // tmp1 and tmp2 straddle the boundary. We have to handle three carries
    let (sum0, carry0) = tmp0.overflowing_add(tmp1.wrapping_shl(64));
    let (sum0, carry1) = sum0.overflowing_add(tmp2.wrapping_shl(64));
    let (sum0, carry2) = sum0.overflowing_add(add as u128);
    let sum1 = tmp3
        .wrapping_add(tmp1.wrapping_shr(64))
        .wrapping_add(tmp2.wrapping_shr(64))
        .wrapping_add(carry0 as u128)
        .wrapping_add(carry1 as u128)
        .wrapping_add(carry2 as u128);
    (sum0, sum1)
}

I have tested this with my crate awint.

edit: There is a version of this that uses the Karatsuba trick to use 3 multiplications instead of 4, but it incurs extra summations, branches, and is not as parallel. For typical desktop processors the above should be the fastest.

@clarfonthey
Contributor Author

I would make a PR for that.

@AaronKutch
Contributor

Some alternative signatures include u128::widen_mul_add(lhs, rhs, add), lhs.widen_mul_add(rhs, add), or add.widen_mul_add(lhs, rhs). In awint my general purpose mul-add function is mul_add_triop which uses the third signature but takes self mutably and add-assigns lhs * rhs. I'm not sure which is best.

@AaronKutch
Contributor

I would also change up the documentation headers for the carrying_add function to say

Extended addition of `self + rhs + carry`. The booleans are interpreted as a single bit
integer of value 0 or 1. If unsigned overflow occurs, then the boolean in the tuple
returns 1. The output carry can be chained into the input carry of another carrying add,
which allows for arbitrarily large additions to be calculated.

I specifically note unsigned overflow, because that happens for both signed and unsigned
integers because of how two's complement works.

@AaronKutch
Contributor

borrowing_sub should be left in with its naming, but its documentation could be

Extended subtraction of `self - rhs - borrow`. The "borrowing" here refers to borrowing in the full subtractor sense.
The booleans are interpreted as a single bit integer of value 0 or 1. If unsigned overflow occurs, then the boolean
in the tuple returns 1. The output carry can be chained into the input carry of another borrowing subtract,
which allows for arbitrarily large subtraction to be calculated.

@tspiteri
Contributor

tspiteri commented Sep 10, 2021

I specifically note unsigned overflow, because that happens for both signed and unsigned
integers because of how two's complement works.

But unsigned overflow and signed overflow are different. For example, on x86_64, while unsigned and signed integers share addition and subtraction instructions, unsigned overflow is detected using the carry flag while signed overflow is detected using the overflow flag.

As a concrete example: 127i8 + 1 causes signed overflow but not unsigned overflow. So the carry flag should be false/0.

Edit: I think I had misread your comment and thought the middle part of your comment was the current doc, not your suggestion, so it looks like I completely misinterpreted your final comment.

@AaronKutch
Contributor

Yes, signed and unsigned overflow are different, but carrying_add as implemented for both unsigned and signed integers uses unsigned overflow because of how two's complement carrying works. Someone using i64::carrying_add might think that the carry-out bit follows that of i64::overflowing_add when in actuality it follows u64::overflowing_add. So in the documentation I would put emphasis on _unsigned_ overflow.
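The difference between the two flags can be checked on `i8` with stable methods only. The helper names below are made up for illustration; the carry is computed by reinterpreting the operands as `u8`, which matches the "unsigned overflow" reading discussed above.

```rust
/// Carry out of an 8-bit addition, computed on the bit patterns: this is
/// the unsigned overflow flag, regardless of how the bits are interpreted.
fn carry_out_i8(a: i8, b: i8) -> bool {
    (a as u8).overflowing_add(b as u8).1
}

/// Signed overflow flag of the same addition.
fn signed_overflow_i8(a: i8, b: i8) -> bool {
    a.overflowing_add(b).1
}
```

For `127 + 1` the signed flag is set and the carry is clear, while for `-1 + 1` it is the other way round.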

@clarfonthey
Contributor Author

clarfonthey commented Sep 14, 2021

I think all of these are good suggestions, and as mentioned earlier, these changes definitely should go in a PR if you have the time. I think one important thing to note is that so far the APIs here seem good, but the documentation definitely could use some work. Although if there's a bigger case for changing the subtraction behaviour to be more in line with what's expected (the existing behaviour is mostly modelled after the x86 instructions adc and sbb), then I'm for that.

That said, the main goal is to make it relatively painless to write correct code that compiles down to the right instructions in release mode, so, I would say we should make sure that happens regardless of what's done. I would have added an explicit test for that but I honestly don't know how.

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Nov 4, 2021
…riplett

Add more text and examples to `carrying_{add|mul}`

`feature(bigint_helper_methods)` tracking issue rust-lang#85532

cc `@clarfonthey`
geky added a commit to geky/gf256 that referenced this issue Nov 8, 2021
Multiplication, and carry-less multiplication, are inherently a widening
operation. Unfortunately, at the time of writing, the types in Rust
don't capture this well, being built around fixed-width wrapping
multiplication.

Rust's stdlib can rely on compiler-level optimizations to clean up
performance issues from unnecessarily-wide multiplications, but this
becomes a bit of an issue for our library, especially for u64 types,
since we rely on intrinsics, which may be hard for compilers to
optimize around.

This commit adds widening_mul, based on a proposal to add widening_mul
to Rust's primitive types:
rust-lang/rust#85532

As well as several other tweaks to how xmul is provided, moving more
arch-level details into xmul, but still limiting when it is emitted.
@TDecking
Contributor

It turns out that rustc was able to optimize the current implementation of carrying_add/borrowing_sub perfectly in previous versions, but it has regressed since. Rust 1.82 added another regression in which the use of these functions inside a standard bignum addition loop is now worse when compared to a version using architecture intrinsics.

https://godbolt.org/z/hK7M37Y85

@scottmcm
Member

Opened #133663 to add an intrinsic for wide_mul and carrying_mul, including adding u128 support for both.

@scottmcm
Member

scottmcm commented Nov 30, 2024

is now worse when compared to a version using architecture intrinsics.

Replace || with | and it'll work great again: https://godbolt.org/z/sv5xxGTMr

(Classic problem that || is worse because it's control flow.)

Fixed in #133674
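For readers who want to see the two spellings side by side, here is a sketch on `u8` with made-up function names. It only demonstrates that the results are identical; the codegen difference reported above is a property of the compiler version and is not something this snippet can show.

```rust
/// Carry flags combined with the non-short-circuiting `|`.
fn carrying_add_bitor(a: u8, b: u8, carry: bool) -> (u8, bool) {
    let (sum, c1) = a.overflowing_add(b);
    let (sum, c2) = sum.overflowing_add(carry as u8);
    (sum, c1 | c2)
}

/// Same computation with the short-circuiting `||`, which introduces
/// control flow at the source level.
fn carrying_add_lazy_or(a: u8, b: u8, carry: bool) -> (u8, bool) {
    let (sum, c1) = a.overflowing_add(b);
    let (sum, c2) = sum.overflowing_add(carry as u8);
    (sum, c1 || c2)
}

/// Returns true if both variants agree on every possible `u8` input.
fn variants_agree() -> bool {
    (0..=u8::MAX).all(|a| {
        (0..=u8::MAX).all(|b| {
            [false, true]
                .iter()
                .all(|&c| carrying_add_bitor(a, b, c) == carrying_add_lazy_or(a, b, c))
        })
    })
}
```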

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Dec 1, 2024
…nieu

Fix chaining `carrying_add`s

Something about the MIR lowering for `||` ended up breaking this, but it's fixed by changing the code to use `|` instead.

I also added an assembly test to ensure it *keeps* being [`adc`](https://www.felixcloutier.com/x86/adc).

cc rust-lang#85532 (comment), which noticed this.
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Dec 1, 2024
Rollup merge of rust-lang#133674 - scottmcm:chain-carrying-add, r=Amanieu

Fix chaining `carrying_add`s

Something about the MIR lowering for `||` ended up breaking this, but it's fixed by changing the code to use `|` instead.

I also added an assembly test to ensure it *keeps* being [`adc`](https://www.felixcloutier.com/x86/adc).

cc rust-lang#85532 (comment), which noticed this.
bors added a commit to rust-lang-ci/rust that referenced this issue Dec 27, 2024
Add a compiler intrinsic to back `bigint_helper_methods`

cc rust-lang#85532

This adds a new `carrying_mul_add` intrinsic, to implement `wide_mul` and `carrying_mul`.

It has fallback MIR for all types -- including `u128`, which isn't currently supported on nightly -- so that it'll continue to work on all backends, including CTFE.

Then it's overridden in `cg_llvm` to use wider intermediate types, including `i256` for `u128::carrying_mul`.
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Dec 27, 2024
Rollup merge of rust-lang#133663 - scottmcm:carrying_mul_add, r=Amanieu

Add a compiler intrinsic to back `bigint_helper_methods`

cc rust-lang#85532

This adds a new `carrying_mul_add` intrinsic, to implement `wide_mul` and `carrying_mul`.

It has fallback MIR for all types -- including `u128`, which isn't currently supported on nightly -- so that it'll continue to work on all backends, including CTFE.

Then it's overridden in `cg_llvm` to use wider intermediate types, including `i256` for `u128::carrying_mul`.
bors added a commit to rust-lang-ci/rust that referenced this issue Dec 31, 2024
Tidy up bigint multiplication methods

This tidies up the library version of the bigint multiplication methods after the addition of the intrinsics in rust-lang#133663. It follows [this summary](rust-lang#85532 (comment)) of what's desired for these methods.

Note that, if `2H = N`, then `uH::MAX * uH::MAX + uH::MAX + uH::MAX` is `uN::MAX`, and that we can effectively add two "carry" values without overflowing.
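As a quick sanity check of that identity, the sketch below evaluates it for `H = 8` and `H = 64` using the next-wider type. The function names are illustrative only. Algebraically, `(2^H - 1)^2 + 2 * (2^H - 1) = 2^(2H) - 1`.

```rust
/// `u8::MAX * u8::MAX + u8::MAX + u8::MAX`, evaluated in `u16`.
fn max_case_u8() -> u16 {
    let m = u8::MAX as u16;
    m * m + m + m
}

/// `u64::MAX * u64::MAX + u64::MAX + u64::MAX`, evaluated in `u128`.
fn max_case_u64() -> u128 {
    let m = u64::MAX as u128;
    m * m + m + m
}
```

Both results land exactly on the maximum of the wider type, so neither expression overflows even with overflow checks enabled.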

For ease of terminology, the "low-order" or "least significant" or "wrapping" half of multiplication will be called the low part, and the "high-order" or "most significant" or "overflowing" half of multiplication will be called the high part. In all cases, the return convention is `(low, high)` and left unchanged by this PR, to be litigated later.

## API Changes

The original API:

```rust
impl uN {
    // computes self * rhs
    pub const fn widening_mul(self, rhs: uN) -> (uN, uN);

    // computes self * rhs + carry
    pub const fn carrying_mul(self, rhs: uN, carry: uN) -> (uN, uN);
}
```

The added API:

```rust
impl uN {
    // computes self * rhs + carry + add
    pub const fn carrying2_mul(self, rhs: uN, carry: uN, add: uN) -> (uN, uN);
}
impl iN {
    // note that the low part is unsigned
    pub const fn widening_mul(self, rhs: iN) -> (uN, iN);
    pub const fn carrying_mul(self, rhs: iN, carry: iN) -> (uN, iN);
    pub const fn carrying_mul_add(self, rhs: iN, carry: iN, add: iN) -> (uN, iN);
}
```

Additionally, a naive implementation has been added for `u128` and `i128` since there are no double-wide types for those. Eventually, an intrinsic will be added to make these more efficient, but rather than doing this all at once, the library changes are added first.

## Justifications for API

The unsigned parts are done to ensure consistency with overflowing addition: for a two's complement integer, you want to have unsigned overflow semantics for all parts of the integer except the highest one. This is because overflow for unsigned integers happens on the highest bit (from `MAX` to zero), whereas overflow for signed integers happens on the second highest bit (from `MAX` to `MIN`). Since the sign information only matters in the highest part, we use unsigned overflow for everything but that part.
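A sketch of what the `(uN, iN)` return shape means in practice, shown for `i8` by widening to `i16`. The function name is hypothetical and this is not the library implementation, only an illustration of the split.

```rust
/// Hypothetical signed widening multiplication for `i8`, returning
/// `(low, high)` with an unsigned low half and a signed high half.
fn widening_mul_i8(a: i8, b: i8) -> (u8, i8) {
    // The product of two i8 values always fits in an i16.
    let wide = (a as i16) * (b as i16);
    // The low half is the raw low byte; the high half keeps the sign
    // because `>>` on a signed integer is an arithmetic shift.
    (wide as u8, (wide >> 8) as i8)
}
```

The full product can be recovered as `high * 256 + low`, with `high` sign-extended and `low` zero-extended.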

There is still discussion on the merits of signed bigint *addition* methods, since getting the behaviour right is very subtle, but at least for signed bigint *multiplication*, the sign of the operands does make a difference. So, it feels appropriate that at least until we've nailed down the final API, there should be an option to do signed versions of these methods.

Additionally, while it's unclear whether we need all three versions of bigint multiplication (widening, carrying-1, and carrying-2), since it's possible to have up to two carries without overflow, there should at least be a method to allow that. We could potentially only offer the carry-2 method and expect that adding zero carries afterward will optimise correctly, but again, this can be litigated before stabilisation.

## Note on documentation

While a lot of care was put into the documentation for the `widening_mul` and `carrying_mul` methods on unsigned integers, I have not taken this same care for `carrying_mul_add` or the signed versions. While I have updated the doc tests to be more appropriate, there will likely be many documentation changes done before stabilisation.

## Note on tests

Alongside this change, I've added several tests to ensure that these methods work as expected. These are alongside the codegen tests for the intrinsics.
@scottmcm
Member

A nice thing I noticed while writing the demo in #135750:

Manually writing naïve quadratic multiplication of a u128 with u64::carrying_mul_add

#[no_mangle]
pub fn quadratic_mul(a: u128, b: u128) -> u128 {
    const N: usize = 2;
    let a: [u64; N] = unsafe { std::mem::transmute(a) };
    let b: [u64; N] = unsafe { std::mem::transmute(b) };
    let mut out = [0; N];
    for j in 0..N {
        let mut carry = 0;
        for i in 0..(N - j) {
            (out[j + i], carry) = u64::carrying_mul_add(a[i], b[j], out[j + i], carry);
        }
    }
    unsafe { std::mem::transmute(out) }
}

gives this assembly:

quadratic_mul:
        mov     r8, rdx
        mov     rax, rdx
        mul     rdi
        imul    r8, rsi
        add     rdx, r8
        imul    rcx, rdi
        add     rdx, rcx
        ret

Which is essentially identical to what you get from using u128 multiplication directly

ordinary_mul_128:
        mov     r8, rdx
        mov     rax, rdx
        mul     rdi
        imul    rsi, r8
        add     rdx, rsi
        imul    rcx, rdi
        add     rdx, rcx
        ret

https://rust.godbolt.org/z/xEqajMc8b
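For trying the same loop without the feature gate, here is a variant sketch. It assumes a free function `carrying_mul_add` as a stand-in that widens to `u128`, and it splits the operands with shifts instead of `transmute`, so it does not depend on endianness. It is meant for checking the arithmetic, not for reproducing the assembly above.

```rust
/// Stand-in for `u64::carrying_mul_add`: computes `a * b + add + carry`
/// and returns it as `(low, high)`. The sum cannot overflow 128 bits.
fn carrying_mul_add(a: u64, b: u64, add: u64, carry: u64) -> (u64, u64) {
    let wide = (a as u128) * (b as u128) + (add as u128) + (carry as u128);
    (wide as u64, (wide >> 64) as u64)
}

/// Wrapping 128-bit multiplication using two 64-bit limbs per operand,
/// least significant limb first.
fn quadratic_mul(a: u128, b: u128) -> u128 {
    const N: usize = 2;
    let a = [a as u64, (a >> 64) as u64];
    let b = [b as u64, (b >> 64) as u64];
    let mut out = [0u64; N];
    for j in 0..N {
        let mut carry = 0;
        // Partial products that would land above limb N - 1 are skipped,
        // which is what makes the result wrapping.
        for i in 0..(N - j) {
            (out[j + i], carry) = carrying_mul_add(a[i], b[j], out[j + i], carry);
        }
    }
    (out[0] as u128) | ((out[1] as u128) << 64)
}
```

The result should agree with `u128::wrapping_mul` for all inputs.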

@programmerjake
Member

Which is essentially identical to what you get from using u128 multiplication directly

that's because LLVM expands wide multiplications into a bunch of word-sized operations using nearly the same algorithm: https://github.com/llvm/llvm-project/blob/84c89d0aa4beff4a4d6c36eda125278c48e41128/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp#L10822

@scottmcm
Member

It absolutely makes sense, I'm just glad to see that we're emitting something good enough that the extra steps optimize away.

jhpratt added a commit to jhpratt/rust that referenced this issue Jan 21, 2025
Add an example of using `carrying_mul_add` to write wider multiplication

Just the basic quadratic version that you wouldn't actually use for really-big integers, but it's nice and short so it's useful as a demonstration of why you might find `carrying_mul_add` useful :)

cc rust-lang#85532 `@clarfonthey`
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Jan 22, 2025
Rollup merge of rust-lang#135750 - scottmcm:cma-example, r=cuviper

Add an example of using `carrying_mul_add` to write wider multiplication

Just the basic quadratic version that you wouldn't actually use for really-big integers, but it's nice and short so it's useful as a demonstration of why you might find `carrying_mul_add` useful :)

cc rust-lang#85532 `@clarfonthey`