Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sqrt is slower than expected #134

Open
jorisguex opened this issue Aug 3, 2024 · 2 comments
Open

sqrt is slower than expected #134

jorisguex opened this issue Aug 3, 2024 · 2 comments

Comments

@jorisguex
Copy link

jorisguex commented Aug 3, 2024

I was benchmarking the sqrt function and I noticed that is was slower than I expected:

fn main() {
    let val = bigdecimal::BigDecimal::new(2.into(), 0);
    for prec in [100, 1000, 10000, 100000] {
        let ctx = bigdecimal::Context::default().with_precision(prec.try_into().unwrap());
        let start = std::time::Instant::now();
        let _ = val.sqrt_with_context(&ctx);
        let elapsed = start.elapsed().as_secs_f32();
        println!("Precision = {prec}: {:.5} seconds ({} microseconds per digit)", elapsed, elapsed*1000000.0/prec as f32);
    }
}
> cargo run --release 2>/dev/null
Precision = 100: 0.00047 seconds (4.67834 microseconds per digit)
Precision = 1000: 0.00512 seconds (5.124834 microseconds per digit)
Precision = 10000: 0.45796 seconds (45.79616 microseconds per digit)
Precision = 100000: 50.74217 seconds (507.4217 microseconds per digit)

Compare that to Python:

import decimal
import time
for prec in [10, 100, 1000, 10000, 100000]:
    decimal.getcontext().prec = prec
    start = time.perf_counter()
    decimal.Decimal(2).sqrt()
    elapsed = time.perf_counter() - start
    print(f"Precision = {prec}: {elapsed:f} seconds ({elapsed/prec*1000000:f} microseconds per digit)")
> python3 sqrt.py
Precision = 10: 0.000009 seconds (0.866700 microseconds per digit)
Precision = 100: 0.000015 seconds (0.154170 microseconds per digit)
Precision = 1000: 0.000644 seconds (0.644208 microseconds per digit)
Precision = 10000: 0.074025 seconds (7.402471 microseconds per digit)
Precision = 100000: 0.809923 seconds (8.099229 microseconds per digit)
@akubera
Copy link
Owner

akubera commented Aug 3, 2024

I don't think that will change until I rewrite the "backend" digit code with a custom BigInt, rather than using num-bigint::BigInt, which doesn't offer much in terms of direct access to the digits, so there's lots of cloning involved.

If you're interested you could make a flamegraph of the calls involved and maybe there's some low hanging fruit that I can pick -- but I have doubts, and wont have much time to dedicate until later this month.

@jorisguex
Copy link
Author

jorisguex commented Aug 4, 2024

Hi, thanks for your response. I have done a flamegraph and ~28% of the samples are in num_bigint::biguint::multiplication::scalar_mul while ~72% are in num_bigint::biguint::division::div_rem_ref.

flamegraph

For anyone interested, I have made a fork in the meantime that speeds up sqrt by ~100x: trunk...jorisguex:bigdecimal-rs:make-sqrt-faster. It passes all of the unit tests but there may be specific edge cases where it fails (I haven't done that much testing). It also has the benefit of rounding correctly.

> cargo run --release 2>/dev/null
Precision = 100: 0.00032 seconds (3.1970801 microseconds per digit)
Precision = 1000: 0.00056 seconds (0.55525 microseconds per digit)
Precision = 10000: 0.00428 seconds (0.4282583 microseconds per digit)
Precision = 100000: 0.49238 seconds (4.923846 microseconds per digit)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants