Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rearrange the pipeline of pow to gain efficiency #74367

Merged
merged 1 commit into from
Jul 24, 2020
Merged

Rearrange the pipeline of pow to gain efficiency #74367

merged 1 commit into from
Jul 24, 2020

Conversation

Neutron3529
Copy link
Contributor

The check of the exp parameter seems useless if we execute the while-loop more than once.
The original implementation of pow function using one more comparison if the exp==0 and may break the pipeline of the cpu, which may generate a slower code.
The performance gap between the old and the new implementation may be small, but IMO, at least the newer one looks more beautiful.


bench prog:

#![feature(test)]
extern crate test;
#[macro_export]macro_rules! timing{
($a:expr)=>{let time=std::time::Instant::now();{$a;}print!("{:?} ",time.elapsed())};
($a:expr,$b:literal)=>{let time=std::time::Instant::now();let mut a=0;for _ in 0..$b{a^=$a;}print!("{:?} {} ",time.elapsed(),a)}
}
#[inline]
pub fn pow_rust(x:i64, mut exp: u32) -> i64 {
    let mut base = x;
    let mut acc = 1;
    while exp > 1 {
        if (exp & 1) == 1 {
            acc = acc * base;
        }
        exp /= 2;
        base = base * base;
    }
    if exp == 1 {
        acc = acc * base;
    }
    acc
}
#[inline]
pub fn pow_new(x:i64, mut exp: u32) -> i64 {
    if exp==0{
        1
    }else{
        let mut base = x;
        let mut acc = 1;
        while exp > 1 {
            if (exp & 1) == 1 {
                acc = acc * base;
            }
            exp >>= 1;
            base = base * base;
        }
        acc * base
    }
}

fn main(){
let a=2i64;
let b=1_u32;
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
}

bench in my laptop:

neutron@Neutron:/me/rust$ rc commit.rs
rustc commit.rs  && ./commit

3.978419716s 0 4.079765171s 0 3.964630622s 0 
3.997127013s 0 4.260304804s 0 3.997638211s 0 
3.963195544s 0 4.11657718s 0 4.176054164s 0 
3.830128579s 0 3.980396122s 0 3.937258567s 0 
3.986055948s 0 4.127804162s 0 4.018943411s 0 
4.185568857s 0 4.217512517s 0 3.98313603s 0 
3.863018225s 0 4.030447988s 0 3.694878237s 0 
4.206987927s 0 4.137608047s 0 4.115564664s 0 
neutron@Neutron:/me/rust$ rc commit.rs -O
rustc commit.rs -O && ./commit

162.111993ms 0 165.107125ms 0 166.26924ms 0 
175.20479ms 0 205.062565ms 0 176.278791ms 0 
174.408975ms 0 166.526899ms 0 201.857604ms 0 
146.190062ms 0 168.592821ms 0 154.61411ms 0 
199.678912ms 0 168.411598ms 0 162.129996ms 0 
147.420765ms 0 209.759326ms 0 154.807907ms 0 
165.507134ms 0 188.476239ms 0 157.351524ms 0 
121.320123ms 0 126.401229ms 0 114.86428ms 0 

@rust-highfive
Copy link
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @withoutboats (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 15, 2020
@pickfire
Copy link
Contributor

The pow in musl looks interesting not needing to use a while loop https://github.com/EOSIO/musl/blob/eosio/src/math/pow.c

@Neutron3529
Copy link
Contributor Author

The pow in musl looks interesting not needing to use a while loop https://github.com/EOSIO/musl/blob/eosio/src/math/pow.c

That pow is a conversion of the equation a^b=exp(b*log(a)), since the b should be small when a is an integer(or otherwise a overflow may take place), it is not very necessary using exp or log to calculate the pow.

In addition, the pow of i64 or u64 cannot be calculate by f64 (will loss accuracy), and the only implementation of wrapping_mul should be the one implement here.

@Neutron3529 Neutron3529 marked this pull request as draft July 16, 2020 07:01
@Neutron3529 Neutron3529 marked this pull request as ready for review July 16, 2020 09:38
@nagisa
Copy link
Member

nagisa commented Jul 18, 2020

It took me a while to understand the benchmark results, but from what I can tell the results are all awfully close to being within error margin? Can the benchmarks be done with something like e.g. criterion so that we get more information than just the sum?

@Neutron3529
Copy link
Contributor Author

It took me a while to understand the benchmark results, but from what I can tell the results are all awfully close to being within error margin? Can the benchmarks be done with something like e.g. criterion so that we get more information than just the sum?

new bench is here.
results here are quite close since I'd only done a really small modification

I guess the newer program may have a better pipeline, and the benchmark result seems to prove that.

BTW, I don't know how to set a criterion, but there are someone strongly suggests me to using a bencher.

neutron@Neutron:/me/rust/bench$ cargo bench
   Compiling bench v0.1.0 (/me/rust/bench)
    Finished bench [optimized] target(s) in 1.55s
     Running target/release/deps/bench-25133398f204ef70

running 6 tests
test tests::new_xpow0  ... bench:           1 ns/iter (+/- 0)
test tests::new_xpow1  ... bench:           1 ns/iter (+/- 0)
test tests::new_xpow3  ... bench:           2 ns/iter (+/- 1)
test tests::rust_xpow0 ... bench:           1 ns/iter (+/- 0)
test tests::rust_xpow1 ... bench:           1 ns/iter (+/- 0)
test tests::rust_xpow3 ... bench:           2 ns/iter (+/- 1)

test result: ok. 0 passed; 0 failed; 0 ignored; 6 measured; 0 filtered out

neutron@Neutron:/me/rust/bench$ cargo bench
   Compiling bench v0.1.0 (/me/rust/bench)
    Finished bench [optimized] target(s) in 0.66s
     Running target/release/deps/bench-25133398f204ef70

running 8 tests
test tests::new_xpow0     ... bench:           1 ns/iter (+/- 0)
test tests::new_xpow1     ... bench:           1 ns/iter (+/- 0)
test tests::new_xpow3     ... bench:           1 ns/iter (+/- 0)
test tests::new_xpow3123  ... bench:           9 ns/iter (+/- 2)
test tests::rust_xpow0    ... bench:           1 ns/iter (+/- 0)
test tests::rust_xpow1    ... bench:           1 ns/iter (+/- 0)
test tests::rust_xpow3    ... bench:           2 ns/iter (+/- 0)
test tests::rust_xpow3123 ... bench:           9 ns/iter (+/- 2)

test result: ok. 0 passed; 0 failed; 0 ignored; 8 measured; 0 filtered out

neutron@Neutron:/me/rust/bench$ cargo bench
   Compiling bench v0.1.0 (/me/rust/bench)
    Finished bench [optimized] target(s) in 0.80s
     Running target/release/deps/bench-25133398f204ef70

running 10 tests
test tests::new_xpow0     ... bench:           1 ns/iter (+/- 0)
test tests::new_xpow1     ... bench:           0 ns/iter (+/- 0)
test tests::new_xpow3     ... bench:           2 ns/iter (+/- 1)
test tests::new_xpow3123  ... bench:          10 ns/iter (+/- 3)
test tests::new_xpow7     ... bench:           2 ns/iter (+/- 0)
test tests::rust_xpow0    ... bench:           1 ns/iter (+/- 0)
test tests::rust_xpow1    ... bench:           1 ns/iter (+/- 0)
test tests::rust_xpow3    ... bench:           2 ns/iter (+/- 0)
test tests::rust_xpow3123 ... bench:          10 ns/iter (+/- 2)
test tests::rust_xpow7    ... bench:           3 ns/iter (+/- 2)

test result: ok. 0 passed; 0 failed; 0 ignored; 10 measured; 0 filtered out

neutron@Neutron:/me/rust/bench$ cargo bench
    Finished bench [optimized] target(s) in 0.00s
     Running target/release/deps/bench-25133398f204ef70

running 10 tests
test tests::new_xpow0     ... bench:           1 ns/iter (+/- 0)
test tests::new_xpow1     ... bench:           1 ns/iter (+/- 0)
test tests::new_xpow3     ... bench:           2 ns/iter (+/- 1)
test tests::new_xpow3123  ... bench:           9 ns/iter (+/- 2)
test tests::new_xpow7     ... bench:           2 ns/iter (+/- 0)
test tests::rust_xpow0    ... bench:           1 ns/iter (+/- 0)
test tests::rust_xpow1    ... bench:           1 ns/iter (+/- 0)
test tests::rust_xpow3    ... bench:           2 ns/iter (+/- 0)
test tests::rust_xpow3123 ... bench:          10 ns/iter (+/- 4)
test tests::rust_xpow7    ... bench:           3 ns/iter (+/- 1)

test result: ok. 0 passed; 0 failed; 0 ignored; 10 measured; 0 filtered out

lib.rs:

#![feature(test)]
extern crate test;
#[inline]
pub fn pow_rust(x:i64, mut exp: u32) -> i64 {
    let mut base = x;
    let mut acc = 1;
    while exp > 1 {
        if (exp & 1) == 1 {
            acc = acc * base;
        }
        exp /= 2;
        base = base * base;
    }
    if exp == 1 {
        acc = acc * base;
    }
    acc
}
#[inline]
pub fn pow_new(x:i64, mut exp: u32) -> i64 {
    if exp==0{
        1
    }else{
        let mut base = x;
        let mut acc = 1;
        while exp > 1 {
            if (exp & 1) == 1 {
                acc = acc * base;
            }
            exp >>= 1;
            base = base * base;
        }
        acc * base
    }
}

#[cfg(test)]
mod tests {
    use super::*;
    use test::Bencher;

    #[bench]
    fn rust_xpow0(b: &mut Bencher) {
        b.iter(|| pow_rust(test::black_box(125233),test::black_box(0)));
    }
    #[bench]
    fn new_xpow0(b: &mut Bencher) {
        b.iter(|| pow_new(test::black_box(125233),test::black_box(0)));
    }

    #[bench]
    fn rust_xpow1(b: &mut Bencher) {
        b.iter(|| pow_rust(test::black_box(125233),test::black_box(1)));
    }
    #[bench]
    fn new_xpow1(b: &mut Bencher) {
        b.iter(|| pow_new(test::black_box(125233),test::black_box(1)));
    }

    #[bench]
    fn rust_xpow3(b: &mut Bencher) {
        b.iter(|| pow_rust(test::black_box(125233),test::black_box(3)));
    }
    #[bench]
    fn new_xpow3(b: &mut Bencher) {
        b.iter(|| pow_new(test::black_box(125233),test::black_box(3)));
    }
    #[bench]
    fn rust_xpow7(b: &mut Bencher) {
        b.iter(|| pow_rust(test::black_box(125233),test::black_box(7)));
    }
    #[bench]
    fn new_xpow7(b: &mut Bencher) {
        b.iter(|| pow_new(test::black_box(125233),test::black_box(7)));
    }
    #[bench]
    fn rust_xpow3123(b: &mut Bencher) {
        b.iter(|| pow_rust(test::black_box(125233),test::black_box(3123)));
    }
    #[bench]
    fn new_xpow3123(b: &mut Bencher) {
        b.iter(|| pow_new(test::black_box(125233),test::black_box(3123)));
    }
}

@nagisa
Copy link
Member

nagisa commented Jul 22, 2020

@bors r+ rollup

@bors
Copy link
Contributor

bors commented Jul 22, 2020

📌 Commit 364cacb has been approved by nagisa

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 22, 2020
@tesuji
Copy link
Contributor

tesuji commented Jul 22, 2020

It would be nice to squash related commits.

@nagisa
Copy link
Member

nagisa commented Jul 22, 2020

@bors r-, yeah, please squash.

Manishearth added a commit to Manishearth/rust that referenced this pull request Jul 22, 2020
Rearrange the pipeline of `pow` to gain efficiency

The check of the `exp` parameter seems useless if we execute the while-loop more than once.
The original implementation of `pow` function using one more comparison if the `exp==0` and may break the pipeline of the cpu, which may generate a slower code.
The performance gap between the old and the new implementation may be small, but IMO, at least the newer one looks more beautiful.

---

bench prog:
```
#![feature(test)]
extern crate test;
#[macro_export]macro_rules! timing{
($a:expr)=>{let time=std::time::Instant::now();{$a;}print!("{:?} ",time.elapsed())};
($a:expr,$b:literal)=>{let time=std::time::Instant::now();let mut a=0;for _ in 0..$b{a^=$a;}print!("{:?} {} ",time.elapsed(),a)}
}
#[inline]
pub fn pow_rust(x:i64, mut exp: u32) -> i64 {
    let mut base = x;
    let mut acc = 1;
    while exp > 1 {
        if (exp & 1) == 1 {
            acc = acc * base;
        }
        exp /= 2;
        base = base * base;
    }
    if exp == 1 {
        acc = acc * base;
    }
    acc
}
#[inline]
pub fn pow_new(x:i64, mut exp: u32) -> i64 {
    if exp==0{
        1
    }else{
        let mut base = x;
        let mut acc = 1;
        while exp > 1 {
            if (exp & 1) == 1 {
                acc = acc * base;
            }
            exp >>= 1;
            base = base * base;
        }
        acc * base
    }
}

fn main(){
let a=2i64;
let b=1_u32;
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
}
```
bench in my laptop:
```
neutron@Neutron:/me/rust$ rc commit.rs
rustc commit.rs  && ./commit

3.978419716s 0 4.079765171s 0 3.964630622s 0
3.997127013s 0 4.260304804s 0 3.997638211s 0
3.963195544s 0 4.11657718s 0 4.176054164s 0
3.830128579s 0 3.980396122s 0 3.937258567s 0
3.986055948s 0 4.127804162s 0 4.018943411s 0
4.185568857s 0 4.217512517s 0 3.98313603s 0
3.863018225s 0 4.030447988s 0 3.694878237s 0
4.206987927s 0 4.137608047s 0 4.115564664s 0
neutron@Neutron:/me/rust$ rc commit.rs -O
rustc commit.rs -O && ./commit

162.111993ms 0 165.107125ms 0 166.26924ms 0
175.20479ms 0 205.062565ms 0 176.278791ms 0
174.408975ms 0 166.526899ms 0 201.857604ms 0
146.190062ms 0 168.592821ms 0 154.61411ms 0
199.678912ms 0 168.411598ms 0 162.129996ms 0
147.420765ms 0 209.759326ms 0 154.807907ms 0
165.507134ms 0 188.476239ms 0 157.351524ms 0
121.320123ms 0 126.401229ms 0 114.86428ms 0
```
Manishearth added a commit to Manishearth/rust that referenced this pull request Jul 22, 2020
Rearrange the pipeline of `pow` to gain efficiency

The check of the `exp` parameter seems useless if we execute the while-loop more than once.
The original implementation of `pow` function using one more comparison if the `exp==0` and may break the pipeline of the cpu, which may generate a slower code.
The performance gap between the old and the new implementation may be small, but IMO, at least the newer one looks more beautiful.

---

bench prog:
```
#![feature(test)]
extern crate test;
#[macro_export]macro_rules! timing{
($a:expr)=>{let time=std::time::Instant::now();{$a;}print!("{:?} ",time.elapsed())};
($a:expr,$b:literal)=>{let time=std::time::Instant::now();let mut a=0;for _ in 0..$b{a^=$a;}print!("{:?} {} ",time.elapsed(),a)}
}
#[inline]
pub fn pow_rust(x:i64, mut exp: u32) -> i64 {
    let mut base = x;
    let mut acc = 1;
    while exp > 1 {
        if (exp & 1) == 1 {
            acc = acc * base;
        }
        exp /= 2;
        base = base * base;
    }
    if exp == 1 {
        acc = acc * base;
    }
    acc
}
#[inline]
pub fn pow_new(x:i64, mut exp: u32) -> i64 {
    if exp==0{
        1
    }else{
        let mut base = x;
        let mut acc = 1;
        while exp > 1 {
            if (exp & 1) == 1 {
                acc = acc * base;
            }
            exp >>= 1;
            base = base * base;
        }
        acc * base
    }
}

fn main(){
let a=2i64;
let b=1_u32;
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
}
```
bench in my laptop:
```
neutron@Neutron:/me/rust$ rc commit.rs
rustc commit.rs  && ./commit

3.978419716s 0 4.079765171s 0 3.964630622s 0
3.997127013s 0 4.260304804s 0 3.997638211s 0
3.963195544s 0 4.11657718s 0 4.176054164s 0
3.830128579s 0 3.980396122s 0 3.937258567s 0
3.986055948s 0 4.127804162s 0 4.018943411s 0
4.185568857s 0 4.217512517s 0 3.98313603s 0
3.863018225s 0 4.030447988s 0 3.694878237s 0
4.206987927s 0 4.137608047s 0 4.115564664s 0
neutron@Neutron:/me/rust$ rc commit.rs -O
rustc commit.rs -O && ./commit

162.111993ms 0 165.107125ms 0 166.26924ms 0
175.20479ms 0 205.062565ms 0 176.278791ms 0
174.408975ms 0 166.526899ms 0 201.857604ms 0
146.190062ms 0 168.592821ms 0 154.61411ms 0
199.678912ms 0 168.411598ms 0 162.129996ms 0
147.420765ms 0 209.759326ms 0 154.807907ms 0
165.507134ms 0 188.476239ms 0 157.351524ms 0
121.320123ms 0 126.401229ms 0 114.86428ms 0
```
Manishearth added a commit to Manishearth/rust that referenced this pull request Jul 22, 2020
Rearrange the pipeline of `pow` to gain efficiency

The check of the `exp` parameter seems useless if we execute the while-loop more than once.
The original implementation of `pow` function using one more comparison if the `exp==0` and may break the pipeline of the cpu, which may generate a slower code.
The performance gap between the old and the new implementation may be small, but IMO, at least the newer one looks more beautiful.

---

bench prog:
```
#![feature(test)]
extern crate test;
#[macro_export]macro_rules! timing{
($a:expr)=>{let time=std::time::Instant::now();{$a;}print!("{:?} ",time.elapsed())};
($a:expr,$b:literal)=>{let time=std::time::Instant::now();let mut a=0;for _ in 0..$b{a^=$a;}print!("{:?} {} ",time.elapsed(),a)}
}
#[inline]
pub fn pow_rust(x:i64, mut exp: u32) -> i64 {
    let mut base = x;
    let mut acc = 1;
    while exp > 1 {
        if (exp & 1) == 1 {
            acc = acc * base;
        }
        exp /= 2;
        base = base * base;
    }
    if exp == 1 {
        acc = acc * base;
    }
    acc
}
#[inline]
pub fn pow_new(x:i64, mut exp: u32) -> i64 {
    if exp==0{
        1
    }else{
        let mut base = x;
        let mut acc = 1;
        while exp > 1 {
            if (exp & 1) == 1 {
                acc = acc * base;
            }
            exp >>= 1;
            base = base * base;
        }
        acc * base
    }
}

fn main(){
let a=2i64;
let b=1_u32;
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
}
```
bench in my laptop:
```
neutron@Neutron:/me/rust$ rc commit.rs
rustc commit.rs  && ./commit

3.978419716s 0 4.079765171s 0 3.964630622s 0
3.997127013s 0 4.260304804s 0 3.997638211s 0
3.963195544s 0 4.11657718s 0 4.176054164s 0
3.830128579s 0 3.980396122s 0 3.937258567s 0
3.986055948s 0 4.127804162s 0 4.018943411s 0
4.185568857s 0 4.217512517s 0 3.98313603s 0
3.863018225s 0 4.030447988s 0 3.694878237s 0
4.206987927s 0 4.137608047s 0 4.115564664s 0
neutron@Neutron:/me/rust$ rc commit.rs -O
rustc commit.rs -O && ./commit

162.111993ms 0 165.107125ms 0 166.26924ms 0
175.20479ms 0 205.062565ms 0 176.278791ms 0
174.408975ms 0 166.526899ms 0 201.857604ms 0
146.190062ms 0 168.592821ms 0 154.61411ms 0
199.678912ms 0 168.411598ms 0 162.129996ms 0
147.420765ms 0 209.759326ms 0 154.807907ms 0
165.507134ms 0 188.476239ms 0 157.351524ms 0
121.320123ms 0 126.401229ms 0 114.86428ms 0
```
Manishearth added a commit to Manishearth/rust that referenced this pull request Jul 22, 2020
Rearrange the pipeline of `pow` to gain efficiency

The check of the `exp` parameter seems useless if we execute the while-loop more than once.
The original implementation of `pow` function using one more comparison if the `exp==0` and may break the pipeline of the cpu, which may generate a slower code.
The performance gap between the old and the new implementation may be small, but IMO, at least the newer one looks more beautiful.

---

bench prog:
```
#![feature(test)]
extern crate test;
#[macro_export]macro_rules! timing{
($a:expr)=>{let time=std::time::Instant::now();{$a;}print!("{:?} ",time.elapsed())};
($a:expr,$b:literal)=>{let time=std::time::Instant::now();let mut a=0;for _ in 0..$b{a^=$a;}print!("{:?} {} ",time.elapsed(),a)}
}
#[inline]
pub fn pow_rust(x:i64, mut exp: u32) -> i64 {
    let mut base = x;
    let mut acc = 1;
    while exp > 1 {
        if (exp & 1) == 1 {
            acc = acc * base;
        }
        exp /= 2;
        base = base * base;
    }
    if exp == 1 {
        acc = acc * base;
    }
    acc
}
#[inline]
pub fn pow_new(x:i64, mut exp: u32) -> i64 {
    if exp==0{
        1
    }else{
        let mut base = x;
        let mut acc = 1;
        while exp > 1 {
            if (exp & 1) == 1 {
                acc = acc * base;
            }
            exp >>= 1;
            base = base * base;
        }
        acc * base
    }
}

fn main(){
let a=2i64;
let b=1_u32;
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
}
```
bench in my laptop:
```
neutron@Neutron:/me/rust$ rc commit.rs
rustc commit.rs  && ./commit

3.978419716s 0 4.079765171s 0 3.964630622s 0
3.997127013s 0 4.260304804s 0 3.997638211s 0
3.963195544s 0 4.11657718s 0 4.176054164s 0
3.830128579s 0 3.980396122s 0 3.937258567s 0
3.986055948s 0 4.127804162s 0 4.018943411s 0
4.185568857s 0 4.217512517s 0 3.98313603s 0
3.863018225s 0 4.030447988s 0 3.694878237s 0
4.206987927s 0 4.137608047s 0 4.115564664s 0
neutron@Neutron:/me/rust$ rc commit.rs -O
rustc commit.rs -O && ./commit

162.111993ms 0 165.107125ms 0 166.26924ms 0
175.20479ms 0 205.062565ms 0 176.278791ms 0
174.408975ms 0 166.526899ms 0 201.857604ms 0
146.190062ms 0 168.592821ms 0 154.61411ms 0
199.678912ms 0 168.411598ms 0 162.129996ms 0
147.420765ms 0 209.759326ms 0 154.807907ms 0
165.507134ms 0 188.476239ms 0 157.351524ms 0
121.320123ms 0 126.401229ms 0 114.86428ms 0
```
@Manishearth
Copy link
Member

@bors r-

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Jul 22, 2020
@Neutron3529
Copy link
Contributor Author

Neutron3529 commented Jul 23, 2020

It would be nice to squash related commits.

I find a S-waiting-on-author label here, but I actually don't know how I could help.

Actually, I don't know what is the meaning of squash(Sorry about that.)

The commits I did is just modifying 8 different pow function(4 for int and 4 for uint), and then write a simple test.

What I could do?

Are there some example of squash related commits?

@tesuji
Copy link
Contributor

tesuji commented Jul 23, 2020

Sorry. I didn't consider your git skills. Hope this article helps: https://davidwalsh.name/squash-commits-git .
Feel free to ask if you don't follow.

@Neutron3529 Neutron3529 marked this pull request as draft July 23, 2020 06:53
The check of the `exp` parameter seems useless if we execute the while-loop more than once.
The original implementation of `pow` function using one more comparison if the `exp==0` and may break the pipeline of the cpu, which may generate a slower code.
The performance gap between the old and the new implementation may be small, but IMO, at least the newer one looks more beautiful.

---

bench prog:
```
extern crate test;
($a:expr)=>{let time=std::time::Instant::now();{$a;}print!("{:?} ",time.elapsed())};
($a:expr,$b:literal)=>{let time=std::time::Instant::now();let mut a=0;for _ in 0..$b{a^=$a;}print!("{:?} {} ",time.elapsed(),a)}
}
pub fn pow_rust(x:i64, mut exp: u32) -> i64 {
    let mut base = x;
    let mut acc = 1;
    while exp > 1 {
        if (exp & 1) == 1 {
            acc = acc * base;
        }
        exp /= 2;
        base = base * base;
    }
    if exp == 1 {
        acc = acc * base;
    }
    acc
}
pub fn pow_new(x:i64, mut exp: u32) -> i64 {
    if exp==0{
        1
    }else{
        let mut base = x;
        let mut acc = 1;
        while exp > 1 {
            if (exp & 1) == 1 {
                acc = acc * base;
            }
            exp >>= 1;
            base = base * base;
        }
        acc * base
    }
}

fn main(){
let a=2i64;
let b=1_u32;
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
}
```
bench in my laptop:
```
neutron@Neutron:/me/rust$ rc commit.rs
rustc commit.rs  && ./commit

3.978419716s 0 4.079765171s 0 3.964630622s 0
3.997127013s 0 4.260304804s 0 3.997638211s 0
3.963195544s 0 4.11657718s 0 4.176054164s 0
3.830128579s 0 3.980396122s 0 3.937258567s 0
3.986055948s 0 4.127804162s 0 4.018943411s 0
4.185568857s 0 4.217512517s 0 3.98313603s 0
3.863018225s 0 4.030447988s 0 3.694878237s 0
4.206987927s 0 4.137608047s 0 4.115564664s 0
neutron@Neutron:/me/rust$ rc commit.rs -O
rustc commit.rs -O && ./commit

162.111993ms 0 165.107125ms 0 166.26924ms 0
175.20479ms 0 205.062565ms 0 176.278791ms 0
174.408975ms 0 166.526899ms 0 201.857604ms 0
146.190062ms 0 168.592821ms 0 154.61411ms 0
199.678912ms 0 168.411598ms 0 162.129996ms 0
147.420765ms 0 209.759326ms 0 154.807907ms 0
165.507134ms 0 188.476239ms 0 157.351524ms 0
121.320123ms 0 126.401229ms 0 114.86428ms 0
```

delete an unnecessary semicolon...

Sorry for the typo.

delete trailing whitespace

Sorry, too..

Sorry for the missing...

I checked all the implementations, and finally found that there is one function that does not check whether `exp == 0`

add extra tests

add extra tests.

finished adding the extra tests to prevent further typo

add pow(2) to negative exp

add whitespace.

add whitespace

add whitespace

delete extra line
@Neutron3529 Neutron3529 marked this pull request as ready for review July 23, 2020 06:57
@Neutron3529
Copy link
Contributor Author

Sorry. I didn't consider your git skills. Hope this article helps: https://davidwalsh.name/squash-commits-git .
Feel free to ask if you don't follow.

Thanks a lot for the example you mentioned.

all the commits are squashed now.

@Neutron3529
Copy link
Contributor Author

@bors all commits are squashed.

@Mark-Simulacrum
Copy link
Member

@bors r=nagisa

@bors
Copy link
Contributor

bors commented Jul 24, 2020

📌 Commit ef74e50 has been approved by nagisa

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jul 24, 2020
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request Jul 24, 2020
Rearrange the pipeline of `pow` to gain efficiency

The check of the `exp` parameter seems useless if we execute the while-loop more than once.
The original implementation of `pow` function using one more comparison if the `exp==0` and may break the pipeline of the cpu, which may generate a slower code.
The performance gap between the old and the new implementation may be small, but IMO, at least the newer one looks more beautiful.

---

bench prog:
```
#![feature(test)]
extern crate test;
#[macro_export]macro_rules! timing{
($a:expr)=>{let time=std::time::Instant::now();{$a;}print!("{:?} ",time.elapsed())};
($a:expr,$b:literal)=>{let time=std::time::Instant::now();let mut a=0;for _ in 0..$b{a^=$a;}print!("{:?} {} ",time.elapsed(),a)}
}
#[inline]
pub fn pow_rust(x:i64, mut exp: u32) -> i64 {
    let mut base = x;
    let mut acc = 1;
    while exp > 1 {
        if (exp & 1) == 1 {
            acc = acc * base;
        }
        exp /= 2;
        base = base * base;
    }
    if exp == 1 {
        acc = acc * base;
    }
    acc
}
#[inline]
pub fn pow_new(x:i64, mut exp: u32) -> i64 {
    if exp==0{
        1
    }else{
        let mut base = x;
        let mut acc = 1;
        while exp > 1 {
            if (exp & 1) == 1 {
                acc = acc * base;
            }
            exp >>= 1;
            base = base * base;
        }
        acc * base
    }
}

fn main(){
let a=2i64;
let b=1_u32;
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
timing!(test::black_box(a).pow(test::black_box(b)),100000000);
timing!(pow_new(test::black_box(a),test::black_box(b)),100000000);
timing!(pow_rust(test::black_box(a),test::black_box(b)),100000000);
println!();
}
```
bench in my laptop:
```
neutron@Neutron:/me/rust$ rc commit.rs
rustc commit.rs  && ./commit

3.978419716s 0 4.079765171s 0 3.964630622s 0
3.997127013s 0 4.260304804s 0 3.997638211s 0
3.963195544s 0 4.11657718s 0 4.176054164s 0
3.830128579s 0 3.980396122s 0 3.937258567s 0
3.986055948s 0 4.127804162s 0 4.018943411s 0
4.185568857s 0 4.217512517s 0 3.98313603s 0
3.863018225s 0 4.030447988s 0 3.694878237s 0
4.206987927s 0 4.137608047s 0 4.115564664s 0
neutron@Neutron:/me/rust$ rc commit.rs -O
rustc commit.rs -O && ./commit

162.111993ms 0 165.107125ms 0 166.26924ms 0
175.20479ms 0 205.062565ms 0 176.278791ms 0
174.408975ms 0 166.526899ms 0 201.857604ms 0
146.190062ms 0 168.592821ms 0 154.61411ms 0
199.678912ms 0 168.411598ms 0 162.129996ms 0
147.420765ms 0 209.759326ms 0 154.807907ms 0
165.507134ms 0 188.476239ms 0 157.351524ms 0
121.320123ms 0 126.401229ms 0 114.86428ms 0
```
bors added a commit to rust-lang-ci/rust that referenced this pull request Jul 24, 2020
…arth

Rollup of 8 pull requests

Successful merges:

 - rust-lang#72954 (revise RwLock for HermitCore)
 - rust-lang#74367 (Rearrange the pipeline of `pow` to gain efficiency)
 - rust-lang#74491 (Optimize away BitAnd and BitOr when possible)
 - rust-lang#74639 (Downgrade glibc to 2.11.1 for ppc, ppc64 and s390x)
 - rust-lang#74661 (Refactor `region_name`: add `RegionNameHighlight`)
 - rust-lang#74692 (delay_span_bug instead of silent ignore)
 - rust-lang#74698 (fixed error reporting for mismatched traits)
 - rust-lang#74715 (Add a system for creating diffs across multiple mir optimizations.)

Failed merges:

r? @ghost
@bors bors merged commit 3226d72 into rust-lang:master Jul 24, 2020
@Neutron3529 Neutron3529 deleted the patch-1 branch July 25, 2020 03:30
@cuviper cuviper added this to the 1.47.0 milestone May 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants