Description
I ran into an issue when working on a prime sieve.
I grouped three related variables (two arrays and an integer) into a struct so that fewer variables are passed around.
While profiling, I noticed that the execution time went up from 2.8s to 7s when using the struct.
Based on some hints I got on Discord, I gather that shouldn't be the case: https://discord.com/channels/273534239310479360/273541522815713281/843272480852672513
Here is my actual struct:
pub struct Aux {
    pub sieve: [u64; _AUX_SIEVE_WORDS_ as usize],
    pub primes: [u32; _NUMBER_OF_AUX_PRIMES_],
    pub base: u32,
}
I created a GitHub repo showing the different versions: https://github.com/ZuseZ4/perf_comp
Branch with_struct: I'm passing the struct by ref => 7s
Branch no_struct: I'm passing both arrays by ref and the integer by value => 2.8s
Branch no_struct_ref: I'm passing both arrays and the integer by ref => 2.8s
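To make the comparison concrete, here is a minimal sketch of the two calling conventions. It is not the code from the repo: the constants AUX_SIEVE_WORDS and NUMBER_OF_AUX_PRIMES, their small values, the function names mark_with_struct and mark_no_struct, and the loop body (marking multiples of each prime in a bit sieve that starts at base) are all placeholders chosen for illustration.

```rust
// Hypothetical sizes; the real constants are defined in the linked repo.
const AUX_SIEVE_WORDS: usize = 16;
const NUMBER_OF_AUX_PRIMES: usize = 4;

pub struct Aux {
    pub sieve: [u64; AUX_SIEVE_WORDS],
    pub primes: [u32; NUMBER_OF_AUX_PRIMES],
    pub base: u32,
}

/// Variant as in branch with_struct: everything is reached through one `&mut Aux`.
/// Bit i of the sieve stands for the number `base + i`.
pub fn mark_with_struct(aux: &mut Aux) {
    let bits = (AUX_SIEVE_WORDS * 64) as u32;
    for &p in aux.primes.iter() {
        // First index i such that base + i is a multiple of p.
        let mut i = (p - aux.base % p) % p;
        while i < bits {
            aux.sieve[(i / 64) as usize] |= 1u64 << (i % 64);
            i += p;
        }
    }
}

/// Variant as in branch no_struct: arrays by reference, integer by value.
pub fn mark_no_struct(
    sieve: &mut [u64; AUX_SIEVE_WORDS],
    primes: &[u32; NUMBER_OF_AUX_PRIMES],
    base: u32,
) {
    let bits = (AUX_SIEVE_WORDS * 64) as u32;
    for &p in primes.iter() {
        // Same placeholder work as above, on separately passed arguments.
        let mut i = (p - base % p) % p;
        while i < bits {
            sieve[(i / 64) as usize] |= 1u64 << (i % 64);
            i += p;
        }
    }
}
```

Both functions compute the same result for the same inputs, for example primes [2, 3, 5, 7] and base 100, so any timing difference between such variants would come from how the compiler treats the two signatures rather than from the work itself.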
Setup:
I build in release mode with lto=true and target-cpu=native. (The last two seem to have no effect.)
rustc 1.52.1 (9bc8c42 2021-05-09)
(1.54.0-nightly (fe72845 2021-05-16) gives the same performance)
Ubuntu 20.04 (5.4.0-73-generic)
AMD® Athlon 200ge with 2x16 GB DDR4-3200
I could understand that an extra layer of indirection, and therefore a pointer dereference (if not removed by optimizations), could cause some performance cost, but I'm surprised by how large the difference is.
Did I miss something?