Generic code produces lots of no-ops compared to the monomorphic version. #8334
Comments
I also spy a |
I was curious about the number of NOPs appearing in some code today, so I had a look around trying to determine where they originate. It turns out that the preferred alignment (on x86_64) for loop bodies is 16 bytes, so padding is introduced before loop bodies to ensure this. That is also what is happening here. The reason it's not aligned in the generic version is the additional |
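To make the padding arithmetic concrete, here is a minimal sketch of how many filler bytes are needed to bring an address up to a 16-byte boundary. The function name `padding_to_align` and the example addresses are hypothetical and chosen only for illustration; this is not how LLVM itself computes or emits the padding.

```rust
/// Number of padding bytes needed so that `addr` becomes a multiple of `align`.
/// Hypothetical helper for illustration; `align` is assumed to be a power of two.
fn padding_to_align(addr: usize, align: usize) -> usize {
    assert!(align.is_power_of_two());
    // `addr & (align - 1)` is the offset past the previous boundary;
    // the outer mask maps the already-aligned case (a full `align`) back to 0.
    (align - (addr & (align - 1))) & (align - 1)
}
```

Under these assumptions, a loop body that would otherwise start at 0x1003 needs 13 bytes of padding to reach 0x1010, while one starting at 0x1000 needs none. The padding is counted in bytes, not instructions: x86 has multi-byte NOP encodings, so the number of nop lines in a disassembly can be smaller than the byte count.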
@Florob: I think it's because we're doing target info wrong. @alexcrichton has some work in-progress that may fix that. I'm going to close this issue since I can't duplicate it on master. |
@thestinger, are you sure you can't reproduce? If so, perhaps this is an OSX-specific problem because I was able to reproduce the extra nops on master. Additionally, #8700 doesn't fix this :( |
What about |
I still see the nops :( |
@alexcrichton What code are you using exactly (i.e. how and how often do you call |
Oh interesting, I'm using this code:

    #[inline(never)]
    fn doit_not_generic(a: f32) -> f32 {
        let mut a = a;
        do 1000000000.times {
            a = a * a;
        }
        a
    }

    #[inline(never)]
    fn doit<N: Mul<N, N>>(a: N) -> N {
        let mut a = a;
        do 1000000000.times {
            a = a * a;
        }
        a
    }

    fn main() {
        assert!(doit_not_generic(2.0f32) == doit(2.0f32));
    }

You are correct, though, that if I later call it with a different argument, the two codegens are the same. I'm a little surprised these aren't merged via the |
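The snippet above uses syntax from the Rust of that era (`do ... times`, `Mul<N, N>`) and no longer compiles. For anyone who wants to repeat the comparison on a current toolchain, a rough modern equivalent might look like the following. Two things here are assumptions not present in the original: the `Copy` bound (needed because `a * a` consumes its operands) and the `iters` parameter, which replaces the hard-coded 1000000000 so the functions can be exercised quickly.

```rust
use std::ops::Mul;

// Monomorphic version: repeatedly squares an f32.
#[inline(never)]
fn doit_not_generic(a: f32, iters: u32) -> f32 {
    let mut a = a;
    for _ in 0..iters {
        a = a * a;
    }
    a
}

// Generic version: same loop, for any type that multiplies to itself.
// The `Copy` bound is an addition for modern Rust, not part of the original.
#[inline(never)]
fn doit<N: Mul<Output = N> + Copy>(a: N, iters: u32) -> N {
    let mut a = a;
    for _ in 0..iters {
        a = a * a;
    }
    a
}
```

Calling both with the same `f32` argument and iteration count, then comparing the assembly of the two functions, is one way to check whether the extra padding still appears. Whether it does will depend on the compiler version, target, and optimization level.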
Generic code should produce the same code as its monomorphic counterpart. However, a lot of additional nop instructions are produced in generic code. For example:
When called with an f32, the asm produced for doit has a lot of nop instructions before the multiplication:
The asm produced for doit_not_generic is nop-free before the multiplication: