Numeric operators don't inline well on generic code. #8333

Closed
sebcrozet opened this issue Aug 6, 2013 · 3 comments

These two functions should generate the same asm code:

#[inline(never)]
fn doit_not_generic(a: f64) -> f64 {
    let mut a = a;
    do 1000000000.times {
        a = a * a;
    }

    a
}

#[inline(never)]
fn doit<N: Mul<N, N>>(a: N) -> N {
    let mut a = a;
    do 1000000000.times {
        a = a * a;
    }

    a
}

But they don't (at least when called with f64).
Asm code for doit (there is an explicit call to the multiplication function):

0000000000400b50 <_ZN9doit_338817_2e876a1e2fea8cf17_0$x2e0E>:
  400b50:   64 48 3b 24 25 70 00    cmp    %fs:0x70,%rsp
  400b57:   00 00
  400b59:   77 1a                   ja     400b75 <_ZN9doit_338817_2e876a1e2fea8cf17_0$x2e0E+0x25>
  400b5b:   49 ba 28 00 00 00 00    movabs $0x28,%r10
  400b62:   00 00 00
  400b65:   49 bb 00 00 00 00 00    movabs $0x0,%r11
  400b6c:   00 00 00
  400b6f:   e8 4c 00 00 00          callq  400bc0 <__morestack>
  400b74:   c3                      retq
  400b75:   55                      push   %rbp
  400b76:   48 89 e5                mov    %rsp,%rbp
  400b79:   41 56                   push   %r14
  400b7b:   53                      push   %rbx
  400b7c:   48 83 ec 10             sub    $0x10,%rsp
  400b80:   48 b8 00 00 00 00 00    movabs $0x3ff0000000000000,%rax
  400b87:   00 f0 3f
  400b8a:   48 89 45 e8             mov    %rax,-0x18(%rbp)
  400b8e:   48 c7 c3 00 36 65 c4    mov    $0xffffffffc4653600,%rbx
  400b95:   4c 8d 75 e8             lea    -0x18(%rbp),%r14
  400b99:   90                      nop
  400b9a:   90                      nop
  400b9b:   90                      nop
  400b9c:   90                      nop
  400b9d:   90                      nop
  400b9e:   90                      nop
  400b9f:   90                      nop
  400ba0:   4c 89 f7                mov    %r14,%rdi
  400ba3:   4c 89 f6                mov    %r14,%rsi
  400ba6:   e8 35 fd ff ff          callq  4008e0 <_ZN3f6414__extensions__10meth_147683mul17_ce584f3346886dfe14_0$x2e8$x2dpreE@plt>
  400bab:   f2 0f 11 45 e8          movsd  %xmm0,-0x18(%rbp)
  400bb0:   48 ff c3                inc    %rbx
  400bb3:   75 eb                   jne    400ba0 <_ZN9doit_338817_2e876a1e2fea8cf17_0$x2e0E+0x50>
  400bb5:   48 83 c4 10             add    $0x10,%rsp
  400bb9:   5b                      pop    %rbx
  400bba:   41 5e                   pop    %r14
  400bbc:   5d                      pop    %rbp
  400bbd:   c3                      retq
  400bbe:   66 90                   xchg   %ax,%ax

Asm code for doit_not_generic (uses the machine instruction):

0000000000400a40 <_ZN16doit_not_generic17_2e876a1e2fea8cf17_0$x2e0E>:
  400a40:   64 48 3b 24 25 70 00    cmp    %fs:0x70,%rsp
  400a47:   00 00
  400a49:   77 1a                   ja     400a65 <_ZN16doit_not_generic17_2e876a1e2fea8cf17_0$x2e0E+0x25>
  400a4b:   49 ba 08 00 00 00 00    movabs $0x8,%r10
  400a52:   00 00 00
  400a55:   49 bb 00 00 00 00 00    movabs $0x0,%r11
  400a5c:   00 00 00
  400a5f:   e8 5c 01 00 00          callq  400bc0 <__morestack>
  400a64:   c3                      retq
  400a65:   55                      push   %rbp
  400a66:   48 89 e5                mov    %rsp,%rbp
  400a69:   48 c7 c0 00 36 65 c4    mov    $0xffffffffc4653600,%rax
  400a70:   f2 0f 59 c0             mulsd  %xmm0,%xmm0
  400a74:   48 ff c0                inc    %rax
  400a77:   75 f7                   jne    400a70 <_ZN16doit_not_generic17_2e876a1e2fea8cf17_0$x2e0E+0x30>
  400a79:   5d                      pop    %rbp
  400a7a:   c3                      retq
  400a7b:   90                      nop
  400a7c:   90                      nop
  400a7d:   90                      nop
  400a7e:   90                      nop
  400a7f:   90                      nop
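
The two functions above use 2013-era syntax (`do ... .times`, `Mul<N, N>`) that current compilers reject. For anyone wanting to re-check the generated assembly today, here is a rough modern equivalent. It is only a sketch under a few assumptions that are not in the original report: the bound is written as `Mul<Output = N> + Copy`, the loop is an ordinary `for` over a range, and `ITERATIONS` is a hypothetical small constant standing in for the original 1000000000 so the result stays finite.

```rust
use std::ops::Mul;

// Hypothetical iteration count (assumption): far smaller than the
// 1000000000 used in the report, so that results stay representable.
const ITERATIONS: u32 = 5;

// Non-generic version: squares `a` repeatedly using f64 multiplication.
#[inline(never)]
fn doit_not_generic(a: f64) -> f64 {
    let mut a = a;
    for _ in 0..ITERATIONS {
        a = a * a;
    }
    a
}

// Generic version: same loop, but multiplication goes through the `Mul` trait.
// The `Copy` bound (assumption) lets `a * a` use the value twice.
#[inline(never)]
fn doit<N: Mul<Output = N> + Copy>(a: N) -> N {
    let mut a = a;
    for _ in 0..ITERATIONS {
        a = a * a;
    }
    a
}
```

Calling `doit(2.0f64)` and `doit_not_generic(2.0)` should give the same value; whether the two compile to the same machine code still has to be checked by inspecting the disassembly of an optimized build.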

bstrie commented Aug 6, 2013

#8332 filed to address the lack of inlining, but apparently the generic version still has a bunch of nops that the normal version does not have. That seems weird to me. Let's check the assembly again once #8332 has landed.


bstrie commented Aug 6, 2013

See #8334 for the nop issue.

alexcrichton commented

With -Z lto, these two functions compile to essentially the same code. A lack of inlining on certain functions may be a problem, but specific bugs should be opened for the functions in question. Otherwise, there doesn't appear to be a fundamental bug here, so closing.
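
As an illustration of what "a lack of inlining on certain functions" can look like in user code, here is a minimal sketch in current Rust syntax. The `Scalar` newtype and the `square` helper are hypothetical names made up for this example; they are not taken from this issue or from the standard library. The `#[inline]` attribute on the operator method is a hint that also makes the method body available for inlining from other crates without relying on LTO.

```rust
use std::ops::Mul;

// Hypothetical wrapper type (assumption), used only to show an operator impl.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Scalar(f64);

impl Mul for Scalar {
    type Output = Scalar;

    // `#[inline]` is a hint, not a guarantee; it allows callers in other
    // crates to inline this method without link-time optimization.
    #[inline]
    fn mul(self, rhs: Scalar) -> Scalar {
        Scalar(self.0 * rhs.0)
    }
}

// Generic caller that reaches the operator through the `Mul` trait.
fn square<N: Mul<Output = N> + Copy>(x: N) -> N {
    x * x
}
```

Here `square(Scalar(3.0))` evaluates to `Scalar(9.0)`. Whether the multiplication is actually inlined depends on the compiler version and optimization level, so the disassembly is the only reliable check.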
