Ipv4Addr cmp() is slow #33885

Closed
polachok opened this issue May 26, 2016 · 1 comment
Labels
I-slow Issue: Problems and improvements with respect to performance of generated code.

Comments

@polachok
Contributor

I'm using Ipv4Addrs as keys in a BTreeMap. I have to look up the map ~15 million times per second (10G Ethernet line rate).

Ipv4Addr cmp() compiles to this on my system (Linux x86-64, rustc 1.10.0-nightly (476fe6e 2016-05-21)):

0000000000051f10 <_ZN52_$LT$net..ip..Ipv4Addr$u20$as$u20$core..cmp..Ord$GT$3cmp17h39591ec7a18c4b02E>:
   51f10:       50                      push   %rax
   51f11:       c7 44 24 04 1d 1d 1d    movl   $0x1d1d1d1d,0x4(%rsp)
   51f18:       1d 
   51f19:       c7 04 24 1d 1d 1d 1d    movl   $0x1d1d1d1d,(%rsp)
   51f20:       8b 07                   mov    (%rdi),%eax
   51f22:       89 c1                   mov    %eax,%ecx
   51f24:       88 44 24 04             mov    %al,0x4(%rsp)
   51f28:       88 64 24 05             mov    %ah,0x5(%rsp)
   51f2c:       c1 e8 10                shr    $0x10,%eax
   51f2f:       c1 e9 18                shr    $0x18,%ecx
   51f32:       88 44 24 06             mov    %al,0x6(%rsp)
   51f36:       88 4c 24 07             mov    %cl,0x7(%rsp)
   51f3a:       8b 06                   mov    (%rsi),%eax
   51f3c:       89 c1                   mov    %eax,%ecx
   51f3e:       88 04 24                mov    %al,(%rsp)
   51f41:       88 64 24 01             mov    %ah,0x1(%rsp)
   51f45:       c1 e8 10                shr    $0x10,%eax
   51f48:       c1 e9 18                shr    $0x18,%ecx
   51f4b:       88 44 24 02             mov    %al,0x2(%rsp)
   51f4f:       88 4c 24 03             mov    %cl,0x3(%rsp)
   51f53:       48 8d 7c 24 04          lea    0x4(%rsp),%rdi
   51f58:       48 8d 34 24             lea    (%rsp),%rsi
   51f5c:       ba 04 00 00 00          mov    $0x4,%edx
   51f61:       e8 0a 17 fc ff          callq  13670 <memcmp@plt>
   51f66:       89 c1                   mov    %eax,%ecx
   51f68:       31 c0                   xor    %eax,%eax
   51f6a:       85 c9                   test   %ecx,%ecx
   51f6c:       b1 ff                   mov    $0xff,%cl
   51f6e:       78 02                   js     51f72 <_ZN52_$LT$net..ip..Ipv4Addr$u20$as$u20$core..cmp..Ord$GT$3cmp17h39591ec7a18c4b02E+0x62>
   51f70:       b1 01                   mov    $0x1,%cl
   51f72:       74 02                   je     51f76 <_ZN52_$LT$net..ip..Ipv4Addr$u20$as$u20$core..cmp..Ord$GT$3cmp17h39591ec7a18c4b02E+0x66>
   51f74:       88 c8                   mov    %cl,%al
   51f76:       59                      pop    %rcx

This seems rather inefficient for something that is basically a u32.
I guess part of the reason is the implementation, which converts each address to an array(!) first and then compares the arrays with memcmp.
I copied the definition and implemented Ord like this:

impl Ord for Ipv4Addr2 {
    fn cmp(&self, other: &Self) -> cmp::Ordering {
        // Convert both addresses from network to host byte order,
        // then compare them as plain u32 values.
        ntoh(self.inner.s_addr).cmp(&ntoh(other.inner.s_addr))
    }
}

It's about 10 times faster on my benchmark.
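
For anyone who wants to try the same idea without copying the std definition, here is a minimal sketch using only public API. The names Ipv4Key, cmp_by_octets and cmp_by_u32 are hypothetical and not part of std. It assumes that u32::from(Ipv4Addr) returns the address with the first octet in the most significant byte, so integer order and octet-by-octet order agree. This is an illustration of the approach, not the change that went into the standard library, and whether it is still needed depends on the toolchain version.

```rust
use std::cmp::Ordering;
use std::net::Ipv4Addr;

/// Hypothetical newtype (not part of std): an IPv4 address stored as a
/// host-order u32, so the derived Ord is a single integer comparison.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct Ipv4Key(u32);

impl From<Ipv4Addr> for Ipv4Key {
    fn from(addr: Ipv4Addr) -> Self {
        // 1.2.3.4 becomes 0x01020304: the first octet is the most
        // significant byte, which preserves the octet-wise ordering.
        Ipv4Key(u32::from(addr))
    }
}

impl From<Ipv4Key> for Ipv4Addr {
    fn from(key: Ipv4Key) -> Self {
        Ipv4Addr::from(key.0)
    }
}

/// Octet-wise comparison: build two [u8; 4] arrays and compare them
/// lexicographically, as the implementation described above does.
pub fn cmp_by_octets(a: &Ipv4Addr, b: &Ipv4Addr) -> Ordering {
    a.octets().cmp(&b.octets())
}

/// Integer comparison: convert each address to a u32 and compare once.
pub fn cmp_by_u32(a: &Ipv4Addr, b: &Ipv4Addr) -> Ordering {
    u32::from(*a).cmp(&u32::from(*b))
}
```

Ipv4Key can be used as the BTreeMap key directly, converting with Ipv4Key::from(addr) on insert and lookup. The two cmp_* functions are only there to make it easy to check that both orderings agree and to benchmark one against the other.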

rustc 1.10.0-nightly (476fe6eef 2016-05-21)
binary: rustc
commit-hash: 476fe6eefe17db91ff7a60aab34aa67a0a750a18
commit-date: 2016-05-21
host: x86_64-unknown-linux-gnu
release: 1.10.0-nightly
@alexcrichton
Member

Nice find! Want to send a PR for this? Looks like something that'd be more than welcome :)

@apasel422 apasel422 added A-libs I-slow Issue: Problems and improvements with respect to performance of generated code. labels May 26, 2016
GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this issue May 27, 2016