Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UnicodeWidthChar should inline an ascii fast path #1

Open
pnkfelix opened this issue Apr 19, 2015 · 4 comments
Open

UnicodeWidthChar should inline an ascii fast path #1

pnkfelix opened this issue Apr 19, 2015 · 4 comments

Comments

@pnkfelix
Copy link

(imported from improperly closed bug rust-lang/rust#20658 )

So, the good news is that this cargo package is beating the deprecated stdlib code on my machine.

The bad news is that it is still slower than what the original issue author expects to see.

(Though I also don't think the original issue author's code is conformant with what the docs here say the fn width method should do on char.)

Anyway, here is a port of the benchmark provided by the original author, adapted to also test this cargo package. It is followed with the benchmark results on my machine.

#![feature(test, unicode)]

extern crate test;
extern crate unicode_width;

use test::{Bencher};

use unicode_width::UnicodeWidthChar;

const STRING: &'static str = 
    "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
     aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
     aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
     aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
     aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
     aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
     aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
     aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
     aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
     aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
     aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
     aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa";

#[bench]
fn stdlib(b: &mut Bencher) {
    b.iter(|| {
        for c in STRING.chars() {
            test::black_box(c.width(false));
        }
    });
}

#[bench]
fn cargo(b: &mut Bencher) {
    b.iter(|| {
        for c in STRING.chars() {
            test::black_box(UnicodeWidthChar::width(c));
        }
    });
}

#[bench]
fn simple(b: &mut Bencher) {
    b.iter(|| {
        for c in STRING.chars() {
            if (c as u32) < 127 {
                if (c as u32) > 31 {
                    test::black_box(Some(1));
                } else {
                    test::black_box(None::<usize>);
                }
            } else {
                test::black_box(c.width(false));
            }
        }
    });
}

03-03-58 char_fast_path/char_fast_path (git:fsk-sized-bounded-iter) % rustc --version
rustc 1.0.0-nightly (1284be404 2015-04-18) (built 2015-04-17)
03-04-12 char_fast_path/char_fast_path (git:fsk-sized-bounded-iter) % cargo bench
     Running target/release/char_fast_path-935b3c19b07d9282

running 3 tests
test cargo  ... bench:      3197 ns/iter (+/- 401)
test simple ... bench:      1465 ns/iter (+/- 500)
test stdlib ... bench:      3427 ns/iter (+/- 571)

test result: ok. 0 passed; 0 failed; 0 ignored; 3 measured

03-04-21 char_fast_path/char_fast_path (git:fsk-sized-bounded-iter) % 
@kwantam
Copy link
Member

kwantam commented Apr 19, 2015

Thanks! I'll have a look.

kwantam added a commit that referenced this issue Apr 19, 2015
This diff adds benchmarks to get more info regarding Issue #1.

It appears that the remaining difference between the "simple"
case and the "cargo" case is the result of a difference in
performance between using `match` and `if` for tight loops.

I suspect it's because of the way that match arms get reordered:
if I manually reorder the "if" statement, I can reproduce the
match performance.

Also added a couple #[inline] annotations in tables.rs, though
the difference in performance in my measurements is negligible.

Bumped version number to 0.1.1.
@kwantam
Copy link
Member

kwantam commented Apr 19, 2015

Strange. On the machines on which I've tested, it appears that the remaining difference in performance between the "simple" case and the "cargo" case is due to the use of if in the former and match in the latter. I created another "simple" case that uses match instead, and its performance is identical to the "cargo" case.

I suppose it might be worthwhile to modify the code to use an if statement, but then again one assumes that performance of generated code will increase in the future, and so it might not be worth it.

@pnkfelix
Copy link
Author

Hmm, that might be an interesting bug to file against the Rust match implementation, at least.

@kwantam
Copy link
Member

kwantam commented Apr 19, 2015

Thinking about it a bit more, it's not necessarily surprising that the if implementation is slightly faster, since the conditionals are ordered to check the desired case (printable ASCII) first. I haven't checked the LLVM IR, but I'm guessing that the match ends up ordering the checks differently, and this probably explains the difference.

I'll have a closer look at the generated IR and try to confirm.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants