@@ -19,7 +19,57 @@ Finally, you can compare the performance of the two versions of `seq`
 by running, for example,
 
 ```shell
-hyperfine "seq 1000000" "target/release/seq 1000000"
+hyperfine -L seq seq,target/release/seq "{seq} 1000000"
+```
+
+## Interesting test cases
+
+Performance characteristics vary considerably with the parameters and
+with whether custom formatting is required. In particular, the GNU
+implementation appears to be heavily optimized for positive integer
+outputs (probably the most common use case for `seq`).
+
+Specifying a format or a fixed width slows execution
+considerably (roughly 15-20 times slower on GNU `seq`):
+```shell
+hyperfine -L seq seq,target/release/seq "{seq} -f%g 1000000"
+hyperfine -L seq seq,target/release/seq "{seq} -w 1000000"
+```
+
+Floating-point increments, or any negative bound, also degrade
+performance (roughly 10-15 times slower on GNU `seq`):
+```shell
+hyperfine -L seq seq,./target/release/seq "{seq} 0 0.000001 1"
+hyperfine -L seq seq,./target/release/seq "{seq} -100 1 1000000"
+```
+
+## Optimizations
+
+### Buffering stdout
+
+The original `uutils` implementation of `seq` did unbuffered writes
+to stdout, causing a large number of system calls (and therefore a large amount
+of system time). Simply wrapping `stdout` in a `BufWriter` made a
+floating-point increment test case about 2 times faster, bringing performance
+in line with GNU `seq`:
+```shell
+taskset -c 0 hyperfine -L seq seq,./seq-main,target/release/seq "{seq} 0 0.1 100000"
+Benchmark 1: seq 0 0.1 100000
+  Time (mean ± σ):     161.6 ms ±   0.3 ms    [User: 160.8 ms, System: 0.6 ms]
+  Range (min … max):   161.2 ms … 162.4 ms    18 runs
+
+Benchmark 2: ./seq-main 0 0.1 100000
+  Time (mean ± σ):     282.7 ms ±   5.0 ms    [User: 221.0 ms, System: 60.0 ms]
+  Range (min … max):   279.7 ms … 296.2 ms    10 runs
+
+Benchmark 3: target/release/seq 0 0.1 100000
+  Time (mean ± σ):     143.8 ms ±   0.3 ms    [User: 143.0 ms, System: 0.6 ms]
+  Range (min … max):   143.2 ms … 144.4 ms    20 runs
+
+Summary
+  target/release/seq 0 0.1 100000 ran
+    1.12 ± 0.00 times faster than seq 0 0.1 100000
+    1.97 ± 0.03 times faster than ./seq-main 0 0.1 100000
 ```
 
 [0]: https://github.com/sharkdp/hyperfine
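
The "Buffering stdout" change described in the diff can be illustrated with a minimal sketch. This is not the actual `uutils` code: the helper name `write_ints` and the integer-only loop are hypothetical, chosen only to show the general pattern of wrapping a locked `stdout` in a `BufWriter` so that many small line writes are batched into fewer, larger writes.

```rust
use std::io::{self, BufWriter, Write};

/// Hypothetical helper (not from uutils): writes the integers
/// `first..=last`, one per line, to any writer.
fn write_ints<W: Write>(out: &mut W, first: u64, last: u64) -> io::Result<()> {
    for n in first..=last {
        writeln!(out, "{n}")?;
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let stdout = io::stdout();
    // Lock stdout once, then buffer: lines accumulate in the BufWriter's
    // in-memory buffer and are passed on in larger chunks.
    let mut out = BufWriter::new(stdout.lock());
    write_ints(&mut out, 1, 10)?;
    // Flush explicitly so that a write error is returned here instead of
    // being ignored when the BufWriter is dropped.
    out.flush()
}
```

Because `write_ints` is generic over `Write`, the same function can be pointed at a `Vec<u8>` in tests or at an unbuffered handle for comparison. The actual speedup will depend on the workload, as the measurements in the diff suggest.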