Commit 85fa12b

seq: Buffer writes to stdout
Use a BufWriter to wrap stdout: this reduces the number of system calls and improves performance drastically (2x in some cases). Also document interesting use cases, and the optimization made here, in src/uu/seq/BENCHMARKING.md.
1 parent c197a42 commit 85fa12b

2 files changed: 54 additions, 4 deletions

src/uu/seq/BENCHMARKING.md

Lines changed: 51 additions & 1 deletion
@@ -19,7 +19,57 @@ Finally, you can compare the performance of the two versions of `seq`
 by running, for example,
 
 ```shell
-hyperfine "seq 1000000" "target/release/seq 1000000"
+hyperfine -L seq seq,target/release/seq "{seq} 1000000"
+```
+
+## Interesting test cases
+
+Performance characteristics may vary a lot depending on the parameters,
+and if custom formatting is required. In particular, it does appear
+that the GNU implementation is heavily optimized for positive integer
+outputs (which is probably the most common use case for `seq`).
+
+Specifying a format or fixed width will slow down the
+execution a lot (~15-20 times on GNU `seq`):
+```shell
+hyperfine -L seq seq,target/release/seq "{seq} -f%g 1000000"
+hyperfine -L seq seq,target/release/seq "{seq} -w 1000000"
+```
+
+Floating point increments, or any negative bound, also degrades the
+performance (~10-15 times on GNU `seq`):
+```shell
+hyperfine -L seq seq,./target/release/seq "{seq} 0 0.000001 1"
+hyperfine -L seq seq,./target/release/seq "{seq} -100 1 1000000"
+```
+
+## Optimizations
+
+### Buffering stdout
+
+The original `uutils` implementation of `seq` did unbuffered writes
+to stdout, causing a large number of system calls (and therefore a large amount
+of system time). Simply wrapping `stdout` in a `BufWriter` increased performance
+by about 2 times for a floating point increment test case, leading to similar
+performance compared with GNU `seq`:
+```shell
+taskset -c 0 hyperfine -L seq seq,./seq-main,target/release/seq "{seq} 0 0.1 100000"
+Benchmark 1: seq 0 0.1 100000
+  Time (mean ± σ): 161.6 ms ± 0.3 ms [User: 160.8 ms, System: 0.6 ms]
+  Range (min … max): 161.2 ms … 162.4 ms 18 runs
+
+Benchmark 2: ./seq-main 0 0.1 100000
+  Time (mean ± σ): 282.7 ms ± 5.0 ms [User: 221.0 ms, System: 60.0 ms]
+  Range (min … max): 279.7 ms … 296.2 ms 10 runs
+
+Benchmark 3: target/release/seq 0 0.1 100000
+  Time (mean ± σ): 143.8 ms ± 0.3 ms [User: 143.0 ms, System: 0.6 ms]
+  Range (min … max): 143.2 ms … 144.4 ms 20 runs
+
+Summary
+  target/release/seq 0 0.1 100000 ran
+    1.12 ± 0.00 times faster than seq 0 0.1 100000
+    1.97 ± 0.03 times faster than ./seq-main 0 0.1 100000
 ```
 
 [0]: https://github.com/sharkdp/hyperfine
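
The claim that unbuffered writes cause a large number of system calls can be illustrated without tracing the real binary. The sketch below is not part of the commit: `CountingWriter`, `print_numbers`, `unbuffered`, and `buffered` are hypothetical names, and each call to `write` on the counting sink merely stands in for one `write(2)` call on a real stdout. It writes the same lines directly to the sink and through a `BufWriter`, then reports how many writes reached the sink in each case.

```rust
use std::io::{self, BufWriter, Write};

/// Hypothetical sink that counts how often `write` is called.
/// Each call stands in for one system call on a real stdout.
struct CountingWriter {
    calls: usize,
    bytes: usize,
}

impl Write for CountingWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.calls += 1;
        self.bytes += buf.len();
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Write the integers 1..=n, one per line, to any writer.
fn print_numbers<W: Write>(out: &mut W, n: u32) -> io::Result<()> {
    for i in 1..=n {
        writeln!(out, "{}", i)?;
    }
    out.flush()
}

/// Returns (calls, bytes) seen by the sink when it is written to directly.
fn unbuffered(n: u32) -> io::Result<(usize, usize)> {
    let mut sink = CountingWriter { calls: 0, bytes: 0 };
    print_numbers(&mut sink, n)?;
    Ok((sink.calls, sink.bytes))
}

/// Returns (calls, bytes) seen by the sink when it sits behind a `BufWriter`.
fn buffered(n: u32) -> io::Result<(usize, usize)> {
    let mut out = BufWriter::new(CountingWriter { calls: 0, bytes: 0 });
    print_numbers(&mut out, n)?;
    // `into_inner` flushes any remaining bytes and hands the sink back.
    let sink = out.into_inner()?;
    Ok((sink.calls, sink.bytes))
}

fn main() -> io::Result<()> {
    let (direct_calls, direct_bytes) = unbuffered(100_000)?;
    let (buffered_calls, buffered_bytes) = buffered(100_000)?;
    println!("unbuffered: {} calls, {} bytes", direct_calls, direct_bytes);
    println!("buffered:   {} calls, {} bytes", buffered_calls, buffered_bytes);
    Ok(())
}
```

Both variants deliver the same bytes. The unbuffered one reaches the sink at least once per line, while the buffered one only does so when the internal buffer fills up or is flushed, which is consistent with the drop in system time shown in the benchmark above.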

src/uu/seq/src/seq.rs

Lines changed: 3 additions & 3 deletions
@@ -4,7 +4,7 @@
 // file that was distributed with this source code.
 // spell-checker:ignore (ToDO) bigdecimal extendedbigdecimal numberparse hexadecimalfloat
 use std::ffi::OsString;
-use std::io::{stdout, ErrorKind, Write};
+use std::io::{stdout, BufWriter, ErrorKind, Write};
 
 use clap::{Arg, ArgAction, Command};
 use num_traits::{ToPrimitive, Zero};
@@ -262,8 +262,8 @@ fn print_seq(
     padding: usize,
     format: Option<&Format<num_format::Float>>,
 ) -> std::io::Result<()> {
-    let stdout = stdout();
-    let mut stdout = stdout.lock();
+    let stdout = stdout().lock();
+    let mut stdout = BufWriter::new(stdout);
     let (first, increment, last) = range;
     let mut value = first;
     let padding = if pad {
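
One point worth keeping in mind with this pattern: a `BufWriter` flushes its buffer when dropped, but any error from that final flush is discarded, so code that wants to report write failures (for example a closed pipe) should flush explicitly. The hunk above does not show how the rest of `print_seq` handles this. The following is only a minimal sketch of the same wrapping applied to a simplified loop; `write_range` is a hypothetical stand-in, not the actual `print_seq`, and it assumes a positive integer step.

```rust
use std::io::{stdout, BufWriter, Write};

/// Hypothetical stand-in for `print_seq`: writes `first`, `first + step`, ...
/// up to and including `last`, one value per line.
/// Assumes `step > 0`; the real `seq` handles many more cases.
fn write_range<W: Write>(out: &mut W, first: i64, step: i64, last: i64) -> std::io::Result<()> {
    let mut value = first;
    while value <= last {
        writeln!(out, "{}", value)?;
        value += step;
    }
    // Flush explicitly so a failed final write is returned to the caller
    // instead of being silently dropped when the writer goes out of scope.
    out.flush()
}

fn main() -> std::io::Result<()> {
    // Same shape as the change above: lock stdout once, then buffer it.
    let mut out = BufWriter::new(stdout().lock());
    write_range(&mut out, 1, 1, 10)
}
```

Keeping the function generic over `Write` is a design choice for this sketch only: it lets the same loop write to a `Vec<u8>` in tests and to buffered stdout in `main`.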
