Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ls: reduce write syscalls & cleanup #2115

Merged
merged 3 commits into from
Apr 25, 2021

Conversation

tertsdiepraam
Copy link
Member

Instead of writing directly to stdout, with print! and println!, we can write to a BufWriter to reduce write syscalls. Doing so leads to the following improvements:

  • GNU is now 4.06 times faster instead of 4.31.
  • Our time spent on syscalls is now on par with GNU instead of 60% longer.
  • Number of write calls goes from 344064 to 1090 when running ls -R on the Firefox source code.

The full benchmarks are in the details tag below. Because print! doesn't return a Result and write! does, I simply discarded the return values of write!.

Because this was a simple change I figured I could sneak in a few other changes to clean up the code some more. I hope that's okay. These changes are:

  • Merging some imports
  • Use Entry API with the cache HashMaps.
  • Remove some redundant code from the indicator_style argument parsing.
  • Make display_file_type return a char instead of String.
Full Benchmarks

Before

❯ hyperfine --warmup 2 'ls -R' '../coreutils/target/release/coreutils ls -R'
Benchmark #1: ls -R
  Time (mean ± σ):     537.6 ms ±   6.9 ms    [User: 379.8 ms, System: 156.0 ms]
  Range (min … max):   527.6 ms … 553.5 ms    10 runs

Benchmark #2: ../coreutils/target/release/coreutils ls -R
Time (mean ± σ): 2.315 s ± 0.026 s [User: 2.089 s, System: 0.224 s]
Range (min … max): 2.287 s … 2.372 s 10 runs

Summary
'ls -R' ran
4.31 ± 0.07 times faster than '../coreutils/target/release/coreutils ls -R'

❯ strace -c ../coreutils/target/release/coreutils ls -R > /dev/null 
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ------------------
 72,26    6,736737          19    344064           write
 11,74    1,094781          29     37436           getdents64
  5,48    0,510923          27     18697           openat
  5,31    0,494837          26     18697           close
  5,14    0,479311          25     18697           newfstatat

After

❯ hyperfine --warmup 2 'ls -R' '../coreutils/target/release/coreutils ls -R'
Benchmark #1: ls -R
  Time (mean ± σ):     533.0 ms ±   3.2 ms    [User: 381.2 ms, System: 150.0 ms]
  Range (min … max):   528.2 ms … 540.5 ms    10 runs

Benchmark #2: ../coreutils/target/release/coreutils ls -R
Time (mean ± σ): 2.165 s ± 0.015 s [User: 1.997 s, System: 0.165 s]
Range (min … max): 2.144 s … 2.195 s 10 runs

Summary
'ls -R' ran
4.06 ± 0.04 times faster than '../coreutils/target/release/coreutils ls -R'

❯ strace -c ../coreutils/target/release/coreutils ls -R > /dev/null
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ------------------
 41,39    0,510284          13     37436           getdents64
 19,68    0,242657          12     18697           openat
 19,38    0,238876          12     18697           close
 18,60    0,229333          12     18697           newfstatat
  0,51    0,006314           5      1090           write

@sylvestre
Copy link
Contributor

well done :)

@sylvestre sylvestre merged commit e667cc2 into uutils:master Apr 25, 2021
@@ -1092,6 +1082,8 @@ fn list(locs: Vec<String>, config: Config) -> i32 {
let mut dirs = Vec::<PathData>::new();
let mut has_failed = false;

let mut out = BufWriter::new(stdout());
Copy link
Contributor

@siebenHeaven siebenHeaven Apr 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see if we can use something like in #2111 where we only buffer if stdout is not interactive (instead of buffering always).
Always buffering will slightly impact user experience when looking at output on a terminal as nothing will be printed until buffer is flushed.

Might I also suggest keeping this as a global because it might unnecessarily complicate function signatures (although globals are generally frowned upon, this seems like valid usecase to me - please feel free to ignore if you disagree :) )

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I figured we could just always buffer because the output of ls should always be generated fairly quickly. At least when I tried there seemed to be no real visible difference. The only time this might be an issue is with an --ignore flag that hides a lot of files? One thing I didn't like about that PR was that it needed Box<dyn Write>. Maybe it'd be nice to create a StdoutWriter type in uucore that automatically buffers if's not interactive?

I chose not to make this a global, because write! requires a mutable reference, so it would needs in a Mutex if we use lazy_static, which also complicates the code. Anyway, if you can find a nicer way to implement this, feel free to do so!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, I also noticed rust-lang/rust#60673 - seems like stdout buffering is already in the works.
As for the global related comment - yeah seems like the complicated lazy_static + mutex would be the way to implement it, I'll check if I can find a better way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That issue is very interesting, I hope some solution lands soon! Thanks for checking out the global variable!

@siebenHeaven
Copy link
Contributor

siebenHeaven commented Apr 25, 2021

looks good :)
Also thanks for cleaning up some things related to the gid'/uid caching. (I was seeing if we needed to get this into uucore - I do not see any usecases elsewhere that require caching at the moment).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants