Extended benchmark suite #893

sharkdp · 2021-11-26T20:35:02Z

As discussed in #885, the current benchmark suite has a few shortcomings.

The most obvious one is that there is no standardized dataset. Past ideas involved using large Git repositories (Linux, Chromium, Rust compiler), but these repositories change over time. We cannot simply check out a certain state because the .git folder will still grow. It's also painfully slow to clone these large repositories. A better idea might be to create a benchmark folder programmatically with dummy content. This is not a trivial task though, because the folder content should be somewhat realistic in terms of statistical properties (files per folder, average depth of subtrees, etc.). Ideally, it should also reflect the state of a typical home folder that has been used for years (I'm thinking of file system fragmentation, without knowing how much of an issue that would be).
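To make the "create a benchmark folder programmatically" idea concrete, here is a minimal Python sketch. It is not part of fd's benchmark scripts: the function name `make_tree`, its parameters, and the uniform distributions are all assumptions chosen for illustration. A realistic dataset would need distributions fitted to real file systems, and this does nothing about fragmentation. The only property it demonstrates is reproducibility through a fixed seed.

```python
import os
import random


def make_tree(root, seed=0, max_depth=4, max_dirs=3, max_files=5):
    """Create a reproducible dummy directory tree under `root`.

    Returns (number_of_directories, number_of_files) created below `root`.
    All parameters are illustrative; uniform distributions are NOT realistic.
    """
    rng = random.Random(seed)  # fixed seed -> identical layout on every run
    n_dirs = n_files = 0
    os.makedirs(root, exist_ok=True)
    stack = [(root, 0)]
    while stack:
        path, depth = stack.pop()
        # Empty files are enough to exercise directory traversal.
        for i in range(rng.randint(0, max_files)):
            open(os.path.join(path, f"file_{i}.txt"), "w").close()
            n_files += 1
        if depth < max_depth:
            for i in range(rng.randint(1, max_dirs)):
                sub = os.path.join(path, f"dir_{i}")
                os.makedirs(sub, exist_ok=True)
                n_dirs += 1
                stack.append((sub, depth + 1))
    return n_dirs, n_files
```

Calling `make_tree("/tmp/fd-bench", seed=42)` twice on different machines should produce the same names and nesting, which is the property a standardized dataset needs.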

Second, we have some benchmarks that mainly measure the output speed of fd. These benchmarks currently write to /dev/null. They should probably be complemented by benchmarks that write to another program via a pipe, and maybe to a file.
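The three output sinks mentioned here (/dev/null, a pipe, a file) can be compared with a few lines of Python. This is only a sketch under stated assumptions: `time_with_sink` is a hypothetical helper, the parent process stands in for "another program" on the reading end of the pipe, and a single timed run is not a benchmark (hyperfine adds warmup runs and statistics on top of this).

```python
import subprocess
import tempfile
import time


def time_with_sink(cmd, sink):
    """Run `cmd` once and return wall-clock seconds with stdout sent to `sink`.

    `sink` is one of "null", "pipe" or "file" (names chosen for this sketch).
    """
    start = time.perf_counter()
    if sink == "null":
        subprocess.run(cmd, stdout=subprocess.DEVNULL, check=True)
    elif sink == "pipe":
        # The parent reads everything the child writes to the pipe.
        subprocess.run(cmd, stdout=subprocess.PIPE, check=True)
    elif sink == "file":
        with tempfile.TemporaryFile() as f:
            subprocess.run(cmd, stdout=f, check=True)
    else:
        raise ValueError(f"unknown sink: {sink!r}")
    return time.perf_counter() - start
```

For example, `time_with_sink(["fd"], "pipe")` would time one fd run in the current directory with its output drained through a pipe.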

Third, we should add benchmarks that actually write to a TTY. We can do this with hyperfine's --show-output option (this will spoil the output of our regression.sh script). Note that the results will then depend heavily on the speed of the terminal emulator (for searches with a large number of results).

Fourth, we should maybe move the benchmark scripts into this repository?


tavianator · 2021-11-26T20:42:10Z

I don't think writing to the current TTY is ideal; it will depend too much on the performance of your particular terminal emulator. We could create a pseudo-terminal with unbuffer or tmux or something, which might give you a useful upper bound for the interactive performance.
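As an illustration of the pseudo-terminal idea (using Python's standard `pty` module rather than unbuffer or tmux, which is a substitution made for this sketch), the child below sees a TTY on stdout while the parent simply drains the other end. Nothing is rendered, which is why the timing would be an upper bound on interactive performance. `time_on_pty` is a hypothetical helper and works on Unix only.

```python
import os
import pty
import subprocess
import time


def time_on_pty(cmd):
    """Run `cmd` with stdout attached to a fresh pseudo-terminal.

    Returns (elapsed_seconds, output_bytes). Unix only.
    """
    master_fd, slave_fd = pty.openpty()
    start = time.perf_counter()
    proc = subprocess.Popen(cmd, stdout=slave_fd)
    os.close(slave_fd)  # the parent keeps only the master side
    chunks = []
    while True:
        try:
            chunk = os.read(master_fd, 65536)
        except OSError:
            break  # Linux raises EIO once the child side is closed
        if not chunk:
            break  # other platforms signal end of output with an empty read
        chunks.append(chunk)
    os.close(master_fd)
    proc.wait()
    return time.perf_counter() - start, b"".join(chunks)
```

Because the child's stdout is a terminal, programs that change behavior on a TTY (colors, buffering) take the same code path as in interactive use.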

I think it would be nice if these output types were hyperfine features. E.g. --output={null,pipe,tty,file}

sharkdp · 2021-11-26T20:47:24Z

> We could create a pseudo-terminal with unbuffer or tmux or something, which might give you a useful upper bound for the interactive performance.

👍

> I think it would be nice if these output types were hyperfine features. E.g. --output={null,pipe,tty,file}

Interesting idea! That would be a great extension of sharkdp/hyperfine#377, which essentially asks for --output=pipe, if I understand you correctly.

tavianator · 2021-11-26T20:53:32Z

Yeah, that feature request essentially asks for --output=pipe.

BattleCh1cken · 2022-10-11T12:59:45Z

Can I work on this?

tmccombs · 2022-10-11T18:19:22Z

Sure

BattleCh1cken · 2022-10-13T14:50:20Z

Should the benchmarks be a GitHub workflow?

tavianator · 2022-10-13T20:51:47Z

I'm not sure the GitHub runners are consistent enough for reliable benchmarking.

tavianator · 2023-09-29T19:57:27Z

> (As discussed in #885,) the current benchmark suite has a few shortcomings.
>
> The most obvious one is that there is no standardized dataset. Past ideas involved using large Git repositories (Linux, Chromium, Rust compiler), but these repositories change over time. We can not simply check out a certain state because the .git folder will still grow. It's also painfully slow to clone these large repositories.

I wrote a script for benchmarking bfs that might solve these issues: https://github.com/tavianator/bfs/blob/benchmarks/bench/clone-tree.sh. It checks out a specific tag/commit so the dataset should be reproducible. And it uses --filter=blob:none to avoid downloading any file contents so it's pretty fast.

sharkdp added the packaging/tooling, good first issue, and performance labels on Nov 26, 2021

sharkdp mentioned this issue Mar 4, 2022: Switch from std::sync::mpsc to flume #942 (Closed)

tavianator mentioned this issue May 16, 2022: Implement --output={null,pipe,<FILE>} sharkdp/hyperfine#509 (Merged)

tavianator mentioned this issue Oct 13, 2023: Benchmark is unfair. find(1) should not be used for grepping. #1395 (Open)
