
Commit 105042f (parent ebe77c2)
Authored by sylvestre and cakebaker

document how to do good performance work (#7541)

* document how to do good performance work
* doc: spell, ignore "taskset"

Co-authored-by: Daniel Hofstetter <daniel.hofstetter@42dh.com>

1 file changed: docs/src/performance.md (+100, −0 lines)
<!-- spell-checker:ignore taskset -->

# Performance Profiling Tutorial

## Effective Benchmarking with Hyperfine

[Hyperfine](https://github.com/sharkdp/hyperfine) is a command-line benchmarking tool that measures and compares the execution times of commands with statistical rigor.

### Benchmarking Best Practices

When evaluating performance improvements, always set up your benchmarks to compare:

1. The GNU implementation as reference
2. The implementation without your change
3. The implementation with your change

This three-way comparison provides clear insights into:

- How your implementation compares to the standard (GNU)
- The actual performance impact of your specific change
### Example Benchmark

First, build the binary in release mode; debug builds are significantly slower:

```bash
cargo build --features unix --release
```

```bash
# Three-way comparison benchmark
hyperfine \
    --warmup 3 \
    "/usr/bin/ls -R ." \
    "./target/release/coreutils.prev ls -R ." \
    "./target/release/coreutils ls -R ."

# The same comparison can be simplified with a parameter list:
hyperfine \
    --warmup 3 \
    -L ls /usr/bin/ls,"./target/release/coreutils.prev ls","./target/release/coreutils ls" \
    "{ls} -R ."
```
To improve the reproducibility of the results, pin the benchmark to a single CPU core:

```bash
taskset -c 0 hyperfine \
    --warmup 3 \
    "/usr/bin/ls -R ." \
    "./target/release/coreutils ls -R ."
```
### Interpreting Results

Hyperfine provides summary statistics including:

- Mean execution time
- Standard deviation
- Min/max times
- Relative performance comparison

Look for consistent patterns rather than focusing on individual runs, and be aware of system noise that might affect results.
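The relative comparison hyperfine reports is simply the ratio of mean times, so you can sanity-check it yourself. A quick sketch; the two means below are hypothetical numbers for illustration, not real measurements:

```bash
# Hypothetical mean times (seconds) copied from a hyperfine summary
gnu_mean=0.512
ours_mean=0.448
# Relative performance is the ratio of the two means
awk -v a="$gnu_mean" -v b="$ours_mean" 'BEGIN { printf "%.2fx faster\n", a / b }'
```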
## Using Samply for Profiling

[Samply](https://github.com/mstange/samply) is a sampling profiler that helps you identify performance bottlenecks in your code.

### Basic Profiling

```bash
# Generate a flame graph for your application
samply record ./target/debug/coreutils ls -R

# Profile with a higher sampling frequency
samply record --rate 1000 ./target/debug/coreutils seq 1 1000
```
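Profiles of debug builds can be misleading, because optimized code behaves quite differently. One common approach, assuming a standard Cargo setup, is to profile the release build while keeping debug info so samply can still map samples to source lines; a minimal `Cargo.toml` sketch:

```toml
# Keep debug info in optimized builds so the profiler can resolve symbols
[profile.release]
debug = true
```

Alternatively, the same effect can be achieved for a single build via the `CARGO_PROFILE_RELEASE_DEBUG=true` environment variable.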
## Workflow: Measuring Performance Improvements

1. **Establish baselines**:

   ```bash
   hyperfine --warmup 3 \
       "/usr/bin/sort large_file.txt" \
       "our-sort-v1 large_file.txt"
   ```

2. **Identify bottlenecks**:

   ```bash
   samply record ./our-sort-v1 large_file.txt
   ```

3. **Make targeted improvements** based on profiling data.

4. **Verify improvements**:

   ```bash
   hyperfine --warmup 3 \
       "/usr/bin/sort large_file.txt" \
       "our-sort-v1 large_file.txt" \
       "our-sort-v2 large_file.txt"
   ```

5. **Document performance changes** with concrete numbers:

   ```bash
   hyperfine --export-markdown file.md [...]
   ```
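The `large_file.txt` used in the steps above is not provided anywhere; one way to generate a reproducible shuffled input (the file name and size here are arbitrary choices for illustration) is:

```bash
# Build a deterministic, shuffled one-million-line input for sort benchmarks
seq 1 1000000 > sorted.txt
# A fixed file as the randomness source makes the shuffle reproducible
shuf --random-source=sorted.txt sorted.txt > large_file.txt
wc -l large_file.txt
```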
