sort: immediately compare whole lines if they parse as numbers #7567
Could you please run hyperfine with all the commands at once? And what do clang/gcc have to do with this? :)
GNU testsuite comparison: |
Well done! Could you please add your benchmark to https://github.com/uutils/coreutils/blob/main/src/uu/sort/BENCHMARKING.md?
A simple or |
thanks :)
Please update the .md and we are good!
I'm sorry, but I'm not sure what you're referring to.
What happens with integers that cannot be represented exactly as f64? For example, 123456789012345678 and 123456789012345679 parse to the same f64 value.
if let Some(cmp) = a_f64.partial_cmp(b_f64) {
    // don't trust `Ordering::Equal` if lines are not fully equal
    if cmp != Ordering::Equal || a.line == b.line {
        return if global_settings.reverse {
            cmp.reverse()
        } else {
            cmp
        };
    }
}
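To make the precision concern concrete, here is a minimal standalone sketch of the guard above. The `Line` struct and `fast_numeric_cmp` function are hypothetical stand-ins written for this illustration, not the actual uutils types; only the "don't trust `Ordering::Equal`" check mirrors the snippet. Near 1.2e17, adjacent f64 values are 16 apart, so the two integers from the question collapse to the same float and the fast path has to decline.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for a sort line with a cached whole-line parse.
struct Line<'a> {
    line: &'a str,
    parsed: Option<f64>,
}

impl<'a> Line<'a> {
    fn new(line: &'a str) -> Self {
        Line {
            line,
            parsed: line.trim().parse::<f64>().ok(),
        }
    }
}

/// Returns `Some(ordering)` when the f64 fast path can be trusted,
/// and `None` when the caller should fall back to a slower exact comparison.
fn fast_numeric_cmp(a: &Line, b: &Line, reverse: bool) -> Option<Ordering> {
    let (a_f64, b_f64) = (a.parsed?, b.parsed?);
    let cmp = a_f64.partial_cmp(&b_f64)?;
    // Equal floats only prove equality if the original text is identical.
    if cmp != Ordering::Equal || a.line == b.line {
        Some(if reverse { cmp.reverse() } else { cmp })
    } else {
        None
    }
}

fn main() {
    let a = Line::new("123456789012345678");
    let b = Line::new("123456789012345679");
    // Both strings round to the same f64, so the fast path declines.
    assert_eq!(a.parsed, b.parsed);
    assert_eq!(fast_numeric_cmp(&a, &b, false), None);

    // Distinct floats are ordered directly, honoring the reverse flag.
    let c = Line::new("2");
    let d = Line::new("10");
    assert_eq!(fast_numeric_cmp(&c, &d, false), Some(Ordering::Less));
    assert_eq!(fast_numeric_cmp(&c, &d, true), Some(Ordering::Greater));
    println!("ok");
}
```

Returning `None` here is only one way to signal "fall through"; the snippet above does the same thing by not returning early.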
@MoSal could you please add your benchmark to https://github.com/uutils/coreutils/blob/main/src/uu/sort/BENCHMARKING.md :)
Numeric sort can be relatively slow on inputs that are wholly or
mostly numbers. This becomes clearer when compared with the speed of
GeneralNumeric.
This change parses whole lines as f64 and stores that info in
`LineData`. This is faster than doing the parsing two lines at
a time in `compare_by()`.
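
The idea of parsing once per line instead of once per comparison can be sketched roughly as follows. This is an illustration only: `ParsedLine`, `preparse`, `compare`, and `sort_numeric` are hypothetical names, and the ordering rules (unparseable lines sorted last, text used as a tie-breaker) are simplifying assumptions that do not reproduce the real `-n` semantics.

```rust
use std::cmp::Ordering;

/// Hypothetical per-line record; loosely inspired by the `LineData` idea.
struct ParsedLine<'a> {
    text: &'a str,
    /// Cached whole-line parse; `None` if the line is not a (non-NaN) number.
    num: Option<f64>,
}

/// Parse every line exactly once, up front (n parses for n lines).
fn preparse(input: &str) -> Vec<ParsedLine<'_>> {
    input
        .lines()
        .map(|text| ParsedLine {
            text,
            num: text.trim().parse::<f64>().ok().filter(|v| !v.is_nan()),
        })
        .collect()
}

/// The comparator only reads cached values, so the O(n log n) comparisons
/// done by the sort never re-parse a line.
fn compare(a: &ParsedLine, b: &ParsedLine) -> Ordering {
    match (a.num, b.num) {
        (Some(x), Some(y)) => x
            .partial_cmp(&y)
            .unwrap_or(Ordering::Equal)
            // Equal floats are not trusted: break ties on the original text.
            .then_with(|| a.text.cmp(b.text)),
        // Assumption for this sketch: unparseable lines sort after numbers.
        (Some(_), None) => Ordering::Less,
        (None, Some(_)) => Ordering::Greater,
        (None, None) => a.text.cmp(b.text),
    }
}

fn sort_numeric(input: &str) -> Vec<&str> {
    let mut lines = preparse(input);
    lines.sort_by(|a, b| compare(a, b));
    lines.into_iter().map(|l| l.text).collect()
}

fn main() {
    let sorted = sort_numeric("10\n9\n2.5\n-1\n");
    assert_eq!(sorted, vec!["-1", "2.5", "9", "10"]);
    println!("{sorted:?}");
}
```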
# Benchmarks
`shuf -i 1-1000000 -n 1000000 > /tmp/shuffled.txt`
% hyperfine --warmup 3 \
    '/tmp/gnu-sort -n /tmp/shuffled.txt' \
    '/tmp/before_coreutils sort -n /tmp/shuffled.txt' \
    '/tmp/after_coreutils sort -n /tmp/shuffled.txt'
Benchmark 1: /tmp/gnu-sort -n /tmp/shuffled.txt
Time (mean ± σ): 198.2 ms ± 5.8 ms [User: 884.6 ms, System: 22.0 ms]
Range (min … max): 187.3 ms … 207.4 ms 15 runs
Benchmark 2: /tmp/before_coreutils sort -n /tmp/shuffled.txt
Time (mean ± σ): 361.3 ms ± 8.7 ms [User: 1898.7 ms, System: 18.9 ms]
Range (min … max): 350.4 ms … 375.3 ms 10 runs
Benchmark 3: /tmp/after_coreutils sort -n /tmp/shuffled.txt
Time (mean ± σ): 175.1 ms ± 6.7 ms [User: 536.8 ms, System: 21.6 ms]
Range (min … max): 169.3 ms … 197.0 ms 16 runs
Summary
/tmp/after_coreutils sort -n /tmp/shuffled.txt ran
1.13 ± 0.05 times faster than /tmp/gnu-sort -n /tmp/shuffled.txt
2.06 ± 0.09 times faster than /tmp/before_coreutils sort -n /tmp/shuffled.txt
Signed-off-by: Mohammad AlSaleh <CE.Mohammad.AlSaleh@gmail.com>
Done. Also shortened the original commit message, replacing all the redundant benchmarks with the single hyperfine run.
Thank you for your contribution!
Numeric sort can be relatively slow on inputs that are wholly or
mostly numbers. This becomes clearer when compared with the speed of
GeneralNumeric.
This change parses whole lines as f64 and stores that info in
LineData. This is faster than doing the parsing two lines at a time in
compare_by().
Benchmarks
Before
- default_release
- codegen-units=1 -C target-cpu=native
- codegen-units=1 -C target-cpu=native,global_allocator=mimalloc
- -C target-cpu=native,global_allocator=snmalloc

After
- default_release
- codegen-units=1 -C target-cpu=native
- codegen-units=1 -C target-cpu=native,global_allocator=mimalloc
- codegen-units=1 -C target-cpu=native,global_allocator=snmalloc

GNU
- gcc -march=x86-64 -mtune=generic -O2 ... (Arch package)
- clang -march=native -O3 -pipe -fstack-protector-strong -fno-plt
- gcc -march=native -O3 -pipe -fstack-protector-strong -fno-plt

sort -g
Numbers for comparison

GNU
- gcc -march=x86-64 -mtune=generic -O2 ... (Arch package)
- clang -march=native -O3 -pipe -fstack-protector-strong -fno-plt
- gcc -march=native -O3 -pipe -fstack-protector-strong -fno-plt

uutils
- default_release
- codegen-units=1 -C target-cpu=native
- codegen-units=1 -C target-cpu=native,global_allocator=mimalloc
- codegen-units=1 -C target-cpu=native,global_allocator=snmalloc