Skip to content

Conversation

@asder8215
Copy link
Contributor

@asder8215 asder8215 commented Nov 7, 2025

Currently, in factor.rs, it opts to use num_prime crate's factors() method only. num_prime also has specific prime factorization methods for u64 and u128 targets (factorize64() and factorize128()), which seems to be suited and faster for those situations (which improves partially upon the performance issue mentioned in #1456). This PR uses those specific methods in the event that we're given an integer that fits as a u64 or u128.

In terms of performance, running seq 2 10000000 | factor using num_prime's factors() vs num_prime's factorizeu64() shows the following results from hyperfine:

With factors()

$ hyperfine '~/coding-project/coreutils/target/release/seq 2 10000000 | ~/coding-project/coreutils/target/release/factor'
Benchmark 1: ~/coding-project/coreutils/target/release/seq 2 10000000 | ~/coding-project/coreutils/target/release/factor
  Time (mean ± σ):     16.946 s ±  0.147 s    [User: 14.600 s, System: 2.385 s]
  Range (min … max):   16.816 s … 17.175 s    10 runs

With factorize64()

$ hyperfine '~/coding-project/coreutils/target/release/seq 2 10000000 | ~/coding-project/coreutils/target/release/factor'
Benchmark 1: ~/coding-project/coreutils/target/release/seq 2 10000000 | ~/coding-project/coreutils/target/release/factor
  Time (mean ± σ):     11.778 s ±  0.087 s    [User: 9.526 s, System: 2.310 s]
  Range (min … max):   11.632 s … 11.950 s    10 runs

Any feedback and suggestions on making the changes in this PR compact would be nice as well. I haven't found a way to neatly convert BigUint to u64 or u128 from reading through the num_bigint crate docs, and due to using u64/u128/BigUint types for the BTreeMap, I have three different write_result methods to handle each case.

Pertaining to comparing this with GNU's factor, I'm skeptical about the timing results mentioned on hyperfine. Without any visible outputs, hyperfine indicates that GNU's factor runs the following command in 1.2 seconds:

$ hyperfine '~/coding-project/coreutils/target/release/seq 2 10000000 | factor'
Benchmark 1: ~/coding-project/coreutils/target/release/seq 2 10000000 | factor
  Time (mean ± σ):      1.224 s ±  0.004 s    [User: 1.264 s, System: 0.056 s]
  Range (min … max):    1.217 s …  1.230 s    10 runs

However, I read through this closed issue mentioned in hyperfine's repo: sharkdp/hyperfine#377 and it seems like there are situations where GNU's commands changes its execution if it detects that stdout is /dev/null (which Hyperfine attaches Stdio::null() to stdout of the command). I'm not certain if this is the case for how GNU's factor work, but when I run GNU's factor with --show-output flag, it gives me the following timing result (which seems to be more accurate on how long GNU's factor takes):

$ hyperfine --show-output '~/coding-project/coreutils/target/release/seq 2 10000000 | factor'
Time (mean ± σ):     16.148 s ±  0.134 s    [User: 3.926 s, System: 12.290 s]
  Range (min … max):   15.967 s … 16.433 s    10 runs

Comparing GNU's factor to the changes made by this PR with --show-output flag on in hyperfine shows the following:

$ hyperfine --show-output '~/coding-project/coreutils/target/release/seq 2 10000000 | ~/coding-project/coreutils/target/release/factor'
Time (mean ± σ):     28.907 s ±  0.177 s    [User: 15.908 s, System: 13.013 s]
  Range (min … max):   28.610 s … 29.177 s    10 runs

GNU's factor is definitely still faster than uutils' factor. In the future, it might be better to experiment with other prime factorization crates or make our own efficient implementation of prime factorization in Rust for performance gain.

@github-actions
Copy link

github-actions bot commented Nov 7, 2025

GNU testsuite comparison:

Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)

@sylvestre
Copy link
Contributor

A few comments :

  • Looks great! Thanks for improving the performances!
  • hyperfine should be executed with all the arguments at once, not separated (helps to compare)
  • we should write benchmarks see src/uu/*/benches to have a baseline
  • please don't share code of the gnu implementation or look at it. The licenses are incompatible.

@asder8215
Copy link
Contributor Author

Noted! I removed the GNU link in my PR and I'll keep that in mind about showing results with hyperfine in the future.

In the meantime, I will start making benchmarking tests (and fix that cspell check it seems to be complaining about on Ubuntu).

@codspeed-hq
Copy link

codspeed-hq bot commented Nov 7, 2025

CodSpeed Performance Report

Merging #9171 will improve performances by 3.5%

Comparing asder8215:factor_optimization (7622e9f) with main (859a1ed)

Summary

⚡ 2 improvements
✅ 124 untouched
⏩ 2 skipped1

Benchmarks breakdown

Benchmark BASE HEAD Change
factor_multiple_u128s[18446744073709551616] 331.2 ms 323.7 ms +2.31%
factor_multiple_u64s[2] 186.4 ms 180.1 ms +3.5%

Footnotes

  1. 2 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@github-actions
Copy link

github-actions bot commented Nov 7, 2025

GNU testsuite comparison:

GNU test failed: tests/factor/factor. tests/factor/factor is passing on 'main'. Maybe you have to rebase?
Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/tail/overlay-headers (passes in this run but fails in the 'main' branch)

@asder8215
Copy link
Contributor Author

From the big_uint benchmarking, I noticed there were some issues with retrieving the prime factors of certain numbers like 340282366920938463463374607431768211456768211458, which outputs the "Factorization incomplete" error that's in factor.rs file. I know GNU's factor produces valid prime factors for this value:
340282366920938463463374607431768211456768211458: 2 11 11 2875070119 2895113854549549 168931247951478372779

This issue is mentioned in #1559, but with the current implementation uutils' factor should support up to $2^{127} - 1$. For now, I'll put this as a separate GitHub issue.

@github-actions
Copy link

github-actions bot commented Nov 7, 2025

GNU testsuite comparison:

Skipping an intermittent issue tests/tail/overlay-headers (passes in this run but fails in the 'main' branch)

@sylvestre
Copy link
Contributor

Could you please move the benchmarks into a separate pr so that we have a baseline? Thanks

@asder8215 asder8215 force-pushed the factor_optimization branch from 0e02bf2 to d10f756 Compare November 7, 2025 22:27
@asder8215
Copy link
Contributor Author

I have the base benchmarking code for factor in #9182. #9171 can be merged first before #9182.

@github-actions
Copy link

github-actions bot commented Nov 7, 2025

GNU testsuite comparison:

Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)

@asder8215 asder8215 force-pushed the factor_optimization branch from 9d782de to 28fafb9 Compare November 8, 2025 19:20
@sylvestre sylvestre force-pushed the factor_optimization branch 2 times, most recently from 120fc9f to 1838797 Compare November 8, 2025 19:31
…peed up the performance of calculating prime numbers for u64 and u128
@asder8215 asder8215 force-pushed the factor_optimization branch from 1838797 to 7622e9f Compare November 8, 2025 19:46
@github-actions
Copy link

github-actions bot commented Nov 8, 2025

GNU testsuite comparison:

Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)

@sylvestre
Copy link
Contributor

Merging #9171 will improve performances by 3.5%

not a huge win but still an improvement, thanks!

@sylvestre sylvestre merged commit 293e9d2 into uutils:main Nov 8, 2025
122 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants