factor: use num_prime crate's u64 and u128 factorization methods to speed up the performance of calculating prime numbers for u64 and u128 #9171

asder8215 · 2025-11-07T08:50:35Z

Currently, in factor.rs, it opts to use num_prime crate's factors() method only. num_prime also has specific prime factorization methods for u64 and u128 targets (factorize64() and factorize128()), which seems to be suited and faster for those situations (which improves partially upon the performance issue mentioned in #1456). This PR uses those specific methods in the event that we're given an integer that fits as a u64 or u128.

In terms of performance, running seq 2 10000000 | factor using num_prime's factors() vs num_prime's factorizeu64() shows the following results from hyperfine:

With factors()

$ hyperfine '~/coding-project/coreutils/target/release/seq 2 10000000 | ~/coding-project/coreutils/target/release/factor'
Benchmark 1: ~/coding-project/coreutils/target/release/seq 2 10000000 | ~/coding-project/coreutils/target/release/factor
  Time (mean ± σ):     16.946 s ±  0.147 s    [User: 14.600 s, System: 2.385 s]
  Range (min … max):   16.816 s … 17.175 s    10 runs

With factorize64()

$ hyperfine '~/coding-project/coreutils/target/release/seq 2 10000000 | ~/coding-project/coreutils/target/release/factor'
Benchmark 1: ~/coding-project/coreutils/target/release/seq 2 10000000 | ~/coding-project/coreutils/target/release/factor
  Time (mean ± σ):     11.778 s ±  0.087 s    [User: 9.526 s, System: 2.310 s]
  Range (min … max):   11.632 s … 11.950 s    10 runs

Any feedback and suggestions on making the changes in this PR compact would be nice as well. I haven't found a way to neatly convert BigUint to u64 or u128 from reading through the num_bigint crate docs, and due to using u64/u128/BigUint types for the BTreeMap, I have three different write_result methods to handle each case.

Pertaining to comparing this with GNU's factor, I'm skeptical about the timing results mentioned on hyperfine. Without any visible outputs, hyperfine indicates that GNU's factor runs the following command in 1.2 seconds:

$ hyperfine '~/coding-project/coreutils/target/release/seq 2 10000000 | factor'
Benchmark 1: ~/coding-project/coreutils/target/release/seq 2 10000000 | factor
  Time (mean ± σ):      1.224 s ±  0.004 s    [User: 1.264 s, System: 0.056 s]
  Range (min … max):    1.217 s …  1.230 s    10 runs

However, I read through this closed issue mentioned in hyperfine's repo: sharkdp/hyperfine#377 and it seems like there are situations where GNU's commands changes its execution if it detects that stdout is /dev/null (which Hyperfine attaches Stdio::null() to stdout of the command). I'm not certain if this is the case for how GNU's factor work, but when I run GNU's factor with --show-output flag, it gives me the following timing result (which seems to be more accurate on how long GNU's factor takes):

$ hyperfine --show-output '~/coding-project/coreutils/target/release/seq 2 10000000 | factor'
Time (mean ± σ):     16.148 s ±  0.134 s    [User: 3.926 s, System: 12.290 s]
  Range (min … max):   15.967 s … 16.433 s    10 runs

Comparing GNU's factor to the changes made by this PR with --show-output flag on in hyperfine shows the following:

$ hyperfine --show-output '~/coding-project/coreutils/target/release/seq 2 10000000 | ~/coding-project/coreutils/target/release/factor'
Time (mean ± σ):     28.907 s ±  0.177 s    [User: 15.908 s, System: 13.013 s]
  Range (min … max):   28.610 s … 29.177 s    10 runs

GNU's factor is definitely still faster than uutils' factor. In the future, it might be better to experiment with other prime factorization crates or make our own efficient implementation of prime factorization in Rust for performance gain.

github-actions · 2025-11-07T09:10:48Z

GNU testsuite comparison:

Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)

sylvestre · 2025-11-07T09:16:27Z

A few comments :

Looks great! Thanks for improving the performances!
hyperfine should be executed with all the arguments at once, not separated (helps to compare)
we should write benchmarks see src/uu/*/benches to have a baseline
please don't share code of the gnu implementation or look at it. The licenses are incompatible.

asder8215 · 2025-11-07T14:04:32Z

Noted! I removed the GNU link in my PR and I'll keep that in mind about showing results with hyperfine in the future.

In the meantime, I will start making benchmarking tests (and fix that cspell check it seems to be complaining about on Ubuntu).

codspeed-hq · 2025-11-07T16:08:09Z

CodSpeed Performance Report

Merging #9171 will improve performances by 3.5%

_{Comparing asder8215:factor_optimization (7622e9f) with main (859a1ed)}

Summary

⚡ 2 improvements
✅ 124 untouched
⏩ 2 skipped¹

Benchmarks breakdown

	Benchmark	`BASE`	`HEAD`	Change
⚡	`factor_multiple_u128s[18446744073709551616]`	331.2 ms	323.7 ms	+2.31%
⚡	`factor_multiple_u64s[2]`	186.4 ms	180.1 ms	+3.5%

2 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports. ↩

github-actions · 2025-11-07T16:15:15Z

GNU testsuite comparison:

GNU test failed: tests/factor/factor. tests/factor/factor is passing on 'main'. Maybe you have to rebase?
Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/tail/overlay-headers (passes in this run but fails in the 'main' branch)

asder8215 · 2025-11-07T16:56:06Z

From the big_uint benchmarking, I noticed there were some issues with retrieving the prime factors of certain numbers like 340282366920938463463374607431768211456768211458, which outputs the "Factorization incomplete" error that's in factor.rs file. I know GNU's factor produces valid prime factors for this value:
340282366920938463463374607431768211456768211458: 2 11 11 2875070119 2895113854549549 168931247951478372779

This issue is mentioned in #1559, but with the current implementation uutils' factor should support up to $2^{127} - 1$. For now, I'll put this as a separate GitHub issue.

github-actions · 2025-11-07T17:11:11Z

GNU testsuite comparison:

Skipping an intermittent issue tests/tail/overlay-headers (passes in this run but fails in the 'main' branch)

sylvestre · 2025-11-07T17:53:30Z

Could you please move the benchmarks into a separate pr so that we have a baseline? Thanks

asder8215 · 2025-11-07T22:45:22Z

I have the base benchmarking code for factor in #9182. #9171 can be merged first before #9182.

github-actions · 2025-11-07T23:09:03Z

GNU testsuite comparison:

Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)

…peed up the performance of calculating prime numbers for u64 and u128

github-actions · 2025-11-08T20:14:59Z

GNU testsuite comparison:

Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)

sylvestre · 2025-11-08T22:26:10Z

Merging #9171 will improve performances by 3.5%

not a huge win but still an improvement, thanks!

asder8215 force-pushed the factor_optimization branch from 0e02bf2 to d10f756 Compare November 7, 2025 22:27

asder8215 force-pushed the factor_optimization branch from 9d782de to 28fafb9 Compare November 8, 2025 19:20

sylvestre force-pushed the factor_optimization branch 2 times, most recently from 120fc9f to 1838797 Compare November 8, 2025 19:31

factor: use num_prime crate's u64 and u128 factorization methods to s…

7622e9f

…peed up the performance of calculating prime numbers for u64 and u128

asder8215 force-pushed the factor_optimization branch from 1838797 to 7622e9f Compare November 8, 2025 19:46

sylvestre merged commit 293e9d2 into uutils:main Nov 8, 2025
122 checks passed

asder8215 mentioned this pull request Nov 9, 2025

Factor: base benchmarking for single/multiple u64, u128, and >u128 #9182

Merged

BrewTestBot mentioned this pull request Nov 10, 2025

uutils-coreutils 0.4.0 Homebrew/homebrew-core#253755

Merged

moonfruit mentioned this pull request Nov 10, 2025

uutils-selected 0.4.0 moonfruit/homebrew-tap#359

Closed

Uh oh!

factor: use num_prime crate's u64 and u128 factorization methods to speed up the performance of calculating prime numbers for u64 and u128 #9171

factor: use num_prime crate's u64 and u128 factorization methods to speed up the performance of calculating prime numbers for u64 and u128 #9171

Uh oh!

Conversation

asder8215 commented Nov 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions bot commented Nov 7, 2025

Uh oh!

sylvestre commented Nov 7, 2025

Uh oh!

asder8215 commented Nov 7, 2025

Uh oh!

codspeed-hq bot commented Nov 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

CodSpeed Performance Report

Merging #9171 will improve performances by 3.5%

Summary

Benchmarks breakdown

Footnotes

Uh oh!

github-actions bot commented Nov 7, 2025

Uh oh!

asder8215 commented Nov 7, 2025

Uh oh!

github-actions bot commented Nov 7, 2025

Uh oh!

sylvestre commented Nov 7, 2025

Uh oh!

asder8215 commented Nov 7, 2025

Uh oh!

github-actions bot commented Nov 7, 2025

Uh oh!

github-actions bot commented Nov 8, 2025

Uh oh!

sylvestre commented Nov 8, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

asder8215 commented Nov 7, 2025 •

edited

Loading

codspeed-hq bot commented Nov 7, 2025 •

edited

Loading