Skip to content

Conversation

@mattsu2020
Copy link
Contributor

Summary

  • add a direct libc dependency so sort can query sysconf-style memory data on Unix targets
  • auto-compute the main buffer size using both file size hints and available memory and fall back to sane defaults when either is missing
  • update chunk growth and external-sort buffers to follow the new heuristic and avoid wasteful allocations

Testing

  • cargo check --workspace

@github-actions
Copy link

GNU testsuite comparison:

Skipping an intermittent issue tests/misc/tee (passes in this run but fails in the 'main' branch)

@codspeed-hq
Copy link

codspeed-hq bot commented Oct 20, 2025

CodSpeed Performance Report

Merging #8959 will degrade performances by 3.08%

Comparing mattsu2020:sort-memory-functions (72201b2) with main (4c68e1e)

Summary

❌ 1 regression
✅ 105 untouched
⏩ 73 skipped1

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

Benchmark BASE HEAD Change
sort_long_line[160000] 1.8 ms 1.9 ms -3.08%

Footnotes

  1. 73 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

Added 'sysconf' to the CSpell jargon dictionary to include the system configuration function term, preventing false positives in spell-checking for technical code.
@sylvestre
Copy link
Contributor

i added comments in the other PR

@github-actions
Copy link

GNU testsuite comparison:

Skipping an intermittent issue tests/misc/tee (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/tail/overlay-headers (passes in this run but fails in the 'main' branch)

Moved buffer size calculation functions (e.g., automatic_buffer_size, file_size_hint) to a new buffer_hint module for better code organization and modularity. Removed the Linux-specific nix dependency as the memory hint functionality is now handled internally without external crates.
Updated the `physical_memory_bytes_unix` function to explicitly cast `pages` and `page_size` to `u128` before multiplication, ensuring safe arithmetic and preventing potential overflow. Also added "libc" dependency to `fuzz/Cargo.lock` to support the changes.
…ory_bytes_unix

Reformatted the multi-line `#[cfg(...)]` attribute and reordered imports in `physical_memory_bytes_unix()` for better code clarity and consistency.
…bytes_unix

Minor code style cleanup to improve readability and consistency in the buffer hint module.
…y_bytes

The explicit return is redundant in Rust, as the last expression in a block is implicitly returned, improving code style and adherence to idiomatic practices.
Corrected "overcommitting" to "overcommit" in the comment for accurate spelling.
@mattsu2020
Copy link
Contributor Author

https://github.com/nix-rust/nix/blob/e1e630f1b942a222f59c42cee63af16b2a925f15/src/unistd.rs#L3138C1-L3141C40
Based on the nix source code, it depends on libc since it may not be usable in some environments.

@github-actions
Copy link

GNU testsuite comparison:

Skipping an intermittent issue tests/misc/tee (passes in this run but fails in the 'main' branch)

mattsu2020 and others added 2 commits October 21, 2025 16:17
Added standard copyright and license header to the buffer_hint.rs file in the sort utility to comply with project licensing requirements and ensure proper attribution.
@github-actions
Copy link

GNU testsuite comparison:

Skipping an intermittent issue tests/misc/tee (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/tail/overlay-headers (passes in this run but fails in the 'main' branch)

@mattsu2020 mattsu2020 requested a review from sylvestre October 21, 2025 11:14
Removal of unnecessary variables

Co-authored-by: Sylvestre Ledru <sylvestre@debian.org>
#[cfg(target_os = "linux")]
{
match get_rlimit() {
Ok(limit) => limit.saturating_div(4).clamp(32, 256),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please explain what it is doing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The batch size is dynamically adjusted based on the number of file descriptors. Comments and processing have been modified for clarity.

Added a comment in the `physical_memory_bytes` function for non-Unix or Redox targets to clarify why `None` is returned, improving code readability and documenting the absence of a portable API for detecting total physical memory.
Updated cfg conditions in `physical_memory_bytes` to explicitly handle Linux and Android targets. Refined `physical_memory_bytes_unix` with improved error handling for sysconf calls, ensuring safer memory page and size calculations while removing fallback libc implementation for non-Linux/Android Unix systems.
@github-actions
Copy link

GNU testsuite comparison:

Skipping an intermittent issue tests/misc/tee (passes in this run but fails in the 'main' branch)

… batch size

Removed the libc crate dependency from Cargo.toml and Cargo.lock to reduce unnecessary dependencies. Refactored default_merge_batch_size() in sort.rs to use named constants (LINUX_BATCH_DIVISOR, LINUX_BATCH_MIN, LINUX_BATCH_MAX) for better code readability and maintainability, while preserving the same dynamic batch size calculation logic based on file descriptor limits.
Reordered the imports in the `use nix::unistd` statement to place `SysconfVar` before `sysconf`, improving code readability and adhering to alphabetical import ordering conventions.
@github-actions
Copy link

GNU testsuite comparison:

Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

@mattsu2020 mattsu2020 requested a review from sylvestre October 21, 2025 12:44
@github-actions
Copy link

GNU testsuite comparison:

Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

@github-actions
Copy link

GNU testsuite comparison:

Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

@sylvestre
Copy link
Contributor

@mattsu2020 sorry but why are you rebasing your pr very often?

@mattsu2020
Copy link
Contributor Author

mattsu2020 commented Oct 22, 2025

@mattsu2020 sorry but why are you rebasing your pr very often?

I wasn't sure if it was keeping up with the latest updates, so I ended up updating it frequently.
I'll be careful to avoid making it difficult to review.

@github-actions
Copy link

GNU testsuite comparison:

Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/misc/tee (passes in this run but fails in the 'main' branch)

@sylvestre
Copy link
Contributor

Yeah, please let the reviewer do that :) thanks

@sylvestre sylvestre merged commit 909da50 into uutils:main Oct 23, 2025
121 of 123 checks passed
@mattsu2020 mattsu2020 deleted the sort-memory-functions branch October 23, 2025 10:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants