-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
more: constant memory initialization overhead #7765
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
442ddb7 to
7649214
Compare
|
waiting for #7680 to be merged |
|
GNU testsuite comparison: |
|
GNU testsuite comparison: |
4f212a6 to
a981d4c
Compare
|
GNU testsuite comparison: |
| const USAGE: &str = help_usage!("more.md"); | ||
| const BELL: &str = "\x07"; | ||
| const BELL: char = '\x07'; | ||
| const MULTI_FILE_TOP_PROMPT: &str = "\r::::::::::::::\n\r{}\n\r::::::::::::::\n"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as a bit cryptic, could you please add a comment for this? thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do @sylvestre. Please note that this PR is still WIP and not ready for full review. I'm awaiting my previous PR to be merged. Once that is done, I can better compare changes to main.
I will also do a writeup on the changes made, benchmarking, and add comments in the code.
|
GNU testsuite comparison: |
|
@tertsdiepraam @sylvestre here is how I plan to refactor
Would be great to hear your feedback! |
61ddf17 to
148d6a2
Compare
|
GNU testsuite comparison: |
148d6a2 to
310d807
Compare
|
Benchmark < /dev/urandom base64 | fold -w 120 | head -c 100M > bigfile
< /dev/urandom base64 | fold -w 120 | head -c 1M > smallfile
cargo build --no-default-features --features more --release
# file
/usr/bin/time ./target/release/coreutils more bigfile
/usr/bin/time ./target/release/coreutils more smallfile
# pipe
/usr/bin/time bash -c 'cat bigfile | ./target/release/coreutils more'
/usr/bin/time bash -c 'cat smallfile | ./target/release/coreutils more'
# in `main`
# bigfile: 0.37user 0.11system 0:01.81elapsed 26%CPU (0avgtext+0avgdata 126796maxresident)k 0inputs+0outputs (0major+31065minor)pagefaults 0swaps
# smallfile: 0.00user 0.00system 0:01.21elapsed 0%CPU (0avgtext+0avgdata 4540maxresident)k 0inputs+0outputs (0major+452minor)pagefaults 0swaps
# bigfile(pipe): 0.37user 0.15system 0:01.97elapsed 26%CPU (0avgtext+0avgdata 126832maxresident)k 0inputs+0outputs (0major+31430minor)pagefaults 0swaps
# smallfile(pipe): 0.00user 0.01system 0:01.63elapsed 1%CPU (0avgtext+0avgdata 4344maxresident)k 0inputs+0outputs (0major+812minor)pagefaults 0swaps
# in `more-mem-constant`
# bigfile: 0.00user 0.00system 0:01.04elapsed 0%CPU (0avgtext+0avgdata 3084maxresident)k 0inputs+0outputs (0major+131minor)pagefaults 0swaps
# smallfile: 0.00user 0.00system 0:01.05elapsed 0%CPU (0avgtext+0avgdata 3224maxresident)k 0inputs+0outputs (0major+131minor)pagefaults 0swaps
# bigfile(pipe): 0.00user 0.00system 0:01.15elapsed 0%CPU (0avgtext+0avgdata 3528maxresident)k 0inputs+0outputs (0major+488minor)pagefaults 0swaps
# smallfile(pipe): 0.00user 0.00system 0:01.13elapsed 0%CPU (0avgtext+0avgdata 3592maxresident)k 0inputs+0outputs (0major+491minor)pagefaults 0swaps
# system
# bigfile: 0.00user 0.00system 0:00.95elapsed 0%CPU (0avgtext+0avgdata 2508maxresident)k 0inputs+0outputs (0major+136minor)pagefaults 0swaps
# smallfile: 0.00user 0.00system 0:01.15elapsed 0%CPU (0avgtext+0avgdata 2508maxresident)k 0inputs+0outputs (0major+136minor)pagefaults 0swaps
# bigfile(pipe): 0.00user 0.00system 0:01.11elapsed 0%CPU (0avgtext+0avgdata 3584maxresident)k 0inputs+0outputs (0major+494minor)pagefaults 0swaps
# smallfile(pipe): 0.00user 0.00system 0:01.10elapsed 0%CPU (0avgtext+0avgdata 3524maxresident)k 0inputs+0outputs (0major+492minor)pagefaults 0swapsGreatly reduced time to first paint and memory usage (41x) for large files. Verified that the overhead is constant with respect to input size. |
|
GNU testsuite comparison: |
310d807 to
686f4b3
Compare
|
GNU testsuite comparison: |
|
GNU testsuite comparison: |
347c086 to
022ed0f
Compare
|
GNU testsuite comparison: |
022ed0f to
f138ac5
Compare
| pub fn uumain(args: impl uucore::Args) -> UResult<()> { | ||
| let _guard = TerminalGuard; | ||
|
|
||
| // Disable raw mode before exiting if a panic occurs | ||
| set_hook(Box::new(|panic_info| { | ||
| terminal::disable_raw_mode().unwrap(); | ||
| print!("\r"); | ||
| println!("{panic_info}"); | ||
| })); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic for modifying the terminal has been moved from uumain to more.
|
GNU testsuite comparison: |
|
@tertsdiepraam could you please review this PR? |
|
GNU testsuite comparison: |
|
Also, can you please squash the multiple commits in just one or two implementation steps ? |
Could the PR be squashed into one commit during merge? |
dd0da28 to
81ca89a
Compare
|
GNU testsuite comparison: |
81ca89a to
3c91ece
Compare
|
GNU testsuite comparison: |
3c91ece to
37a94e1
Compare
|
GNU testsuite comparison: |
|
I am facing some issues with running tests locally even though they pass in the test runner. Please give me some more time to fix it. |
37a94e1 to
bfcfb04
Compare
|
GNU testsuite comparison: |
a9d4a8f to
d4bc6b3
Compare
|
I have "fixed" the tests by migrating them to |
| enum OutputType { | ||
| Tty(Stdout), | ||
| Pipe(Box<dyn Write>), | ||
| #[cfg(test)] | ||
| Test(Vec<u8>), | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This enum is used to distinguish between stdout types and provide a way for tests to verify output.
|
GNU testsuite comparison: |
Do you have the results of re-running the benchmark with your latest changes? |
yes, I've updated it in the previous comment here: #7765 (comment) |
|
I've just finished reviewing. I did not see any code smell, it's very readable imo and However, this is a big refactor, and I have a pretty low confidence level in my own competence to review this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of this looks really good. Thanks! You've crammed a lot into this PR (and basically 1 commit), which makes it hard to review. I've got some comments, but we can probably merge this after those have been fixed. Next time, please split it up over multiple PRs.
| (Some(n), _) | (None, Some(n)) if n > 0 => Some(n + 1), | ||
| _ => None, // Use terminal height |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Next time, please separate cleanup from actual changes. That'll make it much easier to review.
src/uu/more/src/more.rs
Outdated
| print!("\r"); | ||
| stdout.flush().unwrap(); | ||
| #[cfg(test)] | ||
| impl Deref for OutputType { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to involved for deref, I'd rather have a unwrap_test method or something more explicit like that. Only implementing this for cfg(test) also makes me a bit worried that the behaviour might change inside and outside of tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've moved this into mod tests so it should not change the original implementation. Besides, the unreachable macro should catch errors if not used appropriately.
| } | ||
| } | ||
|
|
||
| #[cfg(target_os = "fuchsia")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you remove this intentionally? Shouldn't we keep it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, I've added it back.
src/uu/more/src/more.rs
Outdated
| lines: Vec<String>, // Lines read from the input | ||
| cumulative_line_sizes: Vec<u64>, // Cumulative byte sizes of lines | ||
| upper_mark: usize, // Current line at the top of the screen | ||
| content_rows: usize, // Number of rows that fit on the screen | ||
| eof_reached: bool, | ||
| lines_squeezed: usize, // Number of lines squeezed out in the current view |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These comments should probably be docstrings.
| reset_term()?; | ||
| std::process::exit(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is for another PR, but it might be nice to never exit explicitly, so that we always reset via the drop guard.
| /// Process user input events until exit | ||
| fn process_events(&mut self, options: &Options) -> UResult<()> { | ||
| loop { | ||
| if !event::poll(Duration::from_millis(100))? { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you change the number of millis?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original 10ms polling rate seems excessive for manual navigation. Wdyt?
| time = { version = "0.3.36" } | ||
| unicode-segmentation = "1.11.0" | ||
| unicode-width = "0.2.0" | ||
| utf-8 = "0.7.6" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was this already unused? I can't find where you removed the usage of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I added the tempfile crate into uumore's Cargo file using cargo add, it automatically removed the utf-8 crate (along with selinux-sys) from the main Cargo file. I assumed it was cleaning up unused crates, but I can add it back if needed.
src/uu/more/src/more.rs
Outdated
| // Ensure we have enough lines loaded for display | ||
| self.read_until_line(self.upper_mark + self.content_rows)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this take the squeezed lines into account?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it does not. That is a good point. I will move this into the loop
src/uu/more/src/more.rs
Outdated
| // Display lines until we've filled the screen | ||
| let mut lines_printed = 0; | ||
| let mut index = self.upper_mark; | ||
| while lines_printed < self.content_rows && index < self.lines.len() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like it could be a for loop over the index.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main loop condition is lines_printed < self.content_rows. We may not always read until self.lines.len(). The index variable is used after the loop to check for EOF.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've refactored the loop; hope it is clearer now.
d4bc6b3 to
7f5e2b9
Compare
7f5e2b9 to
c8c4f52
Compare
|
GNU testsuite comparison: |
close #6397
This PR improves
uu_moreby achieving constant memory overhead during initialization. Previously (#7680), we read the entire file once and store byte positions. We now do so incrementally using the following techniques:InputTypeenum shown above.Stdinalready implementsBufReadand we need the underlyingFiletype to get the file size for displaying the status.Pagerobject, we scan the only up to the start line (options.from_line). We store the lines read inlines, and the byte positions incumulative_line_sizesto display the file navigation status.from_linetakes precedence over pattern which is in line withmore. This means that if the specified pattern only exists beforefrom_line, it will not be found.self.lines. This means we scan the entire file at most once.self.upper_mark + self.content_rowshave been read into memory. Then, we iterate over the relevant indexes inself.linesto print.test_more.rstomore.rssince we cannot easily extract stdout from crossterm'sAlternateScreen.less.