-
Notifications
You must be signed in to change notification settings - Fork 289
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
String formatting is slow. Commands like Print should write strings directly and not use write!. #628
Comments
If it is faster, seems like an easy optimization. Although the point of write! is that it does not flush, but depending on the use of execute or queue it will be flushed manually. Note that BufWriter does some internal optimizations to get the best write speed. Maybe this is not desirable for a benchmark.
It's a feature rather than a bug, crossterm has to deal with ANSI codes that are written to the terminal or any fmt::Writer, it is allowing users to pass in any kind of writer like your BufWriter. |
For clarification:
Doesn't QueableCommand require writers that implement io::Write?
Isn't the flushing controlled by the io::Writer? Lines 180 to 184 in 9a50fd2
You obviously know this better than me, but queueing seems to roughly work like this: io_writer.queue() Lines 188 to 220 in 9a50fd2
Adapter is a wrapper around the io writer with a impl fmt::Write. And the reason why fmt::Write is used, I assume, is to use string formatting to build the strings with ANSI codes (which apparently is slow). The write! macro then does the formatting and calls the Adapters' write_str(), which writes to the io:writer. So it seems that:
Perhaps I haven't understood it. I'll have a look at the crossterm code again next week, if I have time. |
Yes it is
Yes,
Yes, some codes require input others don't. What about something like this: #[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PrintIo(pub &'static str);
impl Command for PrintIo {
fn write_ansi(&self, f: &mut impl fmt::Write) -> fmt::Result {
f.write_str(self.0)
}
#[cfg(windows)]
fn execute_winapi(&self) -> Result<()> {
panic!("tried to execute Print command using WinAPI, use ANSI instead");
}
} This uses iowriter.queue(PrintIo("test")); (not tested) |
Thanks for the quick and good responses. As you yourself realized (I can see your edits); this doesn't work, because it still calls self.0.fmt: impl<T: Display> Command for Print<T> {
fn write_ansi(&self, f: &mut impl fmt::Write) -> fmt::Result {
f.write_str(&self.0.to_string())
}
...
}
impl<T: Display> Display for Print<T> {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
self.0.fmt(f)
}
} But the newest version does work. I added it verbatim and ran the same benchmark as above, with the addition of PrintIo. Like before, Print was about as fast/slow as write!. Now PrintIo is faster than those, and about as fast as write_all. So it seems to work. Although it's difficult to tell without a good benchmark. By the way, the benchmark was meant to replicate a realistic use case, where I would use BuffWriter. Are there suggestions for a better benchmark? Note also that a new command might not be the best solution. If the results are correct, then it might be desirable to eliminate formatting in as much places as possible. These are the results (on my old and slow machine). They were comparable in for 4 consecutive runs and different n:
|
For completeness sake, here's the updated benchmark: #![feature(test)]
use crossterm::{
style::{Print, PrintIo},
terminal, QueueableCommand,
};
use std::io::{stdout, BufWriter, Write};
extern crate test;
use test::Bencher;
enum WriteCharacters {
Format,
Direct,
Crossterm,
CrosstermIo,
}
impl WriteCharacters {
/// Writes a character n times.
fn run(&self) {
// NOTE: make the buffer big enough to avoid flushing.
let mut bufwriter = BufWriter::with_capacity(100 * 1024, stdout());
bufwriter.queue(terminal::EnterAlternateScreen);
let text = "x";
let n = 1600 * 900;
for _ in 0..n {
match self {
Self::Crossterm => {
// using Print:
bufwriter.queue(Print(text));
}
Self::CrosstermIo => {
// using PrintIo:
bufwriter.queue(PrintIo(text));
}
Self::Format => {
// this is like Print and should give the same results:
write!(bufwriter, "{}", text);
}
Self::Direct => {
// this writes the string directly:
bufwriter.write_all(text.as_bytes());
}
}
}
bufwriter.queue(terminal::LeaveAlternateScreen);
bufwriter.flush();
}
}
#[bench]
fn with_crossterm(bh: &mut Bencher) {
bh.iter(|| {
WriteCharacters::Crossterm.run();
});
}
#[bench]
fn with_crossterm_new(bh: &mut Bencher) {
bh.iter(|| {
WriteCharacters::CrosstermIo.run();
});
}
#[bench]
fn with_format(bh: &mut Bencher) {
bh.iter(|| {
WriteCharacters::Format.run();
});
}
#[bench]
fn without_format(bh: &mut Bencher) {
bh.iter(|| {
WriteCharacters::Direct.run();
});
}
|
So having I just went and look through the code, it seems all types that can use Types that already have this:
Types that cannot be changed: Types that can be changed:
Type that require format:
|
To summarize: We have commands that use static strings (like ResetColor), Commands that take impl Display (like Print and PrintStyledContent) and commands that convert integers to a string (like MoveTo or SetAttribute). By Commands that cannot be changed, you mean Commands that should take impl Display? I can agree with that. This means it would be better to keep the more generic Print and introduce a less generic command like PrintStr. But I'm not very good at rust, but what about #[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PrintStr<T: AsRef<str>>(pub T);
impl<T: AsRef<str>> Command for PrintStr<T> {
fn write_ansi(&self, f: &mut impl fmt::Write) -> fmt::Result {
f.write_str(self.0.as_ref())
} I tested the benchmark using a String, whereas I was using a static str before, with PrintStr and the resuls are the good. I also tried About the other commands: I would say that this is a bigger issue, let's stick to Print for now. Finally: |
Background
I was using Print to display the characters of a string one by one, instead of printing the entire string at once. This made me realize that using a lot of Print commands in a cycle is much slower than expected.
Problem
The flamegraph shows that most of the time is spent on core::fmt::write, which is called here (and in other commands):
crossterm/src/style.rs
Line 401 in 9a50fd2
It's known that the write macro (and all string formatting in rust) is considerably slower than writing directly:
rust-lang/rust#10761
rust-lang/rust#76490
The issue is that style::Print doesn't actually need the macro because it doesn't format the string. I suspect that a lot of the commands could avoid the macro and just write the string directly.
Note, that there was a proposed change to the format macro that addresses this case (but it seems to be dead):
rust-lang/rust#76531
Apparently even creating a string with consecutive push_str() is faster than formatting:
rust-lang/rust-clippy#6926
I want to make it clear that it's just an assumption that avoiding the macro will improve performance. Unfortunately the crossterm codebase is designed so much around fmt that I found it difficult to test.
The only quick test I could come up with is to just replicate the core issue. The code below writes a character n times to stdout, using either Print, write!, or write_all.
It looks like it's up to 2x as fast without the macro, but the results are inconsistent between runs.
The text was updated successfully, but these errors were encountered: