Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Try not to use fmt::Formatters #86

Merged
merged 6 commits into from
Jul 24, 2024

Conversation

Kijewski
Copy link
Collaborator

@Kijewski Kijewski commented Jul 22, 2024

When you write an fmt::Displayable type into a fmt::Write sink, a fmt::Formatter is used to facilitate the writing. It stores a &dyn fmt::Write reference to the sink, so less duplicated byte code is generated, but any possible optimizations become void.

This PR lets rinja try to avoid using the normal write!() machinery. Indeed, no write!() call is made directly anymore in the generated code, but only though a WriteWritable helper trait. This aids the runtime performance quite a bit:

$ cd testing/benches
$ cargo bench --bench all

Big table               time:   [405.60 µs 406.46 µs 407.33 µs]
                        change: [-12.463% -12.239% -12.022%] (p = 0.00 < 0.05)
                        Performance has improved.

Teams                   time:   [200.00 ns 200.30 ns 200.61 ns]
                        change: [-51.953% -51.462% -51.036%] (p = 0.00 < 0.05)
                        Performance has improved.

Todo:

  • update tests (e.g. ryu does not omits trailing .0s for floats)
  • test properly
  • add comments

@Kijewski
Copy link
Collaborator Author

Hm, filters (including any explicit |escape) still invariantly causes writes through fmt::Formatter. I guess I can ignore the case for now because the this is a good speed-up for a start?

@Kijewski Kijewski force-pushed the pr-no-formatter branch 2 times, most recently from e38a0e3 to 52e1f5b Compare July 22, 2024 18:11
@Kijewski Kijewski marked this pull request as ready for review July 22, 2024 18:11
@Kijewski Kijewski requested a review from GuillaumeGomez July 22, 2024 18:26
@Kijewski Kijewski force-pushed the pr-no-formatter branch 2 times, most recently from 9b7d80f to 68d4e50 Compare July 22, 2024 18:29
@@ -490,7 +490,7 @@ fn test_num_literals() {
let template = NumLiterals;
assert_eq!(
template.render().unwrap(),
"[90, -90, 90, 2, 56, 240, 10.5, 10.5, 100000000000, 105000000000]\n1\n12",
"[90, -90, 90, 2, 56, 240, 10.5, 10.5, 100000000000.0, 105000000000.0]\n1\n12",
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ryu adds a trailing .0 for integral floats. IMO ryu gets it more right than rust's defaults, but we could also strip a trailing .0 to remove the difference.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it matters much. If people want to change it, they can always using the format filter.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum... Thinking some more about it, people will expect us to use the Rust implementation to get the same output. I think it's problematic that ryu doesn't always display the same thing.

Here what I suggest: I talked with Mara and she said that we could use ryu's algorithm directly in std to improve performance. I think it's a better way forward rather than having our own display, the performance difference is not big enough to be worth imo.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The performance is much more impacted by the removal of the &dyn Write, the speed-up through ryu is minuscule in comparison. It's just the first implementation that came to mind.

But I think it's seldom that people will want to print floats, and then most likely with explicit formatting (price|fmt(".02")), so it won't matter much if I remove this specialization.

Comment on lines -1147 to -1270
fn named_expression(
&mut self,
buf_expr: &mut Buffer,
buf_format: &mut Buffer,
expr: String,
wrapped: DisplayWrap,
cacheable: bool,
expr_cache: &mut HashMap<String, usize>,
) -> usize {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I inlined this function. In my opinion the new functions is small enough by itself.

@Kijewski Kijewski force-pushed the pr-no-formatter branch 2 times, most recently from 7c365b1 to a15b0e1 Compare July 23, 2024 12:27
}

const _: () = {
macro_rules! specialize_ref {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please a comment explaining why we need this macro and what it does.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is used just once just below. Should I maybe rename it into impl_ref for impl_for_ref to make the name more speaking?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you prefer. Either a comment or a renaming to make it clear without needing to read the code. ;)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Either … or …", well, I did both :D

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha. Well, it's the best solution I think. :D

@Kijewski Kijewski force-pushed the pr-no-formatter branch 2 times, most recently from 7e9935d to 1f6a782 Compare July 24, 2024 13:36
if !self.discard {
src.append_to(&mut self.buf);
if !self.last_was_write_str {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change in particular will improve things SO MUCH. It's a really good idea!

Kijewski added 6 commits July 24, 2024 15:43
When you write an `fmt::Display`able type into a `fmt::Write` sink, a
`fmt::Formatter` is used to facilitate the writing. It stores a
`&dyn fmt::Write` reference to the sink, so less duplicated byte code
is generated, but any possible optimizations before void.

This PR lets rinja try to avoid using the normal `write!()` machinery.
Indeed, not `write!()` call is made directly anymore in the generated
code, but only though a `WriteWritable` helper trait. This aids the
runtime performance quite a bit:

```text
$ cd testing/benches
$ cargo bench --bench all

Big table               time:   [405.60 µs 406.46 µs 407.33 µs]
                        change: [-12.463% -12.239% -12.022%] (p = 0.00 < 0.05)
                        Performance has improved.

Teams                   time:   [200.00 ns 200.30 ns 200.61 ns]
                        change: [-51.953% -51.462% -51.036%] (p = 0.00 < 0.05)
                        Performance has improved.
```
```text
$ cd rinja_derive_standalone
$ cargo bench

hello_world             time:   [50.420 µs 50.529 µs 50.637 µs]
                        change: [-0.6238% -0.3448% -0.0695%] (p = 0.02 < 0.05)
                        Change within noise threshold.

item_info.html          time:   [58.807 µs 59.108 µs 59.408 µs]
                        change: [-2.4731% -2.0541% -1.6532%] (p = 0.00 < 0.05)
                        Performance has improved.

item_union.html         time:   [182.90 µs 183.97 µs 185.73 µs]
                        change: [-2.6740% -1.8652% -0.7734%] (p = 0.00 < 0.05)
                        Change within noise threshold.

page.html               time:   [767.05 µs 768.05 µs 769.20 µs]
                        change: [-4.5916% -4.2764% -3.9026%] (p = 0.00 < 0.05)
                        Performance has improved.

print_item.html         time:   [154.97 µs 155.26 µs 155.59 µs]
                        change: [-2.0385% -1.7364% -1.4351%] (p = 0.00 < 0.05)
                        Performance has improved.

short_item_info.html    time:   [126.35 µs 126.49 µs 126.67 µs]
                        change: [-6.5708% -6.1935% -5.8314%] (p = 0.00 < 0.05)
                        Performance has improved.

sidebar.html            time:   [196.25 µs 196.68 µs 197.14 µs]
                        change: [-4.9096% -4.4107% -3.9467%] (p = 0.00 < 0.05)
                        Performance has improved.

source.html             time:   [112.69 µs 113.14 µs 113.76 µs]
                        change: [-3.4552% -3.1557% -2.8072%] (p = 0.00 < 0.05)
                        Performance has improved.

type_layout.html        time:   [160.05 µs 160.30 µs 160.59 µs]
                        change: [-11.126% -10.419% -9.2877%] (p = 0.00 < 0.05)
                        Performance has improved.

type_layout_size.html   time:   [66.006 µs 66.340 µs 66.664 µs]
                        change: [-2.3099% -1.3973% +0.0223%] (p = 0.01 < 0.05)
                        Change within noise threshold.
```

const _: () = {
// implement FastWritable for a list of reference wrapper types to FastWritable+?Sized
macro_rules! impl_for_ref {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@GuillaumeGomez
Copy link
Contributor

Looks all good to me, thanks!

@GuillaumeGomez GuillaumeGomez merged commit bbba232 into rinja-rs:master Jul 24, 2024
17 checks passed
@Kijewski Kijewski deleted the pr-no-formatter branch July 24, 2024 13:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants