Experiment: fmt::Arguments as closure #101568

Closed
wants to merge 25 commits into from

m-ou-se
Member

@m-ou-se m-ou-se commented Sep 8, 2022

This is part of #99012

This implements the "closure idea" for fmt::Arguments.

For now it uses a simple enum { Fn(&'a dyn Fn), StaticStr(&'static str) } to handle the static str case for as_str(). A slightly more optimized version could reduce the size of fmt::Arguments to two pointers with some tricks.
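A minimal sketch of what such a two-variant representation could look like, assuming a closure signature of `Fn(&mut dyn Write) -> fmt::Result`. The names `Args`, `Inner`, `from_fn`, `from_static_str`, and `write_to` are hypothetical stand-ins for illustration, not the PR's actual code:

```rust
use std::fmt::{self, Write};

// Hypothetical stand-in for the experimental representation described above.
enum Inner<'a> {
    // A type-erased closure that performs all the writes for one format string.
    Fn(&'a dyn Fn(&mut dyn Write) -> fmt::Result),
    // A format string without arguments, kept around so `as_str` can return it.
    StaticStr(&'static str),
}

pub struct Args<'a> {
    inner: Inner<'a>,
}

impl<'a> Args<'a> {
    pub fn from_fn(f: &'a dyn Fn(&mut dyn Write) -> fmt::Result) -> Self {
        Args { inner: Inner::Fn(f) }
    }

    pub fn from_static_str(s: &'static str) -> Self {
        Args { inner: Inner::StaticStr(s) }
    }

    // Only the static-string variant can hand back its text without formatting.
    pub fn as_str(&self) -> Option<&'static str> {
        match self.inner {
            Inner::StaticStr(s) => Some(s),
            Inner::Fn(_) => None,
        }
    }

    // Formatting is either a single `write_str` or a call into the closure.
    pub fn write_to(&self, w: &mut dyn Write) -> fmt::Result {
        match self.inner {
            Inner::Fn(f) => f(w),
            Inner::StaticStr(s) => w.write_str(s),
        }
    }
}
```

In this sketch the enum holds either a fat pointer to a trait object or a string slice, which is why a more compact encoding would need the "tricks" mentioned above.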

Includes #100996 and #101569

With this, println!("Hello, {0} {0:x} {0:#x}!", 100) expands to:

_print(Arguments::new(&match (&100,) {
    _args => |w: &mut dyn Write| -> Result {
        w.write_str("Hello, ")?;
        Display::fmt(_args.0, &mut Formatter::new(w))?;
        w.write_str(" ")?;
        LowerHex::fmt(_args.0, &mut Formatter::new(w))?;
        w.write_str(" ")?;
        LowerHex::fmt(
            _args.0,
            &mut Formatter::new(w)
                .with_options(4u32, ' ', Alignment::Unknown, None, None)
        )?;
        w.write_str("!\n")?;
        Ok(())
    },
}))
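The expansion above relies on internal constructors such as `Formatter::new` and `with_options`. As a rough approximation on stable Rust, the same sequence of writes can be sketched with a hand-written closure that uses `write!` for each argument. The function name `render` is made up for this illustration, and the nested `write!` calls go through the existing `fmt::Arguments` machinery, so this only mirrors the shape of the expansion, not its cost:

```rust
use std::fmt::{self, Write};

// Hand-written approximation of the closure-based expansion, using only
// stable APIs. `render` is a hypothetical helper for this example.
pub fn render() -> String {
    let args = (&100,);
    let f = |w: &mut dyn Write| -> fmt::Result {
        w.write_str("Hello, ")?;
        write!(w, "{}", args.0)?; // Display
        w.write_str(" ")?;
        write!(w, "{:x}", args.0)?; // LowerHex
        w.write_str(" ")?;
        write!(w, "{:#x}", args.0)?; // LowerHex with the alternate flag
        w.write_str("!\n")?;
        Ok(())
    };

    // Erase the closure type, as the experimental `Arguments::new` would.
    let erased: &dyn Fn(&mut dyn Write) -> fmt::Result = &f;

    let mut out = String::new();
    erased(&mut out).expect("writing to a String does not fail");
    out
}
```

Calling `render()` returns `"Hello, 100 64 0x64!\n"`, the same text that `println!("Hello, {0} {0:x} {0:#x}!", 100)` prints.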

@m-ou-se m-ou-se added T-libs Relevant to the library team, which will review and decide on the PR/issue. S-experimental Status: Ongoing experiment that does not require reviewing and won't be merged in its current state. labels Sep 8, 2022
@m-ou-se m-ou-se self-assigned this Sep 8, 2022
@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Sep 8, 2022
@m-ou-se m-ou-se removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Sep 8, 2022
@m-ou-se
Member Author

m-ou-se commented Sep 8, 2022

@bors try @rust-timer queue

@rust-timer
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 8, 2022
@bors
Contributor

bors commented Sep 8, 2022

⌛ Trying commit cad2895eacd510f7b6474a85f7ff4d777fcb7368 with merge b1b62d59ec191dbd9702e1435014673d12e3d3df...

@m-ou-se m-ou-se force-pushed the format-args-closure branch 3 times, most recently from 562b8c7 to 05fefb7 Compare September 8, 2022 10:47
@joshtriplett
Member

At the risk of microoptimization: might it be worth recognizing the extremely common case of a single-character write_str and turning it into a call to Write::write_char(f, ' ') (which could potentially be marked #[inline] so that it directly invokes write_char on f.buf)? Would that potentially be a net win?

@joshtriplett
Member

I'm incredibly excited about this, and in particular I'm excited for being able to give the optimizer more optimization opportunities.

@m-ou-se
Member Author

m-ou-se commented Sep 13, 2022

@joshtriplett write_char's default implementation is self.write_str(c.encode_utf8(&mut [0; 4])), so I'm not sure if that actually helps.
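To illustrate the point about the default implementation: a writer that only implements `write_str` still receives a `write_str` call when `write_char` is used. The `Recorder` type and `record_space` function below are hypothetical helpers written for this example:

```rust
use std::fmt::{self, Write};

// Hypothetical writer that records every `write_str` call it receives.
#[derive(Default)]
pub struct Recorder {
    pub calls: Vec<String>,
}

impl Write for Recorder {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        self.calls.push(s.to_owned());
        Ok(())
    }
    // `write_char` is deliberately not overridden, so the default
    // implementation (which forwards to `write_str`) is used.
}

pub fn record_space() -> Vec<String> {
    let mut r = Recorder::default();
    r.write_char(' ').expect("Recorder never fails");
    r.calls
}
```

Here `record_space()` yields a single recorded call containing `" "`, so for writers that do not override `write_char`, the suggested rewrite would end up in `write_str` anyway.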

The number of line drawing characters depends on the number of digits
in the allocation number. This removes the characters to avoid
spurious failures.
@m-ou-se
Member Author

m-ou-se commented Sep 14, 2022

Okay, let's try a perf run.

I'm not expecting great results, because this just tests rustc itself. The approach in this PR results in a closure per format_args!(), meaning more codegen and more work for the optimizer.
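One way to see why a closure per `format_args!()` can mean more work for the compiler: every closure expression has its own anonymous type, even when two closures are textually identical. The sketch below checks this with `TypeId`; the helper names are made up for the example, and it says nothing about whether later optimization passes manage to merge identical bodies:

```rust
use std::any::TypeId;
use std::fmt::{self, Write};

// Hypothetical helper: returns the `TypeId` of the value's static type.
fn type_id_of<T: 'static>(_: &T) -> TypeId {
    TypeId::of::<T>()
}

pub fn closures_are_distinct() -> bool {
    // Two closures with identical source text, as two identical
    // format strings would produce under the closure approach.
    let a = |w: &mut dyn Write| -> fmt::Result { w.write_str("hi") };
    let b = |w: &mut dyn Write| -> fmt::Result { w.write_str("hi") };

    // Each closure expression gets a unique type, and therefore its own body
    // that has to be type-checked and code-generated.
    type_id_of(&a) != type_id_of(&b)
}
```

`closures_are_distinct()` returns `true`.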

We'll need separate benchmarking for runtime performance.

The try build will also allow folks to easily test out this implementation using rustup-toolchain-install-master.

@bors try @rust-timer queue

@rust-timer
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@bors
Contributor

bors commented Sep 14, 2022

⌛ Trying commit e45a26c with merge 4edeac5d8399ad2a61ee852f523d95f5be83429a...

@bors
Contributor

bors commented Sep 14, 2022

☀️ Try build successful - checks-actions
Build commit: 4edeac5d8399ad2a61ee852f523d95f5be83429a

@rust-timer
Collaborator

Queued 4edeac5d8399ad2a61ee852f523d95f5be83429a with parent c97922d, future comparison URL.

@rust-timer
Collaborator

Finished benchmarking commit (4edeac5d8399ad2a61ee852f523d95f5be83429a): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean [1] | range | count [2] |
|---|---|---|---|
| Regressions ❌ (primary) | 25.9% | [0.2%, 1285.0%] | 200 |
| Regressions ❌ (secondary) | 4.4% | [0.2%, 60.1%] | 78 |
| Improvements ✅ (primary) | -1.5% | [-3.5%, -0.4%] | 14 |
| Improvements ✅ (secondary) | -1.1% | [-4.2%, -0.3%] | 27 |
| All ❌✅ (primary) | 24.1% | [-3.5%, 1285.0%] | 214 |

Max RSS (memory usage)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean [1] | range | count [2] |
|---|---|---|---|
| Regressions ❌ (primary) | 10.3% | [1.0%, 50.2%] | 144 |
| Regressions ❌ (secondary) | 6.9% | [0.8%, 26.0%] | 27 |
| Improvements ✅ (primary) | -6.6% | [-10.9%, -2.2%] | 2 |
| Improvements ✅ (secondary) | -3.3% | [-4.7%, -1.9%] | 2 |
| All ❌✅ (primary) | 10.1% | [-10.9%, 50.2%] | 146 |

Cycles

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean [1] | range | count [2] |
|---|---|---|---|
| Regressions ❌ (primary) | 38.6% | [1.3%, 1773.3%] | 165 |
| Regressions ❌ (secondary) | 15.3% | [1.6%, 80.1%] | 32 |
| Improvements ✅ (primary) | -2.7% | [-3.3%, -2.1%] | 2 |
| Improvements ✅ (secondary) | -3.1% | [-4.0%, -2.3%] | 8 |
| All ❌✅ (primary) | 38.1% | [-3.3%, 1773.3%] | 167 |

Footnotes

  1. the arithmetic mean of the percent change

  2. number of relevant changes
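As a sanity check on how the summary rows relate to the others: since the mean is an arithmetic mean over benchmarks, the "All (primary)" figure should be roughly the count-weighted mean of the primary regressions and primary improvements. The sketch below recomputes it from the rounded figures reported above, so small deviations are expected; `combined_mean` and `primary_summaries` are hypothetical helper names:

```rust
/// Count-weighted arithmetic mean of several groups, each given as
/// (mean percent change, number of benchmarks in the group).
pub fn combined_mean(groups: &[(f64, u32)]) -> f64 {
    let total: u32 = groups.iter().map(|&(_, n)| n).sum();
    let weighted: f64 = groups
        .iter()
        .map(|&(mean, n)| mean * f64::from(n))
        .sum();
    weighted / f64::from(total)
}

/// Recomputes the "All (primary)" rows for instruction count, max RSS,
/// and cycles from the rounded per-group figures in the report.
pub fn primary_summaries() -> [f64; 3] {
    [
        combined_mean(&[(25.9, 200), (-1.5, 14)]), // instruction count
        combined_mean(&[(10.3, 144), (-6.6, 2)]),  // max RSS
        combined_mean(&[(38.6, 165), (-2.7, 2)]),  // cycles
    ]
}
```

For instruction count this gives (25.9 × 200 − 1.5 × 14) / 214 ≈ 24.1, and the other two come out at about 10.1 and 38.1, in line with the reported summary rows.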

@rustbot rustbot added perf-regression Performance regression. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Sep 14, 2022
@m-ou-se
Member Author

m-ou-se commented Sep 14, 2022

Yup, that's exactly as expected. ^^

The incr-patched: println ones go up a lot, since compiling a format string is now more expensive. Similarly, the full builds of crates with a lot of formatting go up significantly, such as cargo by about 23%.

There might be ways to improve the compiler performance, but I'm not going to address that for now. The purpose of this experimental PR is to see what effect such an approach can have on runtime performance and binary size.

It is only worth looking into compiler performance if we conclude that this might be the best approach for runtime performance and binary size.

@m-ou-se
Member Author

m-ou-se commented Sep 14, 2022

Interestingly, some of the doc jobs show a significant improvement, specifically in render_item. That might be a sign that it can indeed have a very positive impact on runtime performance. :)

Nemo157 added a commit to Nemo157/stylish-rs that referenced this pull request Sep 22, 2022
@m-ou-se m-ou-se removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Oct 18, 2022
@thomcc thomcc added the A-fmt Area: `std::fmt` label Oct 22, 2022
@Dylan-DPC
Member

Closing this as it was an experiment.

@Dylan-DPC Dylan-DPC closed this May 16, 2023
Labels
A-fmt Area: `std::fmt` perf-regression Performance regression. S-experimental Status: Ongoing experiment that does not require reviewing and won't be merged in its current state. T-libs Relevant to the library team, which will review and decide on the PR/issue.
10 participants