New format_args!() and fmt::Arguments implementation #148789

m-ou-se · 2025-11-10T14:30:15Z

This is a new implementation of format_args!() and fmt::Arguments. With this implementation, fmt::Arguments is only two pointers in size, down from six. That makes it the same size as a &str and lets it fit in a register pair.
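
The size claim can be checked on whatever toolchain is at hand. This is a minimal sketch; `size_in_pointers` is a hypothetical helper, and the number printed for fmt::Arguments depends on whether the toolchain includes this change (six before and two after, going by the description above).

```rust
use std::fmt;
use std::mem::size_of;

/// Size of `T` measured in pointer-sized words on the current target.
/// (Hypothetical helper, not part of the standard library.)
fn size_in_pointers<T>() -> usize {
    size_of::<T>() / size_of::<usize>()
}

fn main() {
    // A `&str` is a (pointer, length) pair.
    println!("&str:           {} pointers", size_in_pointers::<&str>());
    // Toolchain dependent: the PR description says 6 before and 2 after.
    println!(
        "fmt::Arguments: {} pointers",
        size_in_pointers::<fmt::Arguments<'static>>()
    );
}
```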

This fmt::Arguments can store a &'static str without any indirection or additional storage. This means that simple cases like print_fmt(format_args!("hello")) are now just as efficient for the caller as print_str("hello"), as shown by this example:

code:

fn main() {
    println!("Hello, world!");
}

before:

main:
 sub     rsp, 56
 lea     rax, [rip + .Lanon_hello_world]
 mov     qword ptr [rsp + 8], rax
 mov     qword ptr [rsp + 16], 1
 mov     qword ptr [rsp + 24], 8
 xorps   xmm0, xmm0
 movups  xmmword ptr [rsp + 32], xmm0
 lea     rdi, [rsp + 8]
 call    qword ptr [rip + std::io::stdio::_print]
 add     rsp, 56
 ret

after:

main:
 lea     rsi, [rip + .Lanon_hello_world]
 mov     edi, 29
 jmp     qword ptr [rip + std::io::stdio::_print]

(panic!("Hello, world!"); shows a similar change.)
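
The literal-only case is also visible from safe code through the stable `fmt::Arguments::as_str` method, which returns the text when there is nothing to format at runtime. A small sketch follows; `render` is a hypothetical sink written for illustration, not something from this PR.

```rust
use std::fmt;

/// Hypothetical sink: takes the cheap path when the arguments are a plain literal.
fn render(args: fmt::Arguments<'_>) -> String {
    match args.as_str() {
        // No runtime arguments: the text is available as a `&'static str`.
        Some(literal) => literal.to_owned(),
        // Runtime arguments present: fall back to full formatting.
        None => fmt::format(args),
    }
}

fn main() {
    let name = String::from("world");
    // A literal-only format string exposes its text directly.
    assert_eq!(format_args!("Hello, world!").as_str(), Some("Hello, world!"));
    // A runtime argument means there is no single static string to hand out.
    assert_eq!(format_args!("Hello, {name}!").as_str(), None);
    println!("{}", render(format_args!("Hello, {name}!")));
}
```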

This implementation stores all static information as just a single (byte) string, without any indirection:

code:

format_args!("Hello, {name:-^20}!\n")

lowering before:

fmt::Arguments::new_v1_formatted(
    &["Hello, ", "!\n"],
    &args,
    &[
        Placeholder {
            position: 0usize,
            flags: 3355443245u32,
            precision: format_count::Implied,
            width: format_count::Is(20u16),
        },
    ],
)

lowering after:

fmt::Arguments::new(
    b"\x07Hello, \xc3-\x00\x00\xc8\x14\x00\x02!\n\x00",
    &args,
)

This saves a ton of pointers and simplifies the expansion significantly, but does mean that individual pieces (e.g. "Hello, " and "!\n") cannot be reused. (Those pieces are often smaller than a pointer to them, though, in which case reusing them is useless.)
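
To put numbers on the parenthetical, the sketch below compares the length of each literal piece from the example with the size of a `&str` that would point at it. `smaller_than_reference` is a hypothetical helper; the comparison assumes a typical 32-bit or 64-bit target.

```rust
use std::mem::size_of;

/// True when the text itself is smaller than a `&str` referring to it.
/// (Hypothetical helper for illustration.)
fn smaller_than_reference(piece: &str) -> bool {
    piece.len() < size_of::<&str>()
}

fn main() {
    // The two literal pieces from the example above.
    for piece in ["Hello, ", "!\n"] {
        println!(
            "{:?}: {} bytes of text, {} bytes for a &str, smaller: {}",
            piece,
            piece.len(),
            size_of::<&str>(),
            smaller_than_reference(piece)
        );
    }
}
```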

The details of the new representation are documented in library/core/src/fmt/mod.rs.
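
The byte string in the example can be read back by hand. The sketch below is inferred from that one example only, not from the documented format: it assumes a byte below 0x80 is the length of a literal piece that follows, that 0xc3 introduces a placeholder followed by a 4-byte little-endian flags word and a 2-byte little-endian width, and that 0x00 ends the template. `Piece` and `decode_example` are hypothetical names.

```rust
// Byte string copied from the "lowering after" example.
const TEMPLATE: &[u8] = b"\x07Hello, \xc3-\x00\x00\xc8\x14\x00\x02!\n\x00";

#[derive(Debug, PartialEq)]
enum Piece {
    Literal(String),
    Placeholder { flags: u32, width: u16 },
}

/// Walks the template under assumptions inferred from this single example.
/// It is not a decoder for the real format.
fn decode_example(bytes: &[u8]) -> Vec<Piece> {
    let mut pieces = Vec::new();
    let mut i = 0;
    while i < bytes.len() {
        match bytes[i] {
            // Assumed terminator.
            0x00 => break,
            // Assumed: placeholder carrying flags (u32) and width (u16).
            0xc3 => {
                let flags = u32::from_le_bytes([
                    bytes[i + 1],
                    bytes[i + 2],
                    bytes[i + 3],
                    bytes[i + 4],
                ]);
                let width = u16::from_le_bytes([bytes[i + 5], bytes[i + 6]]);
                pieces.push(Piece::Placeholder { flags, width });
                i += 7;
            }
            // Assumed: length-prefixed literal text.
            len if len < 0x80 => {
                let start = i + 1;
                let end = start + len as usize;
                let text = String::from_utf8_lossy(&bytes[start..end]).into_owned();
                pieces.push(Piece::Literal(text));
                i = end;
            }
            other => panic!("byte {other:#x} is outside what this sketch handles"),
        }
    }
    pieces
}

fn main() {
    for piece in decode_example(TEMPLATE) {
        println!("{piece:?}");
    }
}
```

Under those assumptions the decoded flags word is 3355443245 and the width is 20, the same values that appear in the old `Placeholder` lowering. The low byte of the flags, 0x2d, is ASCII `-`, which matches the fill character in `{name:-^20}`.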

Diagram of the data structure after this change:

A diagram showing the fmt::Arguments internal structure after the change. Most notably, it is only two pointers in size, and all the string data is part of a single string, removing a level of indirection.

Original data structure

A diagram showing the fmt::Arguments internal structure before the change. Most notably, it consists of three slices (so six pointers in size), and one of the slices contains string slices (so another two pointers in size for each string part, and more indirection).

m-ou-se · 2025-11-10T14:33:10Z

@bors try @rust-timer queue

Experiment: New fmt::Arguments implementation (another one)

rust-bors · 2025-11-10T16:53:36Z

☀️ Try build successful (CI)
Build commit: 6e6ba94 (6e6ba949d24fbfbd9cd48ca4c98adf59fbd04482, parent: a7b3715826827677ca8769eb88dc8052f43e734b)

rust-timer · 2025-11-10T18:13:06Z

Finished benchmarking commit (6e6ba94): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @rustbot label: +perf-regression-triaged. If not, please fix the regressions and do another perf run. If its results are neutral or positive, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

	mean	range	count
Regressions ❌ (primary)	0.7%	[0.1%, 5.8%]	26
Regressions ❌ (secondary)	0.6%	[0.1%, 1.3%]	44
Improvements ✅ (primary)	-0.7%	[-4.3%, -0.1%]	109
Improvements ✅ (secondary)	-1.7%	[-38.2%, -0.0%]	93
All ❌✅ (primary)	-0.5%	[-4.3%, 5.8%]	135

Max RSS (memory usage)

Results (primary -1.5%, secondary -0.6%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	2.2%	[2.2%, 2.2%]	1
Regressions ❌ (secondary)	3.7%	[1.0%, 6.7%]	12
Improvements ✅ (primary)	-1.6%	[-6.0%, -0.5%]	31
Improvements ✅ (secondary)	-2.6%	[-7.9%, -0.7%]	25
All ❌✅ (primary)	-1.5%	[-6.0%, 2.2%]	32

Cycles

Results (primary -0.5%, secondary -4.3%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	4.8%	[3.4%, 6.2%]	2
Regressions ❌ (secondary)	8.8%	[2.6%, 18.8%]	6
Improvements ✅ (primary)	-3.1%	[-5.0%, -2.1%]	4
Improvements ✅ (secondary)	-10.3%	[-39.4%, -2.1%]	13
All ❌✅ (primary)	-0.5%	[-5.0%, 6.2%]	6

Binary size

Results (primary -0.7%, secondary -1.3%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	0.5%	[0.0%, 1.4%]	4
Regressions ❌ (secondary)	3.2%	[0.0%, 7.5%]	12
Improvements ✅ (primary)	-0.8%	[-3.3%, -0.0%]	129
Improvements ✅ (secondary)	-1.7%	[-23.6%, -0.0%]	123
All ❌✅ (primary)	-0.7%	[-3.3%, 1.4%]	133

Bootstrap: 476.631s -> 471.922s (-0.99%)
Artifact size: 391.32 MiB -> 388.56 MiB (-0.70%)

m-ou-se · 2025-11-10T18:19:30Z

Ooh that's pretty good :D

m-ou-se · 2025-11-10T19:55:06Z

Pretty much everything looks like a great improvement: not only the number of instructions executed, but also memory usage and binary size. 🎉

Only two significant negative results:

1. "image-0.25.6 opt incr-patched:println" with almost +6% instructions:u.

Looking at the detailed results, it looks like that's all LLVM, probably because LLVM got more optimization opportunities. That's not necessarily a bad thing.

2. The `fmt-write-str` runtime benchmark with over +12% instructions:u.

This could be concerning, but I can't seem to fully replicate it locally.

If I recompile and run this benchmark 100 times in both nightly and with this PR, I do get this interesting result though:

With the nightly compiler, the results vary, with many measurements clustered close to 25ms but also many around 40ms. With this PR, the results are very consistent, all clustered around 27ms. (Update: It's around 26ms now, after a minor optimization.)

So, the median result is worse, but the average is better.
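
How a bimodal distribution produces that outcome can be shown with made-up numbers. The samples below are hypothetical, chosen only to mimic the shape described (a 60/40 split between 25ms and 40ms for nightly, a flat 27ms for this PR); they are not the measured data.

```rust
fn mean(samples: &[f64]) -> f64 {
    samples.iter().sum::<f64>() / samples.len() as f64
}

fn median(samples: &[f64]) -> f64 {
    let mut sorted = samples.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let mid = sorted.len() / 2;
    if sorted.len() % 2 == 0 {
        (sorted[mid - 1] + sorted[mid]) / 2.0
    } else {
        sorted[mid]
    }
}

/// Made-up timings in ms (NOT the measured data): two clusters.
fn hypothetical_nightly() -> Vec<f64> {
    let mut runs = vec![25.0; 60];
    runs.extend(vec![40.0; 40]);
    runs
}

/// Made-up timings in ms (NOT the measured data): one tight cluster.
fn hypothetical_pr() -> Vec<f64> {
    vec![27.0; 100]
}

fn main() {
    let nightly = hypothetical_nightly();
    let pr = hypothetical_pr();
    println!("nightly: median {} ms, mean {} ms", median(&nightly), mean(&nightly));
    println!("this PR: median {} ms, mean {} ms", median(&pr), mean(&pr));
}
```

With these invented samples the median goes from 25 to 27 (worse) while the mean goes from 31 to 27 (better).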

My guess is that the indirection (a slice of string slices) can make things unpredictable, as the strings aren't always in the optimal place for caching. The lack of indirection in the new version then makes it much more predictable. This is just a guess though.

m-ou-se · 2025-11-11T16:36:19Z

@bors try @rust-timer queue

Experiment: New fmt::Arguments implementation (another one)

github-actions · 2025-11-13T02:41:20Z

What is this?

This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.

Comparing 0186755 (parent) -> 503dce3 (this PR)

Test differences

202 doctest diffs were found. These are ignored, as they are noisy.

Test dashboard

Run

cargo run --manifest-path src/ci/citool/Cargo.toml -- \
    test-dashboard 503dce33e2e2a5d2fe978b2723ab2a994cc27472 --output-dir test-dashboard

And then open test-dashboard/index.html in your browser to see an overview of all executed tests.

Job duration changes

dist-apple-various: 3283.0s -> 4299.8s (+31.0%)
aarch64-apple: 9944.8s -> 7106.3s (-28.5%)
x86_64-rust-for-linux: 2505.4s -> 3146.5s (+25.6%)
dist-i686-mingw: 11724.5s -> 9515.8s (-18.8%)
x86_64-gnu-gcc: 3064.8s -> 3594.9s (+17.3%)
dist-x86_64-solaris: 4916.8s -> 5715.6s (+16.2%)
aarch64-gnu-debug: 3918.9s -> 4512.8s (+15.2%)
x86_64-gnu-tools: 3291.3s -> 3769.9s (+14.5%)
pr-check-1: 1489.6s -> 1700.0s (+14.1%)
i686-gnu-1: 7428.5s -> 8319.6s (+12.0%)

How to interpret the job duration changes?

Job durations can vary a lot, based on the actual runner instance
that executed the job, system noise, invalidated caches, etc. The table above is provided
mostly for t-infra members, for simpler debugging of potential CI slow-downs.

rust-timer · 2025-11-13T03:59:22Z

Finished benchmarking commit (503dce3): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Our benchmarks found a performance regression caused by this PR.
This might be an actual regression, but it can also be just noise.

Next Steps:

If the regression was expected or you think it can be justified, please write a comment with sufficient written justification, and add @rustbot label: +perf-regression-triaged to it, to mark the regression as triaged.
If you think that you know of a way to resolve the regression, try to create a new PR with a fix for the regression.
If you do not understand the regression or you think that it is just noise, you can ask the @rust-lang/wg-compiler-performance working group for help (members of this group were already notified of this PR).

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

	mean	range	count
Regressions ❌ (primary)	0.7%	[0.1%, 5.7%]	17
Regressions ❌ (secondary)	0.6%	[0.1%, 1.1%]	40
Improvements ✅ (primary)	-0.7%	[-4.4%, -0.1%]	120
Improvements ✅ (secondary)	-1.6%	[-38.5%, -0.0%]	106
All ❌✅ (primary)	-0.5%	[-4.4%, 5.7%]	137

Max RSS (memory usage)

Results (primary -1.5%, secondary -1.6%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	3.8%	[1.4%, 7.0%]	3
Regressions ❌ (secondary)	3.8%	[1.2%, 5.8%]	8
Improvements ✅ (primary)	-2.2%	[-6.0%, -0.6%]	23
Improvements ✅ (secondary)	-3.1%	[-7.3%, -0.6%]	29
All ❌✅ (primary)	-1.5%	[-6.0%, 7.0%]	26

Cycles

Results (primary -2.3%, secondary 0.4%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	5.2%	[3.4%, 6.9%]	2
Regressions ❌ (secondary)	7.6%	[1.6%, 18.5%]	19
Improvements ✅ (primary)	-4.4%	[-7.9%, -1.5%]	7
Improvements ✅ (secondary)	-11.0%	[-40.0%, -1.8%]	12
All ❌✅ (primary)	-2.3%	[-7.9%, 6.9%]	9

Binary size

Results (primary -0.8%, secondary -1.3%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	0.5%	[0.0%, 1.4%]	4
Regressions ❌ (secondary)	3.0%	[0.0%, 6.9%]	12
Improvements ✅ (primary)	-0.8%	[-3.3%, -0.0%]	129
Improvements ✅ (secondary)	-1.7%	[-23.6%, -0.0%]	123
All ❌✅ (primary)	-0.8%	[-3.3%, 1.4%]	133

Bootstrap: 476.356s -> 473.309s (-0.64%)
Artifact size: 391.04 MiB -> 388.34 MiB (-0.69%)

Expose fmt::Arguments::from_str as unstable. Now that rust-lang#148789 is merged, we can have a fmt::Arguments::from_str. I don't know if we want to commit to always having an implementation that allows for this, but we can expose it as unstable for now so we can play with it. Tracking issue: rust-lang#148905

Rollup merge of #148906 - m-ou-se:fmt-args-from-str, r=dtolnay Expose fmt::Arguments::from_str as unstable. Now that #148789 is merged, we can have a fmt::Arguments::from_str. I don't know if we want to commit to always having an implementation that allows for this, but we can expose it as unstable for now so we can play with it. Tracking issue: #148905

Relevant upstream PR: - rust-lang/rust#148789 (New format_args!() and fmt::Arguments implementation) Resolves: model-checking#4474

Relevant upstream PR: - rust-lang/rust#148789 (New format_args!() and fmt::Arguments implementation) Resolves: #4474 By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 and MIT licenses.

panstromek · 2025-11-19T13:08:58Z

Perf triage:

Improvements outweigh regressions.

@rustbot label: +perf-regression-triaged

m-ou-se self-assigned this Nov 10, 2025

m-ou-se added the A-fmt Area: `core::fmt` label Nov 10, 2025

rust-bors bot added a commit that referenced this pull request Nov 10, 2025

Auto merge of #148789 - m-ou-se:new-fmt-args-alt, r=<try>

6e6ba94

Experiment: New fmt::Arguments implementation (another one)

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Nov 10, 2025

m-ou-se mentioned this pull request Nov 10, 2025

Tracking issue for improving std::fmt::Arguments and format_args!() #99012

Open

61 tasks

rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Nov 10, 2025

m-ou-se mentioned this pull request Nov 11, 2025

Experiment: New fmt::Arguments implementation #148529

Closed

m-ou-se force-pushed the new-fmt-args-alt branch from 9f41692 to 349d2b5 Compare November 11, 2025 15:02

rustbot added the A-run-make Area: port run-make Makefiles to rmake.rs label Nov 11, 2025

m-ou-se force-pushed the new-fmt-args-alt branch from 349d2b5 to 5b58c66 Compare November 11, 2025 16:35

rust-bors bot added a commit that referenced this pull request Nov 11, 2025

Auto merge of #148789 - m-ou-se:new-fmt-args-alt, r=<try>

155c5d4

Experiment: New fmt::Arguments implementation (another one)

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Nov 11, 2025

m-ou-se changed the title ~~Experiment: New fmt::Arguments implementation (another one)~~ New format_args!() and fmt::Arguments implementation Nov 11, 2025

bors added the merged-by-bors This PR was explicitly merged by bors. label Nov 13, 2025

bors merged commit 503dce3 into rust-lang:main Nov 13, 2025
12 checks passed

rustbot added this to the 1.93.0 milestone Nov 13, 2025

This was referenced Nov 13, 2025

Include arguments to the precondition check in failure messages #134938

Draft

Compute jump threading opportunities in a single pass #142821

Merged

Stabilize char_max_len #145610

Merged

bors mentioned this pull request Nov 13, 2025

Ignore #[doc(hidden)] items when computing trimmed paths for printing #148623

Open

m-ou-se deleted the new-fmt-args-alt branch November 13, 2025 09:57

tautschnig added a commit to tautschnig/kani that referenced this pull request Nov 17, 2025

Upgrade Rust toolchain to 2025-11-16

eb908b8

Relevant upstream PR: - rust-lang/rust#148789 (New format_args!() and fmt::Arguments implementation) Resolves: model-checking#4474

tautschnig mentioned this pull request Nov 17, 2025

Upgrade Rust toolchain to 2025-11-16 model-checking/kani#4477

Merged

rustbot added the perf-regression-triaged The performance regression has been triaged. label Nov 19, 2025

RoloEdits mentioned this pull request Nov 22, 2025

feat: Inline Git Blame helix-editor/helix#13133

Open

meithecatte mentioned this pull request Nov 22, 2025

[AVR] core::fmt prints out-of-bounds memory on microcontrollers with more than 128KB of program memory #149223

Open

oxalica mentioned this pull request Nov 23, 2025

generated call to core::fmt::Arguments::new_const fails to constant fold #128709

Closed

itsjunetime mentioned this pull request Nov 28, 2025

Can't resolve new_v1_formatted associated type symbol after 2025-11-14 nightly toolchain rust-lang/rust-analyzer#21163

Open
