jsgf
Contributor

@jsgf jsgf commented Oct 1, 2025

As part of Rust's move semantics, the compiler will generate memory copy operations to move objects about. These are generally pretty small, and the backend is good at optimizing them. But sometimes, if the type is large, they can end up being surprisingly expensive. In such cases, you might want to pass them by reference, or Box them up.
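To make the cost concrete, here is a minimal sketch contrasting the three ways of handing over a large value. The `Big` type and all function names are hypothetical and not part of this PR; whether the by-value call really lowers to a `memcpy` depends on the optimizer.

```rust
use std::mem::size_of;

/// Hypothetical large type: 1000 * 8 = 8000 bytes.
pub struct Big {
    pub data: [u64; 1000],
}

/// Takes `Big` by value: the caller may have to copy all 8000 bytes.
pub fn sum_by_value(big: Big) -> u64 {
    big.data.iter().sum()
}

/// Takes a reference: only a pointer is passed.
pub fn sum_by_ref(big: &Big) -> u64 {
    big.data.iter().sum()
}

/// Takes a box: moving the box moves a pointer, not the payload.
pub fn sum_boxed(big: Box<Big>) -> u64 {
    big.data.iter().sum()
}

/// Bytes transferred when moving a `Big` directly vs. moving a `Box<Big>`.
pub fn moved_sizes() -> (usize, usize) {
    (size_of::<Big>(), size_of::<Box<Big>>())
}
```

On a 64-bit target `moved_sizes()` reports 8000 bytes for the direct move against 8 bytes for the boxed one, which is the gap the annotations are meant to surface.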

However, these moves are also invisible to profiling. At best they appear as a memcpy, but one memcpy is basically indistinguishable from another, and it's very hard to know that 1) it's actually a compiler-generated copy, and 2) what type it pertains to.

This PR adds two new pseudo-intrinsic functions in core::intrinsics:

pub fn compiler_move<T, const SIZE: usize>(_src: *const T, _dst: *mut T);
pub fn compiler_copy<T, const SIZE: usize>(_src: *const T, _dst: *mut T);

These functions are never actually called, however. A MIR transform pass -- instrument_moves.rs -- will locate all Operand::Move/Copy operations, and modify their source location to make them appear as if they had been inlined from compiler_move/_copy.

These functions have two generic parameters: the type being copied, and its size in bytes. This should make it very easy to identify which types are being expensive in your program (both in aggregate, and at specific hotspots). The size isn't strictly necessary since you can derive it from the type, but it's small and it makes it easier to understand what you're looking at.
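As a rough illustration of why the redundant size parameter is convenient, the sketch below builds a label with the same shape as the frame names shown in this PR. `frame_label` is a hypothetical helper for illustration only: rustc produces the real names through debug info, and the exact text returned by `std::any::type_name` is not guaranteed to be stable.

```rust
use std::any::type_name;
use std::mem::size_of;

/// Builds a label shaped like `compiler_copy<[u64; 1000], 8000>`.
/// Illustrative only: this is not how rustc generates the frame name.
pub fn frame_label<T>(kind: &str) -> String {
    // The size is derivable from `T`, but spelling it out saves the reader
    // from having to compute the layout of an unfamiliar type.
    format!("compiler_{}<{}, {}>", kind, type_name::<T>(), size_of::<T>())
}
```

Calling `frame_label::<[u64; 1000]>("copy")` yields a label ending in `, 8000>`, so the byte count is visible without looking up the type.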

This functionality is only enabled if you have debug info generation enabled, and also set the -Zinstrument-moves option.

It does not instrument all moves. By default it will only annotate ones for types over 64 bytes. The -Zinstrument-moves-size-limit specifies the size in bytes to start instrumenting for.
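The threshold check can be sketched as follows. This assumes the limit is inclusive, so that a limit of 65 corresponds to "types over 64 bytes"; that reading is inferred from the wording here, and the constant and function names are hypothetical rather than taken from the pass itself.

```rust
use std::mem::size_of;

/// Assumed default: annotate types of 65 bytes or more ("over 64 bytes").
pub const DEFAULT_SIZE_LIMIT: usize = 65;

/// Returns true if a move or copy of `T` would be annotated under `limit`.
/// Assumption: `limit` is the smallest size that gets annotated.
pub fn would_annotate<T>(limit: usize) -> bool {
    size_of::<T>() >= limit
}
```

Under this reading, `[u8; 64]` is skipped by default while `[u8; 65]` is annotated, and lowering the limit to 8 starts covering `u64`-sized values.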

This has minimal impact on the size of debugging info. For rustc itself, the overall increase in librustc_driver*.so size is around 0.05% for a 65-byte limit, 0.004% for a 1025-byte limit, and a worst case of 0.6% for an 8-byte limit.

There's no effect on generated code; it only adds debug info.

As an example of a backtrace:

Breakpoint 1.3, __memcpy_avx512_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:255
255	ENTRY_P2ALIGN (MEMMOVE_SYMBOL (__memmove, unaligned_erms), 6)
(gdb) bt
#0  __memcpy_avx512_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:255
#1  0x0000555555590e7e in core::intrinsics::compiler_copy<[u64; 1000], 8000> () at library/core/src/intrinsics/mod.rs:10
#2  t::main () at t.rs:10
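
The `t.rs` behind this backtrace is not shown. A hypothetical stand-in that performs the same kind of copy, an 8000-byte `[u64; 1000]` duplicated by value, might look like the sketch below. The function name is invented, and whether the copy actually becomes a `memcpy` call depends on optimization level and target.

```rust
use std::hint::black_box;

/// Copies an 8000-byte array by value. `[u64; 1000]` is `Copy`, so the
/// assignment below is a copy rather than a move; `black_box` discourages
/// the optimizer from removing it.
pub fn copy_large_array() -> u64 {
    let original = black_box([1u64; 1000]);
    let duplicate = original; // 8000-byte copy; may lower to memcpy
    black_box(&duplicate);
    // Use both arrays so that neither is dead.
    original.iter().sum::<u64>() + duplicate.iter().sum::<u64>()
}
```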

@rustbot
Collaborator

rustbot commented Oct 1, 2025

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

Some changes occurred to the intrinsics. Make sure the CTFE / Miri interpreter
gets adapted for the changes, if necessary.

cc @rust-lang/miri, @RalfJung, @oli-obk, @lcnr

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Oct 1, 2025
@rustbot
Collaborator

rustbot commented Oct 1, 2025

r? @davidtwco

rustbot has assigned @davidtwco.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@jsgf jsgf force-pushed the move-copy-debug branch from 960847e to 5cc27d4 Compare October 1, 2025 02:22
@jsgf
Contributor Author

jsgf commented Oct 1, 2025

I'm not really sure whether Statement::Assign's rvalues covers all the interesting cases or not. I'd like to make sure it covers:

  • parameter passing (I was missing TerminatorKind::Call)
  • returns
  • assignment
  • initialization
  • ...anything else?

@jsgf jsgf force-pushed the move-copy-debug branch from 5cc27d4 to db64712 Compare October 1, 2025 04:37
@RalfJung
Member

RalfJung commented Oct 1, 2025

Interesting idea!

Is it really worth distinguishing moves and copies? That doesn't make much of a difference for the runtime code, it's mostly a type system thing.

> I'm not really sure whether Statement::Assign's rvalues covers all the interesting cases or not. I'd like to make sure it covers:

In MIR these will be spread across various places. The codegen backend would have an easier time centralizing all the ways in which operand uses get codegen'd as memcpy. But I am not sure if there's still a good way to adjust debuginfo there...

Isn't there a mutating MIR visitor you can use that traverses all operands?

Comment on lines 3318 to 3340
/// Compiler-generated move operation - never actually called.
/// Used solely for profiling and debugging visibility.
///
/// This function serves as a symbolic marker that appears in stack traces
/// when rustc generates move operations, making them visible in profilers.
/// The SIZE parameter encodes the size of the type being moved in the function name.
#[rustc_force_inline]
#[rustc_diagnostic_item = "compiler_move"]
pub fn compiler_move<T, const SIZE: usize>(_src: *const T, _dst: *mut T) {
    unreachable!("compiler_move should never be called - it's only for debug info")
}

/// Compiler-generated copy operation - never actually called.
/// Used solely for profiling and debugging visibility.
///
/// This function serves as a symbolic marker that appears in stack traces
/// when rustc generates copy operations, making them visible in profilers.
/// The SIZE parameter encodes the size of the type being copied in the function name.
#[rustc_force_inline]
#[rustc_diagnostic_item = "compiler_copy"]
pub fn compiler_copy<T, const SIZE: usize>(_src: *const T, _dst: *mut T) {
    unreachable!("compiler_copy should never be called - it's only for debug info")
}
Member


These aren't intrinsics so I don't think this is the best place for them. The file is already too big anyway.^^

Contributor Author


Yeah I wasn't sure exactly where to put them. Originally I had the idea of actually making them real functions implementing copy & move in terms of calls to them, but that seemed more fiddly than it's worth.

@jsgf
Contributor Author

jsgf commented Oct 1, 2025

> Is it really worth distinguishing moves and copies? That doesn't make much of a difference for the runtime code, it's mostly a type system thing.

I think in practice it's useful - I've seen very large structures being made Copy just because all their fields allow it and then being copied around unexpectedly. It would be nice to be able to see that, and distinguish it from regular Move.
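
A small sketch of the pattern described above: a struct that derives `Copy` only because its fields permit it, and then gets duplicated silently. The types and functions here are hypothetical examples, not code from any project mentioned in this thread.

```rust
/// Every field is `Copy`, so deriving `Copy` compiles even though the
/// struct is 4096 bytes.
#[derive(Clone, Copy)]
pub struct Table {
    pub entries: [u32; 1024],
}

/// Same layout without `Copy`: passing it by value is a move.
pub struct MoveOnlyTable {
    pub entries: [u32; 1024],
}

pub fn first(t: Table) -> u32 {
    t.entries[0]
}

pub fn first_move(t: MoveOnlyTable) -> u32 {
    t.entries[0]
}

pub fn demo() -> (u32, u32, u32) {
    let table = Table { entries: [7; 1024] };
    let a = first(table); // implicit copy: `table` stays usable
    let b = first(table); // a second implicit copy, with no visible `.clone()`
    let owned = MoveOnlyTable { entries: [9; 1024] };
    let c = first_move(owned); // move: `owned` cannot be used afterwards
    (a, b, c)
}
```

Nothing in the source of `demo` marks the two calls to `first` as 4096-byte copies, which is why a separate `compiler_copy` marker can be informative.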

> The codegen backend would have an easier time centralizing all the ways in which operand uses get codegen'd as memcpy. But I am not sure if there's still a good way to adjust debuginfo there...

That was actually my first attempt but I ended up with a stream of mysterious crashes/assertion failures from within the guts of llvm. Doing the manipulations at the MIR level turned out to be much more straightforward.

> Isn't there a mutating MIR visitor you can use that traverses all operands?

I'll take another look.

@jsgf jsgf force-pushed the move-copy-debug branch from db64712 to af357e2 Compare October 1, 2025 08:10
@jsgf
Contributor Author

jsgf commented Oct 1, 2025

@RalfJung

  • I created core::profiling to hold compiler_copy/compiler_move
  • I looked at the visit_operand visitor; it's appealing but it doesn't give access to change the containing source info.
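
The limitation in the second point can be illustrated with a toy model. The types below are simplified stand-ins invented for this sketch and are not rustc's MIR definitions: the point is only that source info lives on the enclosing statement, so a callback that receives just the operand has nothing to rewrite.

```rust
/// Toy stand-ins for MIR types; these are NOT rustc's definitions.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Operand {
    Copy { size: usize },
    Move { size: usize },
    Constant,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Scope {
    Original,
    InlinedCompilerCopy,
    InlinedCompilerMove,
}

pub struct Statement {
    pub scope: Scope, // stands in for the statement's source info
    pub operand: Operand,
}

/// An operand-only callback would see `&Operand` but not the enclosing
/// statement, so it could not reach `scope`. Walking statements can.
pub fn annotate(body: &mut [Statement], limit: usize) {
    for stmt in body.iter_mut() {
        stmt.scope = match stmt.operand {
            Operand::Copy { size } if size >= limit => Scope::InlinedCompilerCopy,
            Operand::Move { size } if size >= limit => Scope::InlinedCompilerMove,
            _ => stmt.scope, // small or constant operands are left alone
        };
    }
}
```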

@jsgf
Contributor Author

jsgf commented Oct 1, 2025

> This has minimal impact on the size of debugging info. For rustc itself, the overall increase in librustc_driver*.so size is around .05% for 65 byte limit, 0.004% for 1025 byte limit, and a worst case of 0.6% for an 8 byte limit.

I was missing parameter-passing moves the first time around, so it's a little larger now: about 0.2% for a 65-byte limit, almost nothing for 1024, and closer to 1% for an 8-byte limit.

@jsgf jsgf force-pushed the move-copy-debug branch from af357e2 to 2ba9b47 Compare October 1, 2025 19:06
As part of Rust's move semantics, the compiler will generate memory copy
operations to move objects about. These are generally pretty small, and
the backend is good at optimizing them. But sometimes, if the type is
large, they can end up being surprisingly expensive. In such cases, you
might want to pass them by reference, or Box them up.

However, these moves are also invisible to profiling. At best they
appear as a `memcpy`, but one memcpy is basically indistinguishable from
another, and it's very hard to know that 1) it's actually a
compiler-generated copy, and 2) what type it pertains to.

This PR adds two new pseudo-functions in `core::profiling`:
```
pub fn compiler_move<T, const SIZE: usize>(_src: *const T, _dst: *mut T);
pub fn compiler_copy<T, const SIZE: usize>(_src: *const T, _dst: *mut T);
```
These functions are never actually called, however. A MIR transform
pass -- `instrument_moves.rs` -- will locate all `Operand::Move`/`Copy`
operations, and modify their source location to make them appear as if
they had been inlined from `compiler_move`/`_copy`.

These functions have two generic parameters: the type being copied, and
its size in bytes. This should make it very easy to identify which types
are being expensive in your program (both in aggregate, and at specific
hotspots). The size isn't strictly necessary since you can derive it
from the type, but it's small and it makes it easier to understand what
you're looking at.

This functionality is only enabled if you have debug info generation
enabled, and also set the `-Zinstrument-moves` option.

It does not instrument all moves. By default it will only annotate ones
for types over 64 bytes. The `-Zinstrument-moves-size-limit` specifies
the size in bytes to start instrumenting for.

This has minimal impact on the size of debugging info. For rustc itself,
the overall increase in librustc_driver*.so size is around 0.05% for a
65-byte limit, 0.004% for a 1025-byte limit, and a worst case of 0.6% for
an 8-byte limit.

There's no effect on generated code; it only adds debug info.

As an example of a backtrace:
```
Breakpoint 1.3, __memcpy_avx512_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:255
255	ENTRY_P2ALIGN (MEMMOVE_SYMBOL (__memmove, unaligned_erms), 6)
(gdb) bt
#0  __memcpy_avx512_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:255
#1  0x0000555555590e7e in core::profiling::compiler_copy<[u64; 1000], 8000> () at library/core/src/profiling.rs:27
#2  t::main () at t.rs:10
```
@jsgf jsgf force-pushed the move-copy-debug branch from 2ba9b47 to 0fa7acf Compare October 2, 2025 02:56
@rust-log-analyzer
Collaborator

The job aarch64-gnu-llvm-20-1 failed!

Possible cause of the failure (guessed by this bot):

---- [mir-opt] tests/mir-opt/instrument-moves/basic.rs stdout ----
8     }
9 
10     bb0: {
-         _2 = helper(move _1) -> [return: bb1, unwind unreachable];
+         _2 = helper(move _1) -> [return: bb1, unwind continue];
12     }
13 
14     bb1: {


thread '[mir-opt] tests/mir-opt/instrument-moves/basic.rs' panicked at src/tools/compiletest/src/runtest/mir_opt.rs:84:21:
Actual MIR output differs from expected MIR output /checkout/tests/mir-opt/instrument-moves/basic.test_call_arg.InstrumentMoves.after.mir
stack backtrace:
   5: __rustc::rust_begin_unwind
             at /rustc/bb624dcb4c8ab987e10c0808d92d76f3b84dd117/library/std/src/panicking.rs:698:5
   6: core::panicking::panic_fmt
             at /rustc/bb624dcb4c8ab987e10c0808d92d76f3b84dd117/library/core/src/panicking.rs:75:14
   7: <compiletest::runtest::TestCx>::run_revision
   8: compiletest::runtest::run
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
---- [mir-opt] tests/mir-opt/instrument-moves/basic.rs stdout end ----
---- [mir-opt] tests/mir-opt/instrument-moves/iter.rs stdout ----
2 
3 fn test_impl_trait_return() -> std::iter::Chain<Map<std::array::IntoIter<u64, 5>, fn(u64) -> u64>, Map<std::array::IntoIter<u64, 5>, fn(u64) -> u64>> {
4     let mut _0: std::iter::Chain<std::iter::Map<std::array::IntoIter<u64, 5>, fn(u64) -> u64>, std::iter::Map<std::array::IntoIter<u64, 5>, fn(u64) -> u64>>;
-     scope 1 (inlined make_large_iter) {
-         let mut _1: [u64; 5];
-         let mut _4: std::array::IntoIter<u64, 5>;
-         let mut _5: fn(u64) -> u64;
-         let mut _6: std::iter::Map<std::array::IntoIter<u64, 5>, fn(u64) -> u64>;
-         let mut _7: [u64; 5];
-         let mut _10: std::array::IntoIter<u64, 5>;
-         let mut _11: fn(u64) -> u64;
-         let mut _12: std::iter::Map<std::array::IntoIter<u64, 5>, fn(u64) -> u64>;
-         scope 2 (inlined array::iter::<impl IntoIterator for [u64; 5]>::into_iter) {
-             debug self => _1;
-             let _2: [std::mem::MaybeUninit<u64>; 5];
-             scope 3 {
-                 let _3: std::array::iter::iter_inner::PolymorphicIter<[std::mem::MaybeUninit<u64>; 5]>;
-                 scope 4 {
-                     scope 21 (inlined core::profiling::compiler_copy::<array::iter::iter_inner::PolymorphicIter<[MaybeUninit<u64>; 5]>, 48>) {
-                         scope 33 (inlined core::profiling::compiler_copy::<array::iter::iter_inner::PolymorphicIter<[MaybeUninit<u64>; 5]>, 48>) {
-                         }
-                     }
-                 }
-                 scope 5 (inlined ops::index_range::IndexRange::zero_to) {
-                 }
-                 scope 6 (inlined array::iter::iter_inner::PolymorphicIter::<[MaybeUninit<u64>; 5]>::new_unchecked) {
-                     scope 20 (inlined core::profiling::compiler_copy::<[MaybeUninit<u64>; 5], 40>) {
-                         scope 32 (inlined core::profiling::compiler_copy::<[MaybeUninit<u64>; 5], 40>) {
-                         }
-                     }
-                 }
-             }
-             scope 19 (inlined core::profiling::compiler_copy::<[u64; 5], 40>) {
-                 scope 31 (inlined core::profiling::compiler_copy::<[u64; 5], 40>) {
-                 }
-             }
-         }
-         scope 7 (inlined <std::array::IntoIter<u64, 5> as Iterator>::map::<u64, fn(u64) -> u64>) {
-             debug self => _4;
-             debug f => _5;
-             scope 8 (inlined Map::<std::array::IntoIter<u64, 5>, fn(u64) -> u64>::new) {
-                 scope 22 (inlined core::profiling::compiler_copy::<std::array::IntoIter<u64, 5>, 48>) {
-                     scope 34 (inlined core::profiling::compiler_copy::<std::array::IntoIter<u64, 5>, 48>) {
-                     }
-                 }
-             }
-         }
-         scope 9 (inlined array::iter::<impl IntoIterator for [u64; 5]>::into_iter) {
-             debug self => _7;
-             let _8: [std::mem::MaybeUninit<u64>; 5];
-             scope 10 {
-                 let _9: std::array::iter::iter_inner::PolymorphicIter<[std::mem::MaybeUninit<u64>; 5]>;
-                 scope 11 {
-                     scope 25 (inlined core::profiling::compiler_copy::<array::iter::iter_inner::PolymorphicIter<[MaybeUninit<u64>; 5]>, 48>) {
-                         scope 37 (inlined core::profiling::compiler_copy::<array::iter::iter_inner::PolymorphicIter<[MaybeUninit<u64>; 5]>, 48>) {
-                         }
-                     }
-                 }
-                 scope 12 (inlined ops::index_range::IndexRange::zero_to) {
-                 }
-                 scope 13 (inlined array::iter::iter_inner::PolymorphicIter::<[MaybeUninit<u64>; 5]>::new_unchecked) {
-                     scope 24 (inlined core::profiling::compiler_copy::<[MaybeUninit<u64>; 5], 40>) {
-                         scope 36 (inlined core::profiling::compiler_copy::<[MaybeUninit<u64>; 5], 40>) {
-                         }
-                     }
-                 }
-             }
-             scope 23 (inlined core::profiling::compiler_copy::<[u64; 5], 40>) {
-                 scope 35 (inlined core::profiling::compiler_copy::<[u64; 5], 40>) {
-                 }
-             }
-         }
-         scope 14 (inlined <std::array::IntoIter<u64, 5> as Iterator>::map::<u64, fn(u64) -> u64>) {
-             debug self => _10;
-             debug f => _11;
-             scope 15 (inlined Map::<std::array::IntoIter<u64, 5>, fn(u64) -> u64>::new) {
-                 scope 26 (inlined core::profiling::compiler_copy::<std::array::IntoIter<u64, 5>, 48>) {
-                     scope 38 (inlined core::profiling::compiler_copy::<std::array::IntoIter<u64, 5>, 48>) {
-                     }
-                 }
-             }
-         }
-         scope 16 (inlined <Map<std::array::IntoIter<u64, 5>, fn(u64) -> u64> as Iterator>::chain::<Map<std::array::IntoIter<u64, 5>, fn(u64) -> u64>>) {
-             debug self => _6;
-             debug other => _12;
-             scope 17 (inlined std::iter::Chain::<Map<std::array::IntoIter<u64, 5>, fn(u64) -> u64>, Map<std::array::IntoIter<u64, 5>, fn(u64) -> u64>>::new) {
-                 debug a => _6;
-                 debug b => _12;
-                 let mut _13: std::option::Option<std::iter::Map<std::array::IntoIter<u64, 5>, fn(u64) -> u64>>;
-                 let mut _14: std::option::Option<std::iter::Map<std::array::IntoIter<u64, 5>, fn(u64) -> u64>>;
-                 scope 27 (inlined core::profiling::compiler_copy::<Map<std::array::IntoIter<u64, 5>, fn(u64) -> u64>, 52>) {
-                     scope 39 (inlined core::profiling::compiler_copy::<Map<std::array::IntoIter<u64, 5>, fn(u64) -> u64>, 52>) {
-                     }
-                 }
-                 scope 28 (inlined core::profiling::compiler_copy::<Map<std::array::IntoIter<u64, 5>, fn(u64) -> u64>, 52>) {
-                     scope 40 (inlined core::profiling::compiler_copy::<Map<std::array::IntoIter<u64, 5>, fn(u64) -> u64>, 52>) {
-                     }
-                 }
-                 scope 29 (inlined core::profiling::compiler_move::<Option<Map<std::array::IntoIter<u64, 5>, fn(u64) -> u64>>, 52>) {
-                     scope 30 (inlined core::profiling::compiler_move::<Option<Map<std::array::IntoIter<u64, 5>, fn(u64) -> u64>>, 52>) {
-                         scope 41 (inlined core::profiling::compiler_move::<Option<Map<std::array::IntoIter<u64, 5>, fn(u64) -> u64>>, 52>) {
-                             scope 42 (inlined core::profiling::compiler_move::<Option<Map<std::array::IntoIter<u64, 5>, fn(u64) -> u64>>, 52>) {
-                             }
-                         }
-                     }
-                 }
-             }
-             scope 18 (inlined <Map<std::array::IntoIter<u64, 5>, fn(u64) -> u64> as IntoIterator>::into_iter) {
-                 debug self => _12;
-             }
-         }
-     }
114 
115     bb0: {
-         StorageLive(_12);
-         StorageLive(_6);
-         StorageLive(_4);
-         StorageLive(_1);
-         _1 = [const 1_u64, const 2_u64, const 3_u64, const 4_u64, const 5_u64];
-         StorageLive(_2);
-         StorageLive(_3);
-         _2 = copy _1 as [std::mem::MaybeUninit<u64>; 5] (Transmute);
-         _3 = array::iter::iter_inner::PolymorphicIter::<[MaybeUninit<u64>; 5]> { alive: const ops::index_range::IndexRange {{ start: 0_usize, end: 5_usize }}, data: copy _2 };
-         _4 = std::array::IntoIter::<u64, 5> { inner: copy _3 };
-         StorageDead(_3);
-         StorageDead(_2);
-         StorageDead(_1);
-         StorageLive(_5);
-         _5 = make_large_iter::double as fn(u64) -> u64 (PointerCoercion(ReifyFnPointer, AsCast));
-         _6 = Map::<std::array::IntoIter<u64, 5>, fn(u64) -> u64> { iter: copy _4, f: copy _5 };
-         StorageDead(_5);
-         StorageDead(_4);
-         StorageLive(_10);
-         StorageLive(_7);
-         _7 = [const 6_u64, const 7_u64, const 8_u64, const 9_u64, const 10_u64];
-         StorageLive(_8);
-         StorageLive(_9);
-         _8 = copy _7 as [std::mem::MaybeUninit<u64>; 5] (Transmute);
-         _9 = array::iter::iter_inner::PolymorphicIter::<[MaybeUninit<u64>; 5]> { alive: const ops::index_range::IndexRange {{ start: 0_usize, end: 5_usize }}, data: copy _8 };
-         _10 = std::array::IntoIter::<u64, 5> { inner: copy _9 };
-         StorageDead(_9);
-         StorageDead(_8);
-         StorageDead(_7);
-         StorageLive(_11);
-         _11 = make_large_iter::double as fn(u64) -> u64 (PointerCoercion(ReifyFnPointer, AsCast));
-         _12 = Map::<std::array::IntoIter<u64, 5>, fn(u64) -> u64> { iter: copy _10, f: copy _11 };
-         StorageDead(_11);
-         StorageDead(_10);
-         StorageLive(_13);
-         _13 = Option::<Map<std::array::IntoIter<u64, 5>, fn(u64) -> u64>>::Some(copy _6);
-         StorageLive(_14);
-         _14 = Option::<Map<std::array::IntoIter<u64, 5>, fn(u64) -> u64>>::Some(copy _12);
-         _0 = std::iter::Chain::<Map<std::array::IntoIter<u64, 5>, fn(u64) -> u64>, Map<std::array::IntoIter<u64, 5>, fn(u64) -> u64>> { a: move _13, b: move _14 };
-         StorageDead(_14);
-         StorageDead(_13);
-         StorageDead(_6);
-         StorageDead(_12);
-         return;
+         _0 = make_large_iter() -> [return: bb1, unwind continue];
160     }
- }
162 
- ALLOC0 (size: 8, align: 4) {
-     00 00 00 00 05 00 00 00                         │ ........
+     bb1: {
+         return;
+     }
165 }
166 


thread '[mir-opt] tests/mir-opt/instrument-moves/iter.rs' panicked at src/tools/compiletest/src/runtest/mir_opt.rs:84:21:
Actual MIR output differs from expected MIR output /checkout/tests/mir-opt/instrument-moves/iter.test_impl_trait_return.InstrumentMoves.after.mir
stack backtrace:
   5: __rustc::rust_begin_unwind
             at /rustc/bb624dcb4c8ab987e10c0808d92d76f3b84dd117/library/std/src/panicking.rs:698:5
   6: core::panicking::panic_fmt
             at /rustc/bb624dcb4c8ab987e10c0808d92d76f3b84dd117/library/core/src/panicking.rs:75:14

@jsgf
Contributor Author

jsgf commented Oct 3, 2025

Hm, the mir tests seem very brittle. They were clean locally.

@saethlin
Member

saethlin commented Oct 3, 2025

All codegen tests have this brittleness. We build all the codegen test suites (mir-opt, codegen, assembly) against a sysroot which is compiled with whatever flags are set in the user-provided profile. The mir-opt suite tends to get blamed because we check in much more of the MIR than just the FileCheck annotations.

I suspect your test needs a //@ ignore-std-debug-assertions. The particular diff above is because the inliner changed behavior because the debug assertions changed the inlining cost, but even if you stabilized the inliner the MIR would depend on debug assertions.

@saethlin
Member

saethlin commented Oct 3, 2025

Note that annotation doesn't completely fix the problem; it just means that to bless this test you need to have debug-assertions-std = false in your bootstrap.toml, and CI won't run the test at all in jobs that enable std debug assertions.
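
For illustration, a mir-opt test carrying that annotation could look roughly like the sketch below. Only the `//@ ignore-std-debug-assertions` directive, the `helper`/`test_call_arg` names, and the expected-MIR file name come from this thread; the `Large` type, the function bodies, and the `compile-flags` line are guesses, and the actual `basic.rs` in the PR may differ (for example, it may need additional flags to enable debug info).

```rust
//@ ignore-std-debug-assertions
//@ compile-flags: -Zinstrument-moves

// Hypothetical 160-byte type, large enough to exceed the default limit.
pub struct Large {
    pub data: [u64; 20],
}

#[inline(never)]
pub fn helper(x: Large) -> u64 {
    x.data[0]
}

// EMIT_MIR basic.test_call_arg.InstrumentMoves.after.mir
pub fn test_call_arg(x: Large) -> u64 {
    // Passing `x` by value shows up in MIR as `helper(move _1)`.
    helper(x)
}
```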

@RalfJung
Member

RalfJung commented Oct 3, 2025

Is "instrument" the best term here? Usually that means to actually run some extra code for the to-be-instrumented operation, doesn't it? This here is just adding debuginfo.

Do new -Z flags need MCP? Judging from https://forge.rust-lang.org/compiler/proposals-and-stabilization.html#compiler-flags and assuming that this is meant not just for internal use by rustc developers, I think the answer is "yes".

@nnethercote
Contributor

Can -Zinstrument-moves and -Zinstrument-moves-size-limit be combined?

How can the added debuginfo be used to produce useful profiling information?

@jsgf
Contributor Author

jsgf commented Oct 3, 2025

@RalfJung:

> Is "instrument" the best term here? Usually that means to actually run some extra code for the to-be-instrumented operation, doesn't it? This here is just adding debuginfo.

Annotate?

> Do new -Z flags need MCP? Judging from https://forge.rust-lang.org/compiler/proposals-and-stabilization.html#compiler-flags and assuming that this is meant not just for internal use by rustc developers, I think the answer is "yes".

OK, I'll kick that off.

@nnethercote:

> Can -Zinstrument-moves and -Zinstrument-moves-size-limit be combined?

I guess in principle, but I was thinking that the default might not necessarily be a constant. For example, it could maybe use the target info to select the cache-line size, or some size threshold.

> How can the added debuginfo be used to produce useful profiling information?

The idea is that if you have a profiler sampling the pc/rip, it can use the debug info to unwind the stack frames and identify where each sample landed. Assuming the unwinder understands inlined functions, you'll see the core::profiling::compiler_move "call" along with the type information. That lets you both bucket how much time you're spending moving/copying a given type T in aggregate, and work out which objects are being moved around in a particular hotspot.
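
The aggregate bucketing could be sketched like this, assuming some profiler has already symbolized each sample into a list of frame names (innermost first). The input format and the `bucket_by_marker` function are hypothetical; only the `core::profiling::compiler_` prefix comes from this PR.

```rust
use std::collections::BTreeMap;

/// Prefix of the marker frames described in this PR.
const MARKER_PREFIX: &str = "core::profiling::compiler_";

/// Counts samples per marker frame. Each sample is a symbolized stack,
/// innermost frame first, as a hypothetical profiler might produce.
pub fn bucket_by_marker(samples: &[Vec<&str>]) -> BTreeMap<String, usize> {
    let mut buckets = BTreeMap::new();
    for stack in samples {
        // Attribute the sample to the innermost marker frame, if any.
        if let Some(frame) = stack.iter().find(|f| f.starts_with(MARKER_PREFIX)) {
            *buckets.entry(frame.to_string()).or_insert(0) += 1;
        }
    }
    buckets
}
```

Because the type and size are part of the frame name, the resulting map is already keyed by "which type, how big", and samples that hit an ordinary `memcpy` with no marker frame are left out.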

I still need to validate this in practice. I've managed to use gdb to show me stacks through these generated frames by setting breakpoints on memcpy (i.e. akin to the poor man's profiler technique for auditing moves), but I still need to try profiling a non-trivial codebase.

Actually rustc itself is a good candidate of course. Should I just do something like profile record ./x.py build or is there a better way to set that up?

@jsgf
Contributor Author

jsgf commented Oct 3, 2025

> I suspect your test needs a //@ ignore-std-debug-assertions. The particular diff above is because the inliner changed behavior because the debug assertions changed the inlining cost, but even if you stabilized the inliner the MIR would depend on debug assertions.

Ah I see, that explains it. I'll fix that up.

@RalfJung
Member

RalfJung commented Oct 3, 2025

> Annotate?

Hm, maybe. Or maybe something specifically involving debuginfo?

@jsgf
Contributor Author

jsgf commented Oct 6, 2025

I ended up renaming it to --annotate-moves and using a single option:

  • --annotate-moves - enable with default limit
  • --annotate-moves=true/false/on/off - explicitly enable/disable
  • --annotate-moves=1024 - enable with a specific size limit
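
The three forms above could be parsed along these lines. This is a sketch of the described behaviour, not the option parser in rustc; the `AnnotateMoves` enum, the function name, and the default of 65 bytes (taken from the "over 64 bytes" default mentioned earlier in the thread) are assumptions.

```rust
/// Assumed default size limit in bytes; the final default is not stated here.
pub const DEFAULT_LIMIT: u64 = 65;

#[derive(Debug, PartialEq)]
pub enum AnnotateMoves {
    Disabled,
    Enabled { size_limit: u64 },
}

/// Parses the value part of `--annotate-moves[=VALUE]`.
/// `None` means the flag was given without a value.
pub fn parse_annotate_moves(value: Option<&str>) -> Result<AnnotateMoves, String> {
    match value {
        // Bare flag or an explicit "yes": enable with the default limit.
        None | Some("true") | Some("on") => Ok(AnnotateMoves::Enabled {
            size_limit: DEFAULT_LIMIT,
        }),
        Some("false") | Some("off") => Ok(AnnotateMoves::Disabled),
        // Anything else must be a byte count.
        Some(other) => other
            .parse::<u64>()
            .map(|size_limit| AnnotateMoves::Enabled { size_limit })
            .map_err(|e| format!("invalid --annotate-moves value `{other}`: {e}")),
    }
}
```

Folding the limit into the same option keeps a single switch while still leaving room for a default that is not a fixed constant.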

Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue.

7 participants