Overhead

Clark Gaebel edited this page Apr 23, 2022 · 2 revisions

In our experience, magic-trace tends to make applications 2%-10% slower, and the application stalls for ~10us each time magic-trace takes a snapshot. All in, I think it's fair to say that magic-trace has less overhead than perf record -g, and more overhead than perf record --call-graph lbr.

Overhead comes from two things: memory bandwidth consumption and a breakpoint when magic-trace takes a snapshot.
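To make the two costs concrete, here is a back-of-the-envelope sketch in C. It applies the 2%-10% slowdown range and the ~10us snapshot stall quoted on this page to a request latency. The helper names and the simple percentage model are our own illustration, not part of magic-trace, and real overhead will vary with the workload.

```c
#include <assert.h>
#include <stdint.h>

/* Figures quoted on this page: rough experience, not guarantees. */
#define SLOWDOWN_MIN_PCT 2u
#define SLOWDOWN_MAX_PCT 10u
#define SNAPSHOT_STALL_NS 10000u /* ~10us */

/* Hypothetical helper: latency after applying a percentage slowdown. */
uint64_t traced_latency_ns(uint64_t base_ns, unsigned slowdown_pct) {
    assert(slowdown_pct <= 100u);
    return base_ns * (100u + slowdown_pct) / 100u;
}

/* Hypothetical helper: worst case for the one request that happens
 * to be interrupted by a snapshot. */
uint64_t worst_case_with_snapshot_ns(uint64_t base_ns) {
    return traced_latency_ns(base_ns, SLOWDOWN_MAX_PCT) + SNAPSHOT_STALL_NS;
}
```

Under these assumptions, a request that normally takes 500us would take roughly 510us to 550us while being traced, and about 560us if a snapshot lands inside it.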

Memory bandwidth

The "2%-10%" overhead comes mostly, and perhaps entirely, from Intel PT's memory bandwidth usage.

In our experience, Intel PT (and therefore magic-trace) uses hundreds of Mbps of memory bandwidth to construct its traces. This is usually fine. If Intel PT notices that the trace would saturate memory bandwidth, it pauses tracing, which leaves a gap in the trace. Momentarily saturating memory bandwidth in this way is the number one reason people see "Decode Errors" in their traces.

You can decrease magic-trace's memory bandwidth consumption by decreasing the timing resolution, at the cost of coarser timestamps in the trace.

Breakpoint

When magic-trace takes its snapshot, it interrupts the application for ~10us. To keep that stall from affecting users of your application, we recommend triggering snapshots on a function that is called only after a user's request has been completely serviced, so the interruption lands between requests rather than inside one.
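A minimal sketch of that pattern in C follows. It assumes you can point magic-trace's snapshot trigger at a symbol of your choosing. The function name request_done_snapshot_hook, the 1ms threshold, and the counter are all hypothetical and exist only for illustration; the noinline attribute is a GCC/Clang extension used so the symbol survives optimization.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical latency budget: only slow requests ask for a snapshot. */
#define SLOW_REQUEST_NS 1000000u /* 1ms */

/* Counts trigger calls so the sketch has an observable effect. */
unsigned long snapshot_triggers = 0;

/* Hypothetical trigger function. magic-trace would be told to take its
 * snapshot when this symbol is called. It does no real work. */
__attribute__((noinline)) void request_done_snapshot_hook(void) {
    snapshot_triggers++;
}

/* Call this only after the response has been fully sent, so the ~10us
 * stall is not added to the latency the user observes. */
void after_request(uint64_t started_ns, uint64_t finished_ns) {
    assert(finished_ns >= started_ns);
    if (finished_ns - started_ns > SLOW_REQUEST_NS) {
        request_done_snapshot_hook();
    }
}
```

Calling the hook only for slow requests also means the snapshot captures the events leading up to the interesting request, while fast requests never pay for the stall.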