tracing: move macro callsite impls out of macro expansion #869
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
We could factor more code out of the macros by encapsulating the |
LGTM, but I'm just asking for some corrections to the (private) doc comments.
Thanks, @davidbarsky, that _was_, in fact, what I meant... Co-authored-by: David Barsky <me@davidbarsky.com>
😎
### Fixed

- Fixed a bug where `LevelFilter::OFF` (and thus also the `static_max_level_off` feature flag) would enable *all* traces, rather than *none* (#853)
- **log**: Fixed `tracing` macros and `Span`s not checking `log::max_level` before emitting `log` records (#870)

### Changed

- **macros**: Macros now check the global max level (`LevelFilter::current`) before the per-callsite cache when determining if a span or event is enabled. This significantly improves performance in some use cases (#853)
- **macros**: Simplified the code generated by macro expansion significantly, which may improve compile times and/or `rustc` optimization of surrounding code (#869)
- **macros**: Macros now check the static max level before checking any runtime filtering, improving performance when a span or event is disabled by a `static_max_level_XXX` feature flag (#868)
- `LevelFilter` is now a re-export of the `tracing_core::LevelFilter` type, so it can now be used interchangeably with the versions in `tracing-core` and `tracing-subscriber` (#853)
- Significant performance improvements when comparing `LevelFilter`s and `Level`s (#853)
- Updated the minimum `tracing-core` dependency to 0.1.12 (#853)

### Added

- **macros**: Quoted string literals may now be used as field names, to allow fields whose names are not valid Rust identifiers (#790)
- **docs**: Several documentation improvements (#850, #857, #841)
- `LevelFilter::current()` function, which returns the highest level that any subscriber will enable (#853)
- `Subscriber::max_level_hint` optional trait method, for setting the value returned by `LevelFilter::current()` (#853)

Thanks to new contributors @cuviper, @ethanboxx, @ben0x539, @dignati, @colelawrence, and @rbtcollins for helping out with this release!

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
This picks up upstream changes tokio-rs/tracing#853, tokio-rs/tracing#868, and tokio-rs/tracing#869 which improve performance in some use cases. The overhead removed by these changes may already be amortized enough in the proxy that it's not a problem, but it seems worth picking up regardless. Signed-off-by: Eliza Weisman <eliza@buoyant.io>
## Motivation

Currently, every `tracing` macro generates a new implementation of the
`Callsite` trait for a zero-sized struct created for that particular
callsite. This callsite accesses several statics defined in the macro
expansion.
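
To make the shape of the problem concrete, here is a minimal, std-only sketch of that pattern. It is not the real `tracing` expansion: `Metadata` and `Callsite` are simplified stand-ins for the `tracing-core` types, and `old_style_callsite!`, `MyCallsite`, and `two_callsites` are hypothetical names chosen for illustration.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Once;

// Simplified stand-ins for the real `tracing-core` types (assumptions).
pub struct Metadata {
    pub name: &'static str,
    pub level: u8,
}

pub trait Callsite {
    fn set_interest(&self, interest: usize);
    fn metadata(&self) -> &Metadata;
}

// Hypothetical macro mimicking the old pattern: every invocation emits
// its own statics, its own zero-sized type, and its own trait impl.
macro_rules! old_style_callsite {
    ($name:expr, $level:expr) => {{
        static META: Metadata = Metadata { name: $name, level: $level };
        static INTEREST: AtomicUsize = AtomicUsize::new(0);
        static REGISTRATION: Once = Once::new();

        struct MyCallsite;
        impl Callsite for MyCallsite {
            fn set_interest(&self, interest: usize) {
                INTEREST.store(interest, Ordering::SeqCst);
            }
            fn metadata(&self) -> &Metadata {
                &META
            }
        }

        // Register once, then read back the cached interest.
        REGISTRATION.call_once(|| MyCallsite.set_interest(1));
        (MyCallsite.metadata().name, INTEREST.load(Ordering::Relaxed))
    }};
}

// Two invocations produce two separate types, impls, and sets of statics,
// all of which land inside this one function body.
pub fn two_callsites() -> [(&'static str, usize); 2] {
    [old_style_callsite!("first", 2), old_style_callsite!("second", 3)]
}
```

Each additional macro invocation in a function repeats all of the generated items, which is where the code growth comes from.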
This means that each `tracing` macro expands to a lot of code, which is
visible in the `cargo expand` output.

More code in the macro expansion means more code in the function
invoking the macro, which may make that function harder for `rustc` to
optimize. This affects the performance of other code in the function,
not the `tracing` code, so this isn't necessarily visible in `tracing`'s
microbenchmarks, which only contain `tracing` code.

In `rustc` itself, there is a small but noticeable performance impact
from switching from `log` to `tracing` even after making changes that
should make the filtering overhead equivalent:
rust-lang/rust#74726 (comment).
This appears to be due to more complex generated code impacting
optimizer behavior.
## Solution

This branch moves the callsite generated by each macro out of the macro
expansion and into a single `MacroCallsite` private API type in the
`__macro_support` module. Instead of creating a zero-sized `Callsite`
static and multiple statics for the `Metadata`, the `Once` cell for
registration, and the `Interest` atomic, these are all now fields on the
`Callsite` struct. This shouldn't result in any real change, but makes
the implementation simpler. All the hot filtering functions on
`MacroCallsite` are `#[inline(always)]`, so we shouldn't be adding stack
frames to code that was previously generated in the macro expansion.
After making this change, the expanded output is about half as long
as it was before.

This change appears to fix most of the remaining `rustc` performance
regressions:
rust-lang/rust#74726 (comment)

Additionally, it has some other side benefits. I imagine it probably
improves compile times a bit for crates using `tracing` (although I
haven't tested this), since the compiler is generating fewer callsite
implementations. Reducing the number of branches in the macro expansion
probably helps make the pesky `cognitive_complexity` Clippy lint show up
less often, and improves maintainability for the macros as well.