-
Notifications
You must be signed in to change notification settings - Fork 717
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
core: avoid scoped dispatch overhead when scoped dispatchers are not in use #861
Labels
area/no-std
Related to improving support for no-std.
crate/core
Related to the `tracing-core` crate
kind/perf
Performance improvements
Comments
hawkw
added
crate/core
Related to the `tracing-core` crate
kind/perf
Performance improvements
labels
Jul 29, 2020
The overhead of adding an enum to dispatchers turns out to be more noticeable than one would have thought! |
hawkw
added a commit
that referenced
this issue
Oct 7, 2020
## Motivation Presently, `tracing` and `tracing-core` allow disabling the Rust standard library. However, even when used with the standard library disabled, they still require the use of `alloc`, which means some form of global allocator must exist. This can be a significant limitation on some very constrained embedded devices. Additionally, the use of any heap allocations on these devices ought to be avoided. Now that the callsite registry is no longer a `Vec`, we don't need heap allocations for storing registered callsites. However, `Dispatch`s are currently _always_ stored in `Arc`s, and the global registry always contains a `Vec` of `Weak` refs to them. This is required in order to support thread-local scoped dispatcher API, where multiple `Dispatch`es can coexist. However, on `no-std` platforms without threads, we already don't support the thread-local scoped dispatch API. In this case, we can reduce the complexity and the number of allocations required by the dispatcher. Additionally, even when the standard library is in use, if the dispatcher is a global default rather than a scoped dispatcher, it remains set forever. This means it can never be dropped. When this is the case, it's possible to clone the dispatcher by cloning a `'static` reference, rather than a (comparatively) costly `Arc` bump. ## Solution This branch changes the global default dispatcher to be represented as a `&'static dyn Subscriber` reference on all platforms. In addition, the dependency on `liballoc` is now feature flagged, allowing it to be disabled along with the dependency on `libstd`. This means that `tracing` can now be used with _no_ heap allocations required. In order to use `tracing` without `liballoc`, `Dispatch`es can only be created from `&'static dyn Subscriber` references, since this is the only way to erase their types that exists without boxed trait objects. Therefore, I've added a new `Dispatch::from_static` constructor that is always available. This is more limiting than `Dispatch::new` --- some ways of modeling runtime composition of the subscriber are much harder (although it could still contain state). However, users without `alloc` are probably fairly used to this --- the `log` crate [has the same limitation](https://docs.rs/log/0.4.11/log/#use-with-std) when used without `std`. Also, I've changed the global subscriber registry when `std` _is_ enabled to use a `RwLock` rather than a mutex. Now, multiple callsites can be registered simultaneously, and a write lock is only necessary to add a new subscriber or deregister old ones. It's now fine to use `RwLock` here, because we no longer use any locking at all when `std` is not enabled --- we always use the single global default subscriber, so we don't require a list of currently live subscribers. I've modified the `tracing` and `tracing-core` crates to add new feature flags for enabling `std` and `alloc` separately. Closes #861 Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
area/no-std
Related to improving support for no-std.
crate/core
Related to the `tracing-core` crate
kind/perf
Performance improvements
Currently, the global default dispatcher is essentially implemented as a layer on top of the scoped default dispatcher. This means that using the global default dispatcher still pays some of the overhead of the scoped default dispatcher. In particular:
Atomic reference counting. The global default dispatcher will never be dropped, as it lives for the entire lifetime of the program. However, we still
Arc
it, since the scoped default dispatcher requiresArc
ing. This means we have to bump a reference count to clone it that will never be 0. We could potentially change the internal representation to an enum of eitherArc<dyn Subscriber + ...>
(when scoped) or&'static dyn Subscriber + ...
when global. That way, cloning the global dispatch is essentially free. This also means we could feature flag theArc
use, moving us one step closer to working onno_std
targets without liballoc.Currently, we always check the thread-local storage for the scoped dispatcher before checking the global default. This is because a scoped default takes priority over a global default. However, this also means that getting the dispatch always involves a TLS hit, even when the TLS dispatcher is not being used. As I suggested here:
The additional complexity of this should, of course, be benchmarked.
The text was updated successfully, but these errors were encountered: