How to use a new handle-based API? #271
Sorry that you've been having trouble using the new handle-based API in 0.18. At a high level, there's not much that changes from the perspective of the recorder: the flow of registering and updating metrics is largely the same as what the macros did for you in 0.17; 0.18 just hands you the resulting handle.

If you have a specific example of trying to update one area of your code from metrics 0.17 to 0.18 that you can share, I might be able to provide more help. Looking at the code you linked, it's just not clear to me why it couldn't be adapted to 0.18.
Thanks for the fast response! Let's focus on the problem of exposing several metrics from one place.
Now, this is implemented by reading the rarely changing atomic configuration and updating the storage accordingly. Let's try to use the new API:

```rust
struct Storage(Registry);

impl Recorder for Storage {
    fn register_counter(&self) -> Counter {
        // I cannot just get one metric by calling
        // `get_or_create_counter`, because:
        // 1. Additional labels should be added, which means
        //    recreating the `Key`; that's expensive.
        // 2. If configured, I need to create two counters instead of one,
        //    and this can change at runtime.
        // 3. At runtime, if the configuration changes, registration should
        //    happen again, which is impossible for users who don't use macros.
    }
}
```

(3) could be resolved with some analogue of https://docs.rs/tracing-core/latest/tracing_core/callsite/fn.rebuild_interest_cache.html

Ok, let's create a special handle type instead:

```rust
struct MyCounter {
    storage: Arc<Storage>,
    key: Key,
}

impl CounterFn for MyCounter {
    fn increment(&self, value: u64) {
        if telemeter_per_actor() {
            self.storage.counter_per_actor(&self.key, value);
        }
        if telemeter_per_group() {
            self.storage.counter_per_group(&self.key, value);
        }
    }
}
```

Another, but related, problem is reusing the key between calls.
Answering out of order:
That's a good point.
Thanks for this example. It's very clear now what you're trying to achieve and why the new handle-based design makes it harder/inefficient.

One thought in my mind: do actors ever change their labels once active? The reason I ask is that I wonder if you might potentially benefit from the new handle-based design by changing how you create metrics for a given actor. If we assume that the metric key/labels are fixed from when an actor starts until when it completes/dies, then it might be worth trying to actually store those metrics as fields on a struct rather than using the macros. By doing so, you could avoid all the current logic that figures out if a metric needs to be updated or not. For example:

```rust
// All metrics for an actor. This could also be implemented multiple times if
// you wanted to break down the metric groups so that only certain metrics are
// declared near where they are used, closer to the model of only using the macros.
struct ActorMetrics {
    messages_received: Counter,
    ...
}

impl ActorMetrics {
    pub fn create() -> Self {
        Self {
            messages_received: register_counter!("elfo_actor_messages_received"),
        }
    }
}

// Now you would make these changes in your `Recorder` to support returning a handle
// that can affect multiple metrics at the same time:
impl Recorder {
    fn with_params<R>(&self, f: impl Fn(&Storage, &Scope, bool) -> R) -> Option<Vec<R>> {
        scope::try_with(|scope| {
            let mut result = Vec::new();
            let perm = scope.permissions();
            if perm.is_telemetry_per_actor_group_enabled() {
                result.push(f(&self.storage, scope, false));
            }
            if perm.is_telemetry_per_actor_key_enabled() && !scope.meta().key.is_empty() {
                result.push(f(&self.storage, scope, true));
            }
            result
        })
    }
}

impl metrics::Recorder for Recorder {
    fn register_counter(&self, key: &Key) -> Counter {
        // `Storage` still controls and owns all data for all registered metrics, but now
        // you return individual `Counter` handles for each metric, and then wrap them in
        // a newtype that allows for returning a single `Counter` here in `register_counter`.
        let maybe_counters = self.with_params(|storage, scope, with_actor_key| {
            storage.register_counter(scope, key, with_actor_key)
        });
        maybe_counters.map(CounterGroup::wrap).unwrap_or_else(|| Counter::noop())
    }
    ...
}

struct CounterGroup {
    // A possible optimization is to return a direct `Arc<Atomic...>` clone
    // from the `Storage::register_*` methods, since `Recorder::with_params` is generic
    // over the return type. That would be slightly more efficient since there would be
    // no optional check like there is in `Counter` itself: the `Vec` having items, or
    // not, is what now dictates whether we update any metrics.
    counters: Vec<Counter>,
}

impl CounterGroup {
    fn wrap(counters: Vec<Counter>) -> Counter {
        let grouped = CounterGroup { counters };
        Counter::from_arc(Arc::new(grouped))
    }
}

impl CounterFn for CounterGroup {
    fn increment(&self, value: u64) {
        for counter in &self.counters {
            counter.increment(value);
        }
    }
    ...
}
```

While this code is not perfect -- there might be a restriction I'm not thinking of -- I think it could represent a workable design for your use case, with some real benefits.
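The fan-out part of this sketch can be checked in isolation. Below is a self-contained, std-only version where `CounterFn` and the per-group/per-actor storage slots are local stand-ins (assumptions for illustration, not the real `metrics` types): one `increment` call on the group updates every underlying counter.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

// Local stand-in for metrics' `CounterFn` trait, so the sketch is self-contained.
trait CounterFn {
    fn increment(&self, value: u64);
}

// A single underlying counter slot, as `Storage` might hold.
struct Slot(AtomicU64);

impl CounterFn for Slot {
    fn increment(&self, value: u64) {
        self.0.fetch_add(value, Ordering::Relaxed);
    }
}

// Fans one `increment` call out to every underlying counter.
struct CounterGroup {
    counters: Vec<Arc<dyn CounterFn + Send + Sync>>,
}

impl CounterFn for CounterGroup {
    fn increment(&self, value: u64) {
        for counter in &self.counters {
            counter.increment(value);
        }
    }
}

fn demo() -> (u64, u64) {
    let per_group = Arc::new(Slot(AtomicU64::new(0)));
    let per_actor = Arc::new(Slot(AtomicU64::new(0)));
    let group = CounterGroup {
        counters: vec![per_group.clone(), per_actor.clone()],
    };
    group.increment(3); // one call, both counters move
    (
        per_group.0.load(Ordering::Relaxed),
        per_actor.0.load(Ordering::Relaxed),
    )
}

fn main() {
    println!("{:?}", demo()); // both slots observe the same value
}
```

The design choice here is that the group owns only trait objects, so whether one, two, or zero underlying counters exist is decided once at registration time, not on every update.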
To your point about runtime changes, that is definitely trickier to solve... but it might be solvable in a generic way by using something like a configuration epoch?
As I was typing that, I had the thought that it already wouldn't 100% work, because you would need mutability to replace the existing counters in the proposed design. An atomic load for every metric emit isn't great, but I believe my proposed design would still be faster overall, as there would be less centralized access to the registry.
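One possible shape for the configuration-epoch idea can be sketched with std atomics only. The `EpochCounter` type is hypothetical, and its `rebuilds` field merely counts the places where a real implementation would re-run registration:

```rust
use std::sync::atomic::{AtomicU64, AtomicUsize, Ordering};

// Global configuration epoch, bumped whenever telemetry config is reloaded.
static CONFIG_EPOCH: AtomicU64 = AtomicU64::new(0);

struct EpochCounter {
    seen_epoch: AtomicU64,
    rebuilds: AtomicUsize, // stands in for "re-run registration"
    value: AtomicU64,
}

impl EpochCounter {
    fn new() -> Self {
        EpochCounter {
            seen_epoch: AtomicU64::new(0),
            rebuilds: AtomicUsize::new(0),
            value: AtomicU64::new(0),
        }
    }

    fn increment(&self, v: u64) {
        let current = CONFIG_EPOCH.load(Ordering::Acquire);
        if self.seen_epoch.swap(current, Ordering::AcqRel) != current {
            // Config changed since this handle last emitted:
            // a real implementation would rebuild its underlying handles here.
            self.rebuilds.fetch_add(1, Ordering::Relaxed);
        }
        self.value.fetch_add(v, Ordering::Relaxed);
    }
}

fn demo() -> (usize, u64) {
    let c = EpochCounter::new();
    c.increment(1); // epoch 0: no rebuild
    CONFIG_EPOCH.fetch_add(1, Ordering::Release); // simulated config reload
    c.increment(1); // epoch 1: triggers one rebuild
    c.increment(1); // epoch unchanged: no rebuild
    (c.rebuilds.load(Ordering::Relaxed), c.value.load(Ordering::Relaxed))
}

fn main() {
    println!("{:?}", demo());
}
```

This keeps the hot path at one extra atomic load, matching the trade-off described above; the mutability problem for swapping the inner counters is deliberately left out of the sketch.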
The main disadvantage of the suggested approach is that the metric set is unbounded: all metrics produced by user-written code should use the same approach, not only the built-in ones.
As I said above, metrics produced by users should behave similarly.
Yep, it's problematic.
As I see it, it's a sharded map internally. Of course, even with a precalculated hash, it's still slower than atomic loads.
There's no reason you can't limit it to metrics that use a given prefix.
This is an explicit goal of the crate's design.
I think an important distinction is that "registration" in 0.18 is no different than actually "doing" the specific metric operation in 0.17. You never know if a metric exists yet, so the registry always has to look up the key, see if it exists, clone the key and create the storage if it didn't already exist, etc. The primary difference in 0.18 is that we've separated looking up/registering a metric and actually performing the operation that updates it.
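That separation can be illustrated with a toy registry (the names and structure below are illustrative, not the real `metrics` internals): the 0.17-style path pays the map lookup on every update, while the 0.18-style path pays it once at registration and then updates through the returned handle.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, Mutex};

// Minimal registry: the lock + map lookup happen at registration time.
#[derive(Default)]
struct Registry {
    counters: Mutex<HashMap<String, Arc<AtomicU64>>>,
}

impl Registry {
    // 0.18-style: look up (or create) once, hand back a cheap clonable handle.
    fn register_counter(&self, key: &str) -> Arc<AtomicU64> {
        let mut map = self.counters.lock().unwrap();
        map.entry(key.to_owned()).or_default().clone()
    }

    // 0.17-style: every update pays the lookup.
    fn increment(&self, key: &str, v: u64) {
        self.register_counter(key).fetch_add(v, Ordering::Relaxed);
    }
}

fn demo() -> u64 {
    let registry = Registry::default();
    registry.increment("requests", 1); // lookup + update, every time
    let handle = registry.register_counter("requests"); // lookup once...
    for _ in 0..9 {
        handle.fetch_add(1, Ordering::Relaxed); // ...then lock-free updates
    }
    registry.register_counter("requests").load(Ordering::Relaxed)
}

fn main() {
    println!("{}", demo());
}
```

Both paths do the same lookup work the first time they see a key; the difference is only whether the caller keeps the handle around for subsequent updates.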
This is a good point, and it actually made me realize something about the current behavior.
I had already mentioned above the subtlety around the changed naming, but I want to dig in a little more on the actual overhead/performance. So, in general, the "registering" step is not that dissimilar to the code that existed before: in both 0.17 and 0.18, a macro call builds the key, looks it up in the registry, and then updates the underlying storage; 0.18 merely splits the lookup (registration) from the update.
In both cases, having to look up the key in the registry's sharded map is the most expensive part by far; it outweighs the cost of cloning the key itself.

The explicit trade-off I made in 0.18 was that the performance for "non-optimal" usage -- calling the macros only, maybe with dynamic labels, etc -- becoming slightly worse was worth it for the improvement in the scenarios where applications can pre-register metrics. In practical terms, the overhead is still very small. If you're pushing tens of millions of metric updates per second, or even 10% of that amount, I would argue that you likely understand the performance well enough to tweak how/when you emit metrics to avoid incurring undesired overhead.

The flipside of all of this is what 0.18 unlocks: the ability to hyperoptimize metrics overhead by using handles directly. While ~30-35M calls/second for a single producer was the best you could do in 0.17, even using entirely static keys, using metric handles directly in 0.18 blows that away with ~110M calls/second. Even with a producer parallelism of 16, it's still at ~35M calls/second, over 2x what can be done when only using the macros.
You're considering the usual usage, as if registration only happened once. However, implementing #273 can also help in this case.
A good direction; it can solve my problem.
This is a much smaller problem than false sharing.
That's without any complex keys. In practical terms, most metrics contain labels, often dynamic, and in 0.18 I'm forced to clone the key on every call.
I don't argue with you about the pros of a separate registration step. I'm just trying to show that the current API is not complete enough: we either need to somehow cache a handle per macro call, or provide another way to update metrics without one.
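A per-call-site handle cache could look roughly like the sketch below, using std's `OnceLock`. The `messages_received` function and the bare atomic are hypothetical stand-ins; a real version would invoke the registration machinery on first use, similar to how `tracing` caches callsite interest:

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, OnceLock};

// Hypothetical per-call-site cache: the handle is "registered" on first use
// and reused on every later call from this site.
fn messages_received() -> &'static Arc<AtomicU64> {
    static HANDLE: OnceLock<Arc<AtomicU64>> = OnceLock::new();
    HANDLE.get_or_init(|| {
        // In a real recorder this closure would perform registration;
        // here it just creates a fresh atomic to keep the sketch self-contained.
        Arc::new(AtomicU64::new(0))
    })
}

fn demo() -> u64 {
    for _ in 0..5 {
        // Only the first iteration pays the registration cost.
        messages_received().fetch_add(1, Ordering::Relaxed);
    }
    messages_received().load(Ordering::Relaxed)
}

fn main() {
    println!("{}", demo());
}
```

The key limitation, as discussed above, is that a cache like this cannot react to runtime configuration changes on its own; it would need to be combined with something like an epoch check or an interest-rebuild hook.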
Just to take a step back... My assertion here is that between 0.17 and 0.18, there wasn't a meaningful change to performance/efficiency for callers who only used the macros.
Assuming I make the changes discussed above.
Also, separately:
You're right, and this was an oversight on my part. I opened #274 to fix that.
I just cut a new release.
I think these two things addressed the bulk of your original issues using the new handle-based design, so I'd be interested in hearing any feedback you have if you have a chance to try them out.
I'm going to update my code and try it out.
Sounds good. 👍🏻 I'm also working on some related refactoring.
A half year ago, I made the storage implementation TLS-based to avoid contention when multiple actors share the same telemetry key. Yep, contention on a simple atomic counter, which I observed in my production. It seems complicated to support TLS-based storage adequately with the current approach. Which interface would cover such usage? We have two usage patterns:
1. Register a handle once and then update the metric through it.
2. Update a metric in place, without storing any handle.
Both patterns are valid and can be used in different situations. metrics <=0.17 was focused on the second pattern and >0.17 on the first one. So, I think the crate should allow the in-place pattern (the second one) directly as well. The first pattern is also tricky with TLS-based storages: if the current thread changes, we should re-acquire the handle:

```rust
// Just an example to illustrate.
impl Recorder for MyTlsRecorder {
    fn register_counter(&self, key: &Key, metadata: &Metadata<'_>) -> Counter {
        let entry: &u64 = self.threadwise.get_or_create(key, metadata);
        Counter::from_arc(Arc::new(GuardedCounter {
            thread_id: current_thread_id(),
            entry, // stable address
        }))
    }
}

struct GuardedCounter {
    thread_id: ThreadId,
    entry: *const u64,
}

impl CounterFn for GuardedCounter {
    fn increment(&self, value: u64) {
        if current_thread_id() == self.thread_id {
            // SAFETY: only the owning thread touches this slot.
            unsafe { *(self.entry as *mut u64) += value };
        } else {
            // Re-register the counter in the current thread.
        }
    }
}
```
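For comparison, the shard-and-flush pattern behind a TLS-based storage can be shown without raw pointers. This is a simplified, std-only sketch (not elfo's actual storage): each thread increments its own `Cell`, and a `flush` merges the shard into a global view.

```rust
use std::cell::Cell;
use std::sync::atomic::{AtomicU64, Ordering};

// Global sink that thread-local shards flush into.
static GLOBAL: AtomicU64 = AtomicU64::new(0);

thread_local! {
    // Each thread increments its own cell: no contention on the hot path.
    static LOCAL: Cell<u64> = Cell::new(0);
}

fn increment(v: u64) {
    LOCAL.with(|c| c.set(c.get() + v));
}

// Called periodically (or on thread exit) to merge this shard into the global view.
fn flush() {
    LOCAL.with(|c| {
        GLOBAL.fetch_add(c.replace(0), Ordering::Relaxed);
    });
}

fn demo() -> u64 {
    let handles: Vec<_> = (0..4)
        .map(|_| {
            std::thread::spawn(|| {
                for _ in 0..1000 {
                    increment(1);
                }
                flush();
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    GLOBAL.load(Ordering::Relaxed)
}

fn main() {
    println!("{}", demo());
}
```

The hot path is a plain non-atomic add; the cost is that readers only see values as of the last flush, which is the staleness trade-off the sliding-window aggregation described earlier already accepts.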
Just wanted to say I saw this and I do plan on responding, but there's a lot here and life is a little busy for me at the moment, so I need to find some time to digest everything you've laid out here so I can give a proper response. 😅
Yes, indeed, we're having a sluggish conversation here) |
After reading your post, I think I understand now what you ended up having to do for your use case and why it's hard to achieve with the current design. I don't love the idea of adding new methods for this.
Maybe I'm missing an obvious way to keep the handle types as-is. Would it be helpful if we could somehow relax the requirements?
I'm writing an actor system (elfo) and have been faced with inconvenient usage of the handle API that landed in the last metrics release (0.18). Some prelude before the questions.
All metrics exposed by actors are recorded with special labels (`actor_group` and, if configured, `actor_key`). Let's consider some built-in metrics, e.g. the `elfo_handling_message_time_seconds` histogram, which is measured for every incoming message. This metric is built using a static name and a static label set (generated for each `Message`) on the fly, without registering any metrics beforehand. Also, these labels are used elsewhere in production code, without explicitly storing any handles.

Moreover, every metric can be produced twice (per actor group and per actor), if configured: if a group has few actors, it's useful to collect metrics about each of them. And all metrics are always collected for the whole group.
All emitted metrics are handled by a dedicated actor, whose implementation can be altered by the user, but a default implementation (elfo-telemeter) is also provided. It isn't based on `metrics-exporter-prometheus`, because I want more control: compacting atomic buckets by interval, providing a sliding window, graceful termination, handling zero counters in a special way, providing an API for other actors to get metrics in a backend-agnostic structure, and so on. By the way, I don't understand why such a specific implementation as `metrics-exporter-prometheus` is stored in this repo at all, but that's another question; I need only the frontend and some useful utils (thanks for them).

Now, using 0.17, the metric storage is entirely implemented in that dedicated actor.
In 0.18, with the handle-based API, I don't fully understand how to handle these cases. I need to register all metrics (both built-in metrics and specific production metrics), which can have different actor keys (that can be solved with the layer system), and per-message metrics, which require efficient storage in different places. And, more importantly, any metric can be recorded twice (per group and per actor), if configured for a specific actor group.
Thus, I have two problems:
1. Macro calls can lead to exposing one or two metrics (depending on the config) with different labels. I see no way to do this other than adding labels in layers.
2. Even in places where the key and labels are static, I cannot emit a metric directly and must register all of them, which leads to placing storages here and there (per message, per log level, per dumping class, and so on). Should I use `Registry` everywhere?