[Lens] Formula: Define level of metric #94789
Comments
Pinging @elastic/kibana-app (Team:KibanaApp)
I agree that this is a hard concept to explain to users, and I think you may have missed an important case: top level unbucketed metrics. This is similar to the dynamic thresholds discussion we've been having, where I could build a formula to hide any values that are below median. Something like:
Here are some rough ideas for how we can clarify this concept for users. Each idea is separate:
Agreed, top level metrics are important as well.
That definitely makes sense and would be consistent with the rest of how formula works; I was just wondering whether it's enough. Maybe I'm overthinking it.
It's reducing the complexity while also reducing the expressiveness of formula, but maybe that's OK as a first step. Once people start using it we can see in which direction to evolve.
This seems like the most expensive option, but maybe it's the next logical step. I still hope we can avoid doing a mixed UI like this (and implement expression/sql datasources to cover these use cases instead).
Another important use case for this is array values: in some cases, summing up the rows using overall_sum is not the same as executing the metric "outside" of the current aggregation tree, because the buckets overlap (see #115770).
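To illustrate the overlap problem, here is a minimal sketch with made-up documents. The `Doc` shape, the `tags` field and the numbers are assumptions for illustration only; the bucketing is simulated in memory and is not how Lens or Elasticsearch run it. A document with several array values lands in several terms buckets, so summing the bucket rows counts it more than once.

```typescript
// Hypothetical documents: `tags` is an array field, `bytes` is the metric field.
interface Doc {
  tags: string[];
  bytes: number;
}

const docs: Doc[] = [
  { tags: ["a", "b"], bytes: 10 },
  { tags: ["a"], bytes: 5 },
  { tags: ["b", "c"], bytes: 1 },
];

// Simulate a terms bucket on `tags` with sum(bytes) per bucket.
// A document with two tags contributes to two buckets.
const perTag = new Map<string, number>();
for (const doc of docs) {
  for (const tag of doc.tags) {
    perTag.set(tag, (perTag.get(tag) ?? 0) + doc.bytes);
  }
}

// What summing the rows (overall_sum style) would produce.
const summedBuckets = Array.from(perTag.values()).reduce((a, b) => a + b, 0);

// What the metric gives when run "outside" of the bucket tree.
const outsideTree = docs.reduce((acc, doc) => acc + doc.bytes, 0);

console.log({ summedBuckets, outsideTree }); // { summedBuckets: 27, outsideTree: 16 }
```

The two numbers only agree when no document falls into more than one bucket.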
I think this applies: https://discuss.elastic.co/t/how-to-get-the-total-of-memory-and-cpu-usage-of-my-cluster/288170. Something like the below could solve what they're looking for (they want the overall sum, but don't want to display the field from the group by to get it).
It sounds more like #94619, which is closely related. The difference is that in "define level of metric" a single metric is not broken down but the breakdown is still shown in the chart, while in "Collapse bucket column" nothing is broken down (and the breakdown is not shown in the chart), except for the single metric, which is summed up for display.
We plan on accounting for this requirement in our query system (and its ability to query and transform data): #126095. Closing this issue, as formula is not the intended solution.
Right now, all metrics specified in a formula are nested in all defined bucket dimensions of the current chart. In some cases, however, it's useful to work with a metric from a higher level of the aggregation tree. Overall metrics (#94597) can behave similarly in some cases, but there are a few differences.
Implementation
For each metric, there could be a parameter defining which bucket dimensions to apply (defaults to all of them):
median(bytes, overallFor=reference("Top values geo.src")) - in a chart over time with a "break down by" dimension of top values of geo.src, this would give the median for each geo.src without applying the date histogram dimension.

On the implementation side, this would require us to do multiple esaggs calls (one for each combination of skipped/unskipped bucket aggs), then merge together the resulting tables, joining in the higher level metrics.

This API would be more flexible than what Elasticsearch offers right now (you can define higher level metrics, but only in the order of the tree structure - this means you can't skip the root bucket agg, but keep the nested one).
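The merge step described above could look roughly like the following sketch. It is not the actual esaggs or Lens code: the `Row` shape, the `joinHigherLevel` helper, the column names and the sample values are all hypothetical. It assumes one fully nested table (date histogram and top values of geo.src) and a second table from a request that skips the date histogram, and joins the second into the first on the shared bucket columns.

```typescript
// Hypothetical row shape; real result tables carry column metadata as well.
type Row = Record<string, string | number>;

// Join a higher-level table (fewer bucket columns) into the fully nested table.
// `keys` lists the bucket columns that both tables share.
function joinHigherLevel(detail: Row[], higher: Row[], keys: string[], metric: string): Row[] {
  const keyOf = (row: Row) => JSON.stringify(keys.map((k) => row[k]));
  const lookup = new Map<string, string | number>();
  for (const row of higher) {
    const value = row[metric];
    if (value !== undefined) {
      lookup.set(keyOf(row), value);
    }
  }
  return detail.map((row) => {
    const value = lookup.get(keyOf(row));
    // Rows without a matching higher-level bucket get no value for the metric.
    return value === undefined ? { ...row } : { ...row, [metric]: value };
  });
}

// Fully nested result: date histogram x top values of geo.src (made-up values).
const detail: Row[] = [
  { timestamp: "2021-03-01", "geo.src": "US", count: 10 },
  { timestamp: "2021-03-01", "geo.src": "CN", count: 4 },
  { timestamp: "2021-03-02", "geo.src": "US", count: 7 },
];

// Second request that skips the date histogram: one median per geo.src.
const perSource: Row[] = [
  { "geo.src": "US", medianBytes: 5200 },
  { "geo.src": "CN", medianBytes: 4100 },
];

const merged = joinHigherLevel(detail, perSource, ["geo.src"], "medianBytes");
console.log(merged);
```

In this sketch every US row receives the same per-source median regardless of its timestamp, and a row whose bucket is absent from the higher-level table is left without the metric.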
Use case
I'm not sure whether we should offer this, because the differences to overall metrics are easy to confuse. For example,
avg(bytes, overallFor=reference("Top values geo.src"))
can also be written with overall metrics, as long as "Top values geo.src" actually fetches all data ("Other" bucket included or size parameter high enough). However, this won't work for all metrics (e.g. median).
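A small sketch with made-up numbers shows why the average can be rebuilt client side while the median cannot. It assumes the values of a single geo.src split into two date histogram buckets; the data and helper names are hypothetical, and the code is not a Lens formula. An average can be recovered from per-bucket sums and counts, but per-bucket medians do not contain enough information to recover the median over all values.

```typescript
// Made-up byte values for a single geo.src, split into two date buckets.
const days: number[][] = [
  [1, 2, 3],
  [4, 100],
];

const sum = (xs: number[]): number => xs.reduce((a, b) => a + b, 0);

const median = (xs: number[]): number => {
  const sorted = [...xs].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
};

const allValues = days.reduce<number[]>((acc, day) => acc.concat(day), []);

// Average: combining per-bucket sums and counts gives the exact result.
const avgFromParts = sum(days.map(sum)) / sum(days.map((d) => d.length));
const trueAvg = sum(allValues) / allValues.length;

// Median: combining per-bucket medians does not give the true median.
const medianOfMedians = median(days.map(median));
const trueMedian = median(allValues);

console.log({ avgFromParts, trueAvg });       // both 22
console.log({ medianOfMedians, trueMedian }); // 27 versus 3
```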
Concerns
The main concern is stated above - would people be able to understand the difference between overall metrics calculated client side and metrics on different levels calculated on Elasticsearch side?
Also, if the API is implemented as defined here, the higher-level request could return different "terms" buckets (because the top 5 terms change if some buckets within the hierarchy are skipped). This would be a little confusing, because for some buckets the overall metric would be missing (there's a similar issue with time offset).