Add support for multi-field keys to terms aggs #65623
Pinging @elastic/es-analytics-geo (Team:Analytics)
In elastic#65623 we are adding a new aggregation that can reuse some of the non-trivial logic of terms aggs such as reduction. This refactoring moves some of this logic into a parent class where it can be reused. Relates to elastic#65623
I think this will be very useful for APM as well. There's a lot of data that usually should only be aggregated together based on a set of fields. E.g., for transaction duration data, we want to aggregate on service.name, service.environment and transaction.type. Or service.name and service.node.name for JVM memory usage.
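To make "aggregated together based on a set of fields" concrete, here is a minimal in-memory sketch of grouping by a compound key. It is plain Python, not Elasticsearch code; the sample documents and the `duration_us` field name are made up for illustration, while the three key fields are the ones named in the comment above.

```python
from collections import defaultdict

# Made-up sample documents. `duration_us` is a hypothetical field name.
docs = [
    {"service.name": "checkout", "service.environment": "prod",
     "transaction.type": "request", "duration_us": 100},
    {"service.name": "checkout", "service.environment": "prod",
     "transaction.type": "request", "duration_us": 200},
    {"service.name": "checkout", "service.environment": "staging",
     "transaction.type": "request", "duration_us": 400},
    {"service.name": "search", "service.environment": "prod",
     "transaction.type": "request", "duration_us": 50},
]

KEY_FIELDS = ("service.name", "service.environment", "transaction.type")


def average_by_compound_key(documents, key_fields, value_field):
    """Group documents by the tuple of key_fields and average value_field."""
    totals = defaultdict(lambda: [0.0, 0])  # key -> [running sum, count]
    for doc in documents:
        key = tuple(doc[field] for field in key_fields)
        totals[key][0] += doc[value_field]
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


averages = average_by_compound_key(docs, KEY_FIELDS, "duration_us")
for key, avg in sorted(averages.items()):
    print(key, avg)
```

Each bucket is identified by the whole tuple, so `checkout` in `prod` and `checkout` in `staging` are never averaged together.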
Adds support for the multi_terms aggregation. The multi_terms aggregation works very similarly to the terms aggregation but supports keys made of multiple terms. This PR adds only the basic functionality, so it is not optimized yet; optimization will follow in later PRs. Closes elastic#65623
How does this relate to #66247? I can see why it would be useful to group things by several dimensions, but that's only one side of the issue. The real complexity, imo, is using this information correctly when computing, for instance, the average CPU per host, as described in #66247. Is that a different problem that requires another solution?
We are not trying to tackle #66247 here. As you said, it has a different set of challenges. I think the multi term agg might be used as part of the overall solution, but we will definitely need a different part of the solution that deals with the downsampling aspects. As far as I know, we haven't discussed #66247 in detail yet, but I have a feeling that we might run into the same type of issues we ran into in #60619.
@jimczi FWIW, I'm not aware that we have the same issue in the APM app. We use Re: multi-field keys, we're currently using both nested terms aggregations and composite aggregations to get aggregate results over multiple fields. Both have their downsides (num buckets, sorting, performance, etc). I'm not sure what the performance characteristics of
Could it be backported to 7.11?
@YohanSciubukgian no, it's too late to add new features to 7.11. This one will go to 7.12. Sorry.
This PR clarifies when multi_terms aggs should be used instead of composite aggs or nested term aggs. Relates to elastic#65623
There is a need in Kibana (see elastic/kibana#77632) to dynamically form a key from multiple ordered fields, similar to composite aggs. The example that the Kibana team provided is to find the top 10 CPU users, where each user is identified by a compound key consisting of datacenter id, host id, and container id. The required result is the 3 key fields plus a CPU usage metric, with everything sorted by CPU usage descending.
The current solution, concatenating the fields with a script, is really awkward and limiting. We would like to offer a more streamlined solution for Kibana here.
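As a rough sketch of the two approaches for the top-10 CPU example, the Python below builds both search request bodies as plain dictionaries: the script-concatenation workaround on a `terms` aggregation, and the equivalent `multi_terms` request. The field names (`datacenter.id`, `host.id`, `container.id`, `cpu.usage`), the aggregation names, the `|` separator, and the choice of `avg` as the CPU metric are all assumptions made for illustration; the `multi_terms` body follows the request shape as documented for the released aggregation, which may differ from what was proposed at the time of this issue.

```python
import json

# Hypothetical field names for the compound key and the metric.
KEY_FIELDS = ["datacenter.id", "host.id", "container.id"]
METRIC_FIELD = "cpu.usage"


def concat_script_source(key_fields, sep="|"):
    """Painless expression that joins the key fields into one string."""
    joiner = " + '{}' + ".format(sep)
    return joiner.join("doc['{}'].value".format(f) for f in key_fields)


def script_concat_request(key_fields, metric_field, size=10):
    """Workaround: a terms agg keyed on a script-concatenated string."""
    return {
        "size": 0,  # only aggregation results are needed, no hits
        "aggs": {
            "top_keys": {
                "terms": {
                    "script": {
                        "lang": "painless",
                        "source": concat_script_source(key_fields),
                    },
                    "size": size,
                    "order": {"cpu": "desc"},  # sort buckets by the sub-agg
                },
                # Assumption: average CPU usage is the metric of interest.
                "aggs": {"cpu": {"avg": {"field": metric_field}}},
            }
        },
    }


def multi_terms_request(key_fields, metric_field, size=10):
    """Same query with multi_terms: one bucket per tuple of field values."""
    return {
        "size": 0,
        "aggs": {
            "top_keys": {
                "multi_terms": {
                    "terms": [{"field": f} for f in key_fields],
                    "size": size,
                    "order": {"cpu": "desc"},
                },
                "aggs": {"cpu": {"avg": {"field": metric_field}}},
            }
        },
    }


print(json.dumps(multi_terms_request(KEY_FIELDS, METRIC_FIELD), indent=2))
```

With the script workaround the caller has to split the concatenated string back into its parts and pick a separator that never occurs in the data; with `multi_terms` the fields stay separate in the request.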