-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Transform] add support for terms agg in transforms #56696
[Transform] add support for terms agg in transforms #56696
Conversation
Pinging @elastic/ml-core (:ml/Transform) |
Reviewer, on the backport, I will place a check for |
1c8f47c
to
c18238c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great! It would be good to cover the nesting case, otherwise LGTM.
@@ -265,12 +265,12 @@ setup: | |||
"group_by": { | |||
"time": {"date_histogram": {"fixed_interval": "1h", "field": "time"}}}, | |||
"aggs": { | |||
"vals": {"terms": {"field":"airline"}} | |||
"vals": {"significant_terms": {"field":"airline"}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😄
@@ -246,6 +250,35 @@ public Object value(Aggregation agg, Map<String, String> fieldTypeMap, String lo | |||
} | |||
} | |||
|
|||
static class MultiBucketsAggExtractor implements AggValueExtractor { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 this is almost exactly the way I implemented it, too (well, there is probably no other way).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great minds think alike.
if (bucket.getAggregations().iterator().hasNext() == false) { | ||
nested.put(bucket.getKeyAsString(), bucket.getDocCount()); | ||
} else { | ||
HashMap<String, Object> nestedBucketObject = new HashMap<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it would be good to cover this branch and have a test with nested terms aggs, like your common user example, broke down by e.g. businesses or filtered by something
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, will do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
This adds support for `terms` and `rare_terms` aggs in transforms. The default behavior is that the results are collapsed in the following manner: `<AGG_NAME>.<BUCKET_NAME>.<SUBAGGS...>...` Or if no sub aggs exist `<AGG_NAME>.<BUCKET_NAME>.<_doc_count>` The mapping is also defined as `flattened` by default. This is to avoid field explosion while still providing (limited) search and aggregation capabilities.
…56809) * [Transform] add support for terms agg in transforms (#56696) This adds support for `terms` and `rare_terms` aggs in transforms. The default behavior is that the results are collapsed in the following manner: `<AGG_NAME>.<BUCKET_NAME>.<SUBAGGS...>...` Or if no sub aggs exist `<AGG_NAME>.<BUCKET_NAME>.<_doc_count>` The mapping is also defined as `flattened` by default. This is to avoid field explosion while still providing (limited) search and aggregation capabilities.
This adds support for
terms
andrare_terms
aggs in transforms.The default behavior is that the results are collapsed in the following manner:
<AGG_NAME>.<BUCKET_NAME>.<SUBAGGS...>...
Or if no sub aggs exist
<AGG_NAME>.<BUCKET_NAME>.<_doc_count>
The mapping is also defined as
flattened
by default. This is to avoid field explosion while still providing (limited) search and aggregation capabilities.