[rollup] Why do rollup documents contain <field>.<agg_type>._count fields that all have the same value? #47876
Comments
Pinging @elastic/es-analytics-geo (:Analytics/Rollup)
I see that @polyfractal mentioned in #45187 that Rollup 'saves all the metrics "in isolation" without seeing if a different metric provides the same value' and that the situation could be improved. Here's an example where the same count is repeated over and over, taking up almost 50% of the rolled-up document size:
As an aside, this is something that should be drastically improved in the Rollup V2 refactor we're working on. We're introducing a dedicated "doc_count" field mapper which will store the count once for the whole doc instead of duplicating it repeatedly, as you see with v1.
With the 8.7 release of Elasticsearch, the new downsampling capability, which is part of the new time series data streams functionality, is generally available (GA). This capability had been in tech preview in ILM since 8.5. Downsampling reduces the footprint of your time series data by storing it at reduced granularity. The downsampling process rolls up the documents within a fixed time interval into a single summary document. Each summary document includes statistical representations of the original data: the min, max, sum, value_count, and average for each metric. Data stream time series dimensions are stored unchanged. Downsampling is superior to rollup because:
Because of this new capability, we are deprecating the rollup functionality, which never left Tech Preview/Experimental status, in favor of downsampling, and we are therefore closing this issue. We encourage you to migrate your solution to downsampling and take advantage of the new TSDB functionality.
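To make the downsampling description above concrete, here is a minimal Python sketch of the idea: documents that fall in the same fixed time interval and share the same dimension value are collapsed into one summary document carrying min, max, sum, value_count, and average. This is only an illustration, not Elasticsearch's implementation; the `downsample` function, the `host` dimension, and the `value` metric are hypothetical names chosen for the example.

```python
from collections import defaultdict


def downsample(docs, interval_ms, metric="value", dimension="host"):
    """Collapse docs into one summary per (dimension, fixed time interval).

    Hypothetical helper for illustration only; field names are assumptions.
    """
    buckets = defaultdict(list)
    for doc in docs:
        # Align the timestamp to the start of its fixed interval.
        start = doc["@timestamp"] - doc["@timestamp"] % interval_ms
        buckets[(doc[dimension], start)].append(doc[metric])

    summaries = []
    for (dim, start), values in sorted(buckets.items()):
        total = sum(values)
        summaries.append({
            "@timestamp": start,
            dimension: dim,  # dimensions are stored unchanged
            metric: {
                "min": min(values),
                "max": max(values),
                "sum": total,
                "value_count": len(values),
                "avg": total / len(values),
            },
        })
    return summaries


raw_docs = [
    {"@timestamp": 0, "host": "a", "value": 1.0},
    {"@timestamp": 1_000, "host": "a", "value": 3.0},
    {"@timestamp": 59_000, "host": "a", "value": 2.0},
    {"@timestamp": 60_000, "host": "a", "value": 5.0},
]
summaries = downsample(raw_docs, interval_ms=60_000)
```

Note that in this sketch the count is kept once per metric summary rather than once per aggregation type, which is the kind of duplication the original issue asks about.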
Elasticsearch version: 7.4.0 (and previous)
Description of the problem including expected versus actual behavior:
The documents generated by the Rollup Job always contain, for each field considered in the terms, a field named <field>.<agg_type>._count. The value of this field is always the same.
Why isn't it written just once at the root of the document?
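To illustrate the question, here is a small Python sketch of what "writing it once at the root" could look like. The document below is a hypothetical rolled-up document shaped after the <field>.<agg_type>._count naming described above, not real job output; the `hoist_shared_count` helper and the root-level `doc_count` name are assumptions for the example, not how Rollup actually stores data.

```python
def hoist_shared_count(doc):
    """Replace identical per-field `_count` entries with one root `doc_count`.

    Hypothetical helper for illustration only. If the counts disagree
    (or there are none), the document is returned unchanged.
    """
    count_keys = [key for key in doc if key.endswith("._count")]
    distinct_counts = {doc[key] for key in count_keys}
    if len(distinct_counts) != 1:
        return dict(doc)
    slim = {key: value for key, value in doc.items() if key not in count_keys}
    slim["doc_count"] = distinct_counts.pop()
    return slim


# Hypothetical rolled-up document: every `_count` repeats the same value.
rolled_up = {
    "timestamp.date_histogram.timestamp": 1570000000000,
    "timestamp.date_histogram._count": 42,
    "my_field.terms.value": "abc",
    "my_field.terms._count": 42,
    "other_field.terms.value": "xyz",
    "other_field.terms._count": 42,
}
slim = hoist_shared_count(rolled_up)
```

In this example three repeated count fields shrink to a single `doc_count`, and the saving grows with the number of grouped fields.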
Steps to reproduce:
kibana_sample_data_logs
my_field
Result: