Support for overlapping "buckets" in the date histogram #66856

Closed
tobiasstadler opened this issue Dec 29, 2020 · 11 comments
Labels
:Analytics/Aggregations Aggregations >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo)

Comments

@tobiasstadler
Contributor

I would like to create overlapping "buckets" in the date histogram aggregation (or a new aggregation). E.g., I would like to create buckets for every hour of the last 12 hours, but each bucket should also contain the documents from the 3 hours prior to the bucket.

bucket 1 should contain everything between now and now-4h,
bucket 2 should contain everything between now-1h and now-5h,
bucket 3 should contain everything between now-2h and now-6h,
...

It should then be possible to calculate a metric for each bucket. E.g. I should be able to calculate the average of the last 3 hours for each hour.

This is similar to what one can do with Prometheus range queries (https://prometheus.io/docs/prometheus/latest/querying/api/#range-queries)

@tobiasstadler tobiasstadler added >enhancement needs:triage Requires assignment of a team area label labels Dec 29, 2020
@tobiasstadler
Contributor Author

I know I can use the date_range aggregation by manually specifying the ranges, but I was hoping for more automatic range creation.
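
A minimal sketch of how the manual date_range workaround could be generated instead of typed out by hand. It builds a search request body with one overlapping range per hour, matching the bucket boundaries listed above (bucket 1 is now-4h to now, bucket 2 is now-5h to now-1h, and so on), and adds an avg sub-aggregation per bucket. The field names `@timestamp` and `value`, the aggregation names, and the helper names are assumptions for illustration only; they are not from the issue.

```python
import json


def overlapping_ranges(buckets=12, step_hours=1, lookback_hours=3):
    """Build date_range entries whose windows overlap.

    Bucket i (0-based) ends i*step_hours before now and starts
    step_hours + lookback_hours earlier than its end.
    """
    ranges = []
    for i in range(buckets):
        end = i * step_hours
        start = end + step_hours + lookback_hours
        # Date math: plain "now" for the newest bucket, "now-Nh" otherwise.
        to = "now" if end == 0 else f"now-{end}h"
        ranges.append(
            {"key": f"bucket_{i + 1}", "from": f"now-{start}h", "to": to}
        )
    return ranges


def build_request(timestamp_field="@timestamp", metric_field="value"):
    """Search body: overlapping date_range buckets with an avg per bucket.

    Both field names are hypothetical placeholders.
    """
    return {
        "size": 0,  # only aggregation results are needed
        "aggs": {
            "sliding": {
                "date_range": {
                    "field": timestamp_field,
                    "ranges": overlapping_ranges(),
                },
                "aggs": {"avg_metric": {"avg": {"field": metric_field}}},
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_request(), indent=2))
```

The printed JSON can be sent as the body of a search request. This only automates writing the ranges on the client side; it is not the automatic range creation requested here.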

@nik9000 nik9000 added the :Analytics/Aggregations Aggregations label Dec 29, 2020
@elasticmachine elasticmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Dec 29, 2020
@elasticmachine
Collaborator

Pinging @elastic/es-analytics-geo (Team:Analytics)

@nik9000
Member

nik9000 commented Dec 29, 2020

I've labeled this for triage by aggs folks.

I know I can use the date_range aggregation by manually specifying the ranges, but I was hoping for more automatic range creation.

For what it is worth, come 7.11 Elasticsearch internally rewrites date_histogram into a date_range aggregation in lots of cases (#63643). So, at least in 7.11, your workaround isn't going to execute any slower.

@jimczi jimczi removed the needs:triage Requires assignment of a team area label label Jan 12, 2021
@bradyasana

Going to pile on with a similar feature request: it'd be great if you could accumulate the documents in the buckets of a date_histogram aggregation, such that bucket 2 would contain all documents from bucket 1, etc. While you can accumulate inner metrics across date_histogram buckets, you can't accumulate the documents themselves.

This would be particularly beneficial if you could access the bucket keys in the sub-aggregations (#56392).
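
Until something like this exists, one possible approximation is a date_range aggregation whose ranges all share the same start, so each bucket contains every document from the buckets before it. A rough sketch, assuming hourly steps over the last 12 hours and a hypothetical `@timestamp` field (the helper and key names are also illustrative):

```python
def cumulative_ranges(buckets=12, step_hours=1):
    """Build date_range entries that all start at the same instant.

    Range i (1-based) covers the first i steps of the overall window,
    so it includes every document matched by the ranges before it.
    """
    total = buckets * step_hours
    ranges = []
    for i in range(1, buckets + 1):
        end = total - i * step_hours
        # Date math: plain "now" once the window reaches the present.
        to = "now" if end == 0 else f"now-{end}h"
        ranges.append(
            {"key": f"through_step_{i}", "from": f"now-{total}h", "to": to}
        )
    return ranges


# Hypothetical request body; "@timestamp" is an assumed field name.
request_body = {
    "size": 0,
    "aggs": {
        "cumulative": {
            "date_range": {"field": "@timestamp", "ranges": cumulative_ranges()}
        }
    },
}
```

Sub-aggregations placed under `cumulative` would then run over the accumulated documents of each range rather than over accumulated metric values.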

@imotov
Contributor

imotov commented Oct 11, 2021

We are going to address that as a part of #74660 (sliding window aggregation). We can discuss if we want to make it applicable to non-TSDB indices or not.

@tobiasstadler
Contributor Author

My use case only involves time series data, so I am fine with TSDB indices (only).

@tobiasstadler
Contributor Author

tobiasstadler commented Oct 11, 2021

@imotov Is there a timeline for when the sliding window aggregation will be available?

@imotov
Contributor

imotov commented Oct 11, 2021

@tobiasstadler we are actively working on it and there is an internal timeline for it. Unfortunately, I cannot share it externally. I can only suggest watching the public issue #74660 to see how this work is progressing.

@tobiasstadler
Contributor Author

I am looking forward to it.

@wchaparro
Member

Based on our recent internal discussion on this, we plan on introducing this as a sliding window aggregation focused on TSDB indices.

We also discussed how we might introduce a sliding window aggregation for non-TSDB indices as a general use case (e.g., show me logs where we had 404 errors over a 3-hour time window, and then show it to me for the next hour). We would introduce this as a separate, distinct aggregation supporting a fixed time interval.

@wchaparro
Member

Closing per prior comment.

@wchaparro wchaparro closed this as not planned Mar 29, 2023

7 participants