Add a time series aggregation to tsdb track #348

Closed
19 changes: 19 additions & 0 deletions tsdb/challenges/default.json
@@ -3,6 +3,20 @@
"description": "Indexes the whole document corpus.",
"default": true,
"schedule": [
{
"name":"increase-max_buckets_setting",
"tags": ["setup"],
"operation": {
"operation-type": "raw-request",
"method": "PUT",
"path": "/_cluster/settings",
"body": {
"transient": {
"search.max_buckets" : 300000
}
}
}
},
{%- if ingest_mode is defined and ingest_mode == "data_stream" %}
{
"name": "put-timestamp-pipeline",
@@ -123,6 +137,11 @@
"operation": "date-histo-entire-range",
"warmup-iterations": 50,
"iterations": 100
},
{
"operation": "date-histo-with-time-series-1h",
"warmup-iterations": 50,
"iterations": 100
}
]
},
45 changes: 44 additions & 1 deletion tsdb/operations/default.json
@@ -74,7 +74,11 @@
{
"name": "date-histo-entire-range",
"operation-type": "search",
{%- if ingest_mode is defined and ingest_mode == "data_stream" %}
"index": "k8s",
{%- else %}
"index": "tsdb",
{%- endif %}
"body": {
"size": 0,
"aggs": {
@@ -86,4 +90,43 @@
}
}
}
}
},
{
"name": "date-histo-with-time-series-1h",
"operation-type": "search",
{%- if ingest_mode is defined and ingest_mode == "data_stream" %}
"index": "k8s",
{%- else %}
"index": "tsdb",
{%- endif %}
"body": {
"size": 0,
"aggs": {
"by_timestamp": {
"date_histogram": {
"field": "@timestamp",
"fixed_interval": "1h"
},
"aggs": {
"ts": {
"time_series": {
"keyed": false
},
"aggs": {
"available_memory": {
"min": {
"field": "kubernetes.node.memory.available.bytes"
}
}
}
},
"min_available_memory": {
Contributor

I am not sure why we are also including this pipeline aggregation. Do we expect to see changes in its performance if we introduce an optimization for the time_series aggregator?

Member Author

I added this because I think a pipeline aggregator is part of the usage pattern with the time series aggregation.
Returning all the time series buckets is information overload; taking the min, max, or derivative across all time series buckets gives useful information to plot in a graph.

Member Author

Or do you think we should just focus on the time series agg itself in the benchmark? That would mean removing the pipeline agg, and maybe even the date histogram, so that the operation runs only the time_series agg.

"min_bucket": {
"buckets_path": "ts>available_memory"
}
}
}
}
}
}
}
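
To make the review discussion above concrete, here is a minimal Python sketch of what the `min_bucket` pipeline aggregation with `"buckets_path": "ts>available_memory"` reduces for a single `date_histogram` bucket: it takes the `min` metric from every `time_series` sub-bucket and keeps only the smallest. The response fragment, the dimension name `kubernetes.node.name`, the timestamp, and the values are hypothetical and only illustrate the assumed shape of a `"keyed": false` response; this is not output from the benchmark.

```python
from typing import Optional


def min_bucket(parent_bucket: dict, buckets_path: str) -> Optional[float]:
    """Client-side imitation of a min_bucket pipeline aggregation.

    buckets_path has the form "<sibling_agg>><metric>", e.g. "ts>available_memory".
    """
    sibling_agg, metric = buckets_path.split(">")
    values = [
        bucket[metric]["value"]
        for bucket in parent_bucket[sibling_agg]["buckets"]
        # Assumption: buckets without a value are skipped, as with gap_policy "skip".
        if bucket[metric]["value"] is not None
    ]
    return min(values) if values else None


# Hypothetical one-hour histogram bucket holding three time series.
sample_bucket = {
    "key_as_string": "2021-04-28T18:00:00.000Z",
    "ts": {
        "buckets": [
            {"key": {"kubernetes.node.name": "node-a"}, "available_memory": {"value": 4096.0}},
            {"key": {"kubernetes.node.name": "node-b"}, "available_memory": {"value": 1024.0}},
            {"key": {"kubernetes.node.name": "node-c"}, "available_memory": {"value": None}},
        ]
    },
}

print(min_bucket(sample_bucket, "ts>available_memory"))
```

The sketch shows why the pipeline step is cheap relative to the `time_series` aggregation itself: it is a single pass over buckets that have already been computed, so most of the measured cost would be expected to come from building the per-series buckets.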