Validate that dashboard filters run efficient elasticsearch queries #5003

tommyers-elastic · 2023-01-16T10:15:05Z

Filtering in dashboards can happen in several places, as detailed below. This issue exists to validate that the resulting Elasticsearch queries are as efficient as they can be. These queries are opaque to users and package developers, but they can be inspected in the UI.

For sections 1-3 below, the query structure is the same:

  "query": {
    "bool": {
      "filter": [
        {
          "match_phrase": {
            "cloud.account.id": "elastic-obs-integrations-dev"
          }
        }
      ],
      ...

For section 4, the query structure is slightly different:

  "query": {
    "bool": {
      "must": [
        {
          "bool": {
            "must": [],
            "filter": [
              {
                "bool": {
                  "should": [
                    {
                      "match_phrase": {
                        "agent.type": "metricbeat"
                      }
                    }
                  ],
                  "minimum_should_match": 1
                }
              }
            ],
           ...

1. Dashboard-level controls panels

2. Dashboard-level query filters

3. Panel-level query filters

4. Panel-level options filter (for non-Lens visualisations)

In addition, here is an example of a full request for a panel generating CPU metrics, filtered by datastream.

{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "@timestamp": {
              "gte": "2023-01-16T00:00:00.000Z",
              "lte": "2023-01-16T23:59:59.999Z",
              "format": "strict_date_optional_time"
            }
          }
        }
      ],
      "filter": [
        {
          "match_phrase": {
            "data_stream.dataset": "gcp.compute"
          }
        }
      ],
      "should": [],
      "must_not": []
    }
  },
  "aggs": {
    "61ca57f1-469d-11e7-af02-69e470af7417": {
      "terms": {
        "field": "cloud.instance.name",
        "order": {
          "61ca57f2-469d-11e7-af02-69e470af7417-SORT": "desc"
        }
      },
      "aggs": {
        "61ca57f2-469d-11e7-af02-69e470af7417-SORT": {
          "avg": {
            "field": "gcp.compute.instance.cpu.usage.pct"
          }
        },
        "timeseries": {
          "date_histogram": {
            "field": "@timestamp",
            "min_doc_count": 0,
            "time_zone": "Europe/London",
            "extended_bounds": {
              "min": 1673827200000,
              "max": 1673913599999
            },
            "fixed_interval": "5m"
          },
          "aggs": {
            "61ca57f2-469d-11e7-af02-69e470af7417": {
              "avg": {
                "field": "gcp.compute.instance.cpu.usage.pct"
              }
            }
          }
        }
      },
      "meta": {
        "timeField": "@timestamp",
        "panelId": "61ca57f0-469d-11e7-af02-69e470af7417",
        "seriesId": "61ca57f1-469d-11e7-af02-69e470af7417",
        "intervalString": "5m",
        "indexPatternString": "metrics-*"
      }
    }
  },
  "runtime_mappings": {}
}

ruflin · 2023-01-17T13:59:37Z

@jpountz It would be great to get someone from the Elasticsearch team to take a quick look at these queries. I'm bringing this up because I want to make sure that Elasticsearch internally converts the queries built by Kibana into efficient queries that use data_stream.dataset pre-filtering.

For example, there is a triple bool nesting with match_phrase, which I would not necessarily write like this by hand. The same goes for match_phrase itself, which I assume should be fine because of elastic/elasticsearch#85165 (thanks for the pointer).
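To make the nesting concern concrete, here is a minimal sketch of how the single-clause wrapper from the section 4 example could be collapsed by hand. It relies on the fact that, in a filter context where scores are ignored, a bool query whose only content is one should clause with minimum_should_match of 1 matches the same documents as that clause alone. The function name `unwrap_single_should` is hypothetical and not part of Kibana or Elasticsearch; whether Elasticsearch performs an equivalent rewrite internally is what this issue asks to confirm.

```python
from typing import Any

# Keys a bool query body may carry that this sketch knows how to reason about.
_KNOWN_KEYS = {"must", "filter", "must_not", "should", "minimum_should_match"}


def unwrap_single_should(clause: dict[str, Any]) -> dict[str, Any]:
    """Collapse {"bool": {"should": [X], "minimum_should_match": 1}} to X.

    Hypothetical helper. Only valid where scores are ignored (filter context).
    """
    body = clause.get("bool")
    if not isinstance(body, dict):
        return clause  # not a bool query, nothing to unwrap

    should = body.get("should", [])
    has_other_clauses = any(body.get(k) for k in ("must", "filter", "must_not"))
    has_unknown_keys = bool(set(body) - _KNOWN_KEYS)  # e.g. "boost"

    if (
        isinstance(should, list)
        and len(should) == 1
        and not has_other_clauses
        and not has_unknown_keys
        and body.get("minimum_should_match", 1) == 1
    ):
        # Recurse so stacked single-clause wrappers collapse as well.
        return unwrap_single_should(should[0])
    return clause
```

Applied to the inner filter entry of the section 4 example, this would return the bare match_phrase on agent.type. Wrappers with more than one should clause, or with any must, filter, or must_not content, are left untouched.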

ruflin · 2023-02-20T13:12:45Z

~~@dakrone Could someone in your team take a quick look at this?~~ 🤦 Sorry for the wrong ping

@martijnvg Could someone in your team take a quick look at this?

martijnvg · 2023-02-21T07:59:03Z

I personally wouldn't use match_phrase to filter on a constant keyword field. But it does what is expected, and it can be rewritten to a match-all-docs or a match-no-docs query just by looking at the mapping.

The change I would make to this query is to move the range query from the must clause to the filter clause. I don't think query-time boosting or scoring is needed here, since the requested size is 0. This would make the range query eligible for the query cache, so subsequent uses of the same range query in other search requests could reuse the cached result.
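The suggested change can be sketched as a small transformation over the request body. This is only an illustration of the idea, not how Kibana builds its requests: the function name `move_range_to_filter` is hypothetical, and the guard on `size == 0` is an assumption taken from the reasoning above that scoring is not needed when no hits are returned.

```python
import copy
from typing import Any


def _as_list(value: Any) -> list:
    """Bool clauses may be a single object or a list; normalise to a list."""
    if value is None:
        return []
    return list(value) if isinstance(value, list) else [value]


def move_range_to_filter(request: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the request with top-level range clauses moved
    from bool.must to bool.filter. Hypothetical helper for illustration."""
    result = copy.deepcopy(request)

    # Assumption: only safe when no hits are returned, so scores are unused.
    if result.get("size") != 0:
        return result

    bool_query = result.get("query", {}).get("bool")
    if not isinstance(bool_query, dict):
        return result

    must = _as_list(bool_query.get("must"))
    ranges = [clause for clause in must if "range" in clause]
    if not ranges:
        return result

    bool_query["must"] = [clause for clause in must if "range" not in clause]
    bool_query["filter"] = ranges + _as_list(bool_query.get("filter"))
    return result
```

For the full request in the issue description, this would leave `must` empty and place the `@timestamp` range next to the `data_stream.dataset` clause in `filter`. Requests with a non-zero size are returned unchanged.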

ruflin · 2023-02-22T14:00:38Z

@martijnvg As the queries are generated by Lens / Kibana, we don't have control over them. Pulling in @drewdaemon: Does this fit in your area, or who should we ping on the Kibana side to get this "reviewed"? @martijnvg Ideally you would work directly with Kibana on this, as it will affect all users who build queries in Kibana.

martijnvg · 2023-02-22T14:25:00Z

I'd like to mention that when no scoring is required, it is always better to use the filter clause of a boolean query instead of the must clause.

drewdaemon · 2023-02-22T15:25:47Z

@mattkime, @lukasolson, could one of you weigh in here?

lukasolson · 2023-02-27T19:56:49Z

From the examples shown in the description of the issue, I don't see any queries that are placed in a should (or must) clause that aren't enclosed in a higher level filter clause.

> The change I would make to this query, is to move the range query from the must clause to the filter clause.

@martijnvg Which query are you referring to? I don't see an example given where the range query isn't inside a filter clause.

martijnvg · 2023-03-06T08:11:06Z

@lukasolson I was referring to the collapsed search request at the end of the issue description. It has a range query in a must clause which could be moved to the filter clause.

Regarding the range query, how are shorter periods like 15 minutes defined? Are those periods rounded and if so how?

lukasolson · 2023-03-06T16:20:46Z

Ah, that makes sense. I've added that to our meta issue regarding query performance here: elastic/kibana#101041

Right now we aren't doing any rounding of the date ranges. If you select a period of "last 15 minutes", we convert that to an absolute time range in the browser (we don't send it as "now-15" or anything like that). We have an issue related to rounding here: elastic/kibana#94280
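As a rough illustration of why rounding matters for caching, the sketch below converts a relative window into an absolute range clause, with optional rounding. It is not Kibana's code: the function names and the choice to floor both bounds to the same step are assumptions made for this example. Without rounding, two requests a few seconds apart produce different range clauses; with rounding, they produce identical ones, which is the precondition for reusing a cached result.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def _floor(ts: datetime, step: timedelta) -> datetime:
    """Round a timezone-aware timestamp down to a multiple of step since the epoch."""
    return ts - ((ts - EPOCH) % step)


def _iso_millis(ts: datetime) -> str:
    """Format as UTC ISO 8601 with millisecond precision, as in the issue's example."""
    ts = ts.astimezone(timezone.utc)
    return ts.strftime("%Y-%m-%dT%H:%M:%S.") + f"{ts.microsecond // 1000:03d}Z"


def absolute_range(now: datetime, window: timedelta,
                   round_to: Optional[timedelta] = None) -> dict:
    """Build an absolute @timestamp range clause for "last <window>".

    Hypothetical helper. round_to=None mirrors the behaviour described above
    (no rounding); otherwise both bounds are floored to round_to.
    """
    start, end = now - window, now
    if round_to is not None:
        start, end = _floor(start, round_to), _floor(end, round_to)
    return {
        "range": {
            "@timestamp": {
                "gte": _iso_millis(start),
                "lte": _iso_millis(end),
                "format": "strict_date_optional_time",
            }
        }
    }
```

Flooring both bounds keeps the window length constant but can exclude the most recent partial interval. Other rounding strategies are possible, and this sketch does not reflect any decision made in the linked issue.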

ruflin · 2023-03-28T10:55:00Z

@lukasolson The issue elastic/kibana#101041 refers to Discover, but I assume all these optimisations would also be available to visualisations? Because that is where this issue initially comes from.

@martijnvg I plan to close this issue, as the goal was to get the discussion started around the performance of dashboards. I expect you and the team to keep watching this and potentially pushing it forward. The reason is that if Elasticsearch improves query speed but Kibana does not take advantage of it, the benefits will not be available to Elasticsearch users.

tommyers-elastic assigned ruflin Jan 16, 2023

ruflin closed this as completed Mar 28, 2023
