API for exposing query analysis #3276

aleks-p · 2024-05-03T21:39:23Z

Exposes a new API that provides insights into the amount of data in the queried time range, as well as the number of unique series reached via the label selector.

Example output (note that numbers are serialized as strings):

{
  "queryScopes": [
    {
      "componentType": "Short term storage",
      "componentCount": "15",
      "numBlocks": "169",
      "numSeries": "1423610",
      "numProfiles": "31250250",
      "numSamples": "2691866373",
      "indexBytes": "429009665",
      "profileBytes": "13782371141",
      "symbolBytes": "10660962037"
    },
    {
      "componentType": "Long term storage",
      "componentCount": "2",
      "numBlocks": "20",
      "numSeries": "845298",
      "numProfiles": "115831090",
      "numSamples": "10271806846",
      "indexBytes": "246238688",
      "profileBytes": "60527297722",
      "symbolBytes": "9774455978"
    }
  ],
  "queryImpact": {
    "totalBytesInTimeRange": "95420335231",
    "totalQueriedSeries": "436",
    "deduplicationNeeded": true
  }
}
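
Because the counters arrive as strings, a client has to convert them before doing arithmetic. The following is a minimal sketch of that, using a trimmed copy of the example output above. The helper name `scope_bytes` is hypothetical. That `totalBytesInTimeRange` equals the sum of the index, profile, and symbol bytes is an observation about this one example, not a documented guarantee of the API.

```python
import json

# Trimmed copy of the example output above; only the fields used here are kept.
RESPONSE = """
{
  "queryScopes": [
    {"componentType": "Short term storage", "indexBytes": "429009665",
     "profileBytes": "13782371141", "symbolBytes": "10660962037"},
    {"componentType": "Long term storage", "indexBytes": "246238688",
     "profileBytes": "60527297722", "symbolBytes": "9774455978"}
  ],
  "queryImpact": {"totalBytesInTimeRange": "95420335231",
                  "totalQueriedSeries": "436", "deduplicationNeeded": true}
}
"""

BYTE_FIELDS = ("indexBytes", "profileBytes", "symbolBytes")


def scope_bytes(scope):
    """Sum the string-encoded byte counters of one query scope (hypothetical helper)."""
    return sum(int(scope[field]) for field in BYTE_FIELDS)


analysis = json.loads(RESPONSE)
per_scope = {s["componentType"]: scope_bytes(s) for s in analysis["queryScopes"]}
total_reported = int(analysis["queryImpact"]["totalBytesInTimeRange"])

for name, size in per_scope.items():
    print(f"{name}: {size / 1e9:.1f} GB")

# Holds for the example above; treat it as an observation, not a contract.
print("sum matches reported total:", sum(per_scope.values()) == total_reported)
```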

To be extended in the future with an estimate of the query execution time and other statistics.

Closes #3001

kolesnikovae

Great job! 🎉

I believe this is a very good starting point for making the read path more transparent. Ideally, we could also collect the actual execution statistics and report them along the way with the query results (like EXPLAIN ANALYZE in SQL).

Querying series might be fairly expensive in some cases (e.g., if there are high cardinality labels in the data set), therefore we should be careful calling the analysis. Also, Series API reports matching series in the block not accounting for the time range, which should not pose an issue in the vast majority of cases. Just a clarification: for example, you may query 15 minutes and get no data, and the analysis will tell you that there are 5 matching series.
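
To make the discrepancy concrete, here is a toy model of the behaviour described in this comment. It is only an illustration: the `Series` class, the label names, and the timestamps are all made up, and nothing here reflects how the Series API is actually implemented.

```python
from dataclasses import dataclass


@dataclass
class Series:
    labels: dict
    timestamps: list  # sample times in seconds (hypothetical data)


def matches(series, selector):
    """True if every label in the selector equals the series label."""
    return all(series.labels.get(k) == v for k, v in selector.items())


def block_level_count(block, selector):
    """Count matching series in the block, ignoring the time range."""
    return sum(1 for s in block if matches(s, selector))


def time_aware_count(block, selector, start, end):
    """Count matching series with at least one sample in [start, end]."""
    return sum(
        1
        for s in block
        if matches(s, selector) and any(start <= t <= end for t in s.timestamps)
    )


# One block covering an hour; all five series stop reporting after 30 minutes.
block = [
    Series({"service_name": "api", "pod": f"pod-{i}"}, [0, 900, 1800])
    for i in range(5)
]
selector = {"service_name": "api"}

print(block_level_count(block, selector))             # 5
print(time_aware_count(block, selector, 2700, 3600))  # 0
```

Querying the last 15 minutes of this block returns no data, yet a block-level count still reports 5 matching series.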


aleks-p · 2024-05-06T14:52:18Z

Thanks @kolesnikovae.

> Ideally, we could also collect the actual execution statistics and report them along the way with the query results (like EXPLAIN ANALYZE in SQL).

Agreed. The purpose of this first iteration is to provide an efficient endpoint that could be used before the actual query, serving as a sanity check for the query itself.
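
A client-side sanity check of this kind could look roughly like the sketch below. The function name `should_run_query` and both thresholds are hypothetical. Only the `queryImpact` field names and the sample values come from the example output in the description.

```python
def should_run_query(analysis, max_bytes, max_series):
    """Return (ok, reasons) for a parsed analysis response.

    Hypothetical helper: the thresholds are chosen by the caller and are
    not part of the API.
    """
    impact = analysis["queryImpact"]
    # Counters are serialized as strings, so convert before comparing.
    total_bytes = int(impact["totalBytesInTimeRange"])
    total_series = int(impact["totalQueriedSeries"])

    reasons = []
    if total_bytes > max_bytes:
        reasons.append(f"time range holds {total_bytes} bytes (limit {max_bytes})")
    if total_series > max_series:
        reasons.append(f"selector reaches {total_series} series (limit {max_series})")
    return (not reasons, reasons)


# Values taken from the example output in the description.
analysis = {
    "queryImpact": {
        "totalBytesInTimeRange": "95420335231",
        "totalQueriedSeries": "436",
        "deduplicationNeeded": True,
    }
}

ok, reasons = should_run_query(analysis, max_bytes=50 * 10**9, max_series=1000)
print(ok)       # False: about 95 GB is above the 50 GB limit
print(reasons)
```

As noted earlier in the thread, the series count is not restricted to the time range, so a check like this should treat it as an upper bound rather than an exact figure.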

> Series API reports matching series in the block not accounting for the time range

I didn't know Series doesn't respect the start/end (aside from validity checks), so this can indeed result in some inconsistencies. Thanks for the heads-up; we'll need to take the numbers with a grain of salt for now.

> Querying series might be fairly expensive in some cases (e.g., if there are high cardinality labels in the data set), therefore we should be careful calling the analysis.

I added tenant-level overrides for now, one for the entire endpoint (query_analysis_enabled, defaults to true), and one for the series portion of it (query_analysis_series_enabled, defaults to false).
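
The intent of the two overrides can be sketched as follows. This is an illustration of the gating logic only, not the code in this PR: the `TenantLimits` class, the `analyze_query` function, and the choice to raise an error when the endpoint is disabled are all assumptions. Only the two option names and their defaults come from the comment above.

```python
from dataclasses import dataclass


@dataclass
class TenantLimits:
    # Option names and defaults as described above; the class is hypothetical.
    query_analysis_enabled: bool = True
    query_analysis_series_enabled: bool = False


def analyze_query(limits, count_series):
    """Run the analysis for one tenant, honouring its overrides.

    `count_series` stands in for the potentially expensive series lookup.
    """
    if not limits.query_analysis_enabled:
        # Assumption: how the real endpoint responds when disabled is not stated.
        raise PermissionError("query analysis is disabled for this tenant")

    result = {"totalQueriedSeries": None}
    if limits.query_analysis_series_enabled:
        # Only pay for the series lookup when the tenant has opted in.
        result["totalQueriedSeries"] = count_series()
    return result


calls = []


def expensive_series_count():
    calls.append(1)
    return 436


default_result = analyze_query(TenantLimits(), expensive_series_count)
opted_in_result = analyze_query(
    TenantLimits(query_analysis_series_enabled=True), expensive_series_count
)
print(default_result)   # series lookup skipped with the defaults
print(opted_in_result)  # series lookup performed once
```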

aleks-p requested a review from a team as a code owner May 3, 2024 21:39

kolesnikovae approved these changes May 6, 2024


aleks-p added 24 commits May 6, 2024 11:20

Add API signature

f4a36e1

Implement AnalyzeQuery (wip)

b49e7b0

Implement AnalyzeQuery, part 2 (wip)

33169dc

Fix bug

160083c

Fix bug

6238b14

Reorder query scopes

832e79e

Add queries series stat

8dd77e2

Add queried series stat (take 2)

67d4062

Improve naming, query matchers handling

ffa1882

Simplify querier logic, add deduplication flag

8e6016e

Add basic validation

7efe587

Add basic validation (fix compile error)

275c340

Only return query scopes that are in use

9b629f9

Revert "Only return query scopes that are in use"

862c67e

This reverts commit 649c2cce206e95d01ab0a80b6f15450d96b38d97.

Improve readability, add unit tests, fix broken unit tests

ad8f968

Remove unused fields in proto spec

0a4eb34

Revert undesired change

690c16e

Improve method name

67e1017

Revert undesired change

a4e408f

Improve naming consistency

7769249

Remove unused function

1918b54

Speed up reference-help target

3ba9520

Add per-tenant overrides for query analysis

c86ff18

Change default config value for query_analysis_series_enabled

0deea6e

aleks-p force-pushed the feat/explain-query branch from 4983a17 to 0deea6e Compare May 6, 2024 14:20

aleks-p requested a review from a team as a code owner May 6, 2024 14:20

Update reference help

13ec9fe

aleks-p requested a review from korniltsev as a code owner May 6, 2024 14:21

aleks-p added 2 commits May 6, 2024 12:22

Improve naming consistency, fix flaky test

0fe77c8

Revert change to ebpf/testdata

33e3926

aleks-p merged commit f4f2c43 into main May 6, 2024
16 checks passed

aleks-p deleted the feat/explain-query branch May 6, 2024 16:04
