Skip to content
This repository has been archived by the owner on Apr 2, 2024. It is now read-only.

Commit

Permalink
Fix broken pushdown queries
Browse files Browse the repository at this point in the history
Under certain circumstances, pushdown queries using promscale-extension
aggregates fail.

The failure comes from the fact that the start and end parameters passed
to the aggregate don't correctly align with the actual data which is
aggregated over by the query.

A call to the prom_rate aggregate looks something like the following:

SELECT
    prom_rate(start, end, step_size, range, sample_time, sample_value)
FROM a_metric
WHERE a_metric.t >= data_start
  AND a_metric.t <= data_end

If the time span [data_start, data_end] delivers results which are
outside of the time span [start, end], the aggregate function errors.

In general, this bug appeared when an aggregate pushdown function (such
as PromQL rate) is used together with a vector selector, such that the
lookback of the vector selector is larger than that of the aggregate
pushdown.

The fix is to build the SQL query using the correct values for
data_start and data_end.
  • Loading branch information
JamesGuthrie committed Feb 10, 2022
1 parent 9e90a38 commit 2904a7f
Show file tree
Hide file tree
Showing 3 changed files with 61 additions and 2 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,7 @@ We use the following categories for changes:
### Fixed
- Fix spans with end < start. Start and end are swapped in this case. [#1096]
- Disable push downs which use `offset`, as they are broken [#1129]
- Fix aggregate pushdown evaluation [#1098]

## [0.9.0] - 2022-02-02

Expand Down
54 changes: 52 additions & 2 deletions pkg/pgmodel/querier/query_builder_samples.go
Original file line number Diff line number Diff line change
Expand Up @@ -163,13 +163,63 @@ func buildSingleMetricSamplesQuery(metadata *evalMetadata) (string, []interface{
orderByClause = "ORDER BY series_id, time"
}
}

// On query time ranges in Prometheus:
//
// When we receive a range query the [`start`, `end`] range specifies the
// timespan that we are expected to deliver results for. Similarly, when we
// receive an instant query, the `time` parameter specifies the instant
// that we are expected to deliver a result for.
//
// Depending on the query, we may need to look further back in time than
// the `start` to deliver results from `start` onwards. For instance, a
// range query such as `rate(metric_one[5m])` in the range `(T1, T2)` will
// deliver results between T1 and T2, but it has a 5-minute range on
// `metric_one`. In order to deliver a result for time T1, we need to get
// `metric_one`'s values at (T1 - 5m) and T1. Similarly, an instant query
// for `metric_one` at timestamp T3 requires looking back the lookback time
// (the default is 5 minutes) to find the most recent samples.
// We call this point in time `scan_start`: the point in time at which we
// need to start scanning the underlying data to calculate the result. This
// could look as follows:
//
// scan_start start end
// |-----------|--------------------|
//
// How do the following time ranges relate to scan_start, start, and end:
// - metadata.timeFilter
// - metadata.selectHints.{Start, End}
//
// The timeFilter's time range is determined by `findMinMaxTime` on the
// expression being evaluated, and is the widest span of _all_
// subexpressions (effectively [min(scan_start), end]).
// This means that, for instance, if the following range selector and
// instant selector are combined: rate(metric_one[1m]) / metric_one, the
// timeFilter.start is T1 - 5m (assuming default lookback time of 5m).
// Note: This is problematic because for metric_one[1m] we actually want to
// query over [T1 - 1m, end], not [T1 - 5m, end].
//
// The selectHints' time range is calculated by `getTimeRangesForSelector`,
// which determines the correct `scan_start` for the current expression.

filter := metadata.timeFilter
sh := metadata.selectHints
var start, end string
// selectHints are non-nil when the query was initiated through the `query`
// or `query_range` endpoints. They are nil when the query was initiated
// through the `read` (remote read) endpoint.
if sh != nil {
start, end = toRFC3339Nano(sh.Start), toRFC3339Nano(sh.End)
} else {
start, end = metadata.timeFilter.start, metadata.timeFilter.end
}

finalSQL := fmt.Sprintf(template,
pgx.Identifier{filter.schema, filter.metric}.Sanitize(),
pgx.Identifier{schema.PromDataSeries, filter.seriesTable}.Sanitize(),
strings.Join(cases, " AND "),
filter.start,
filter.end,
start,
end,
strings.Join(selectorClauses, ", "),
strings.Join(selectors, ", "),
orderByClause,
Expand Down
8 changes: 8 additions & 0 deletions pkg/tests/end_to_end_tests/promql_query_endpoint_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -2510,6 +2510,14 @@ func TestPromQLQueryEndpoint(t *testing.T) {
name: "delta function over range selector with offset",
query: `delta(metric_2[1m] offset 3m)`,
},
{
name: "pushdown with range smaller than instant vector lookback",
query: `rate(metric_2[1m]) / metric_2`,
},
{
name: "pushdown with range smaller than instant vector lookback and offset",
query: `rate(metric_2[1m]) / metric_2 offset 2m`,
},
}
start := time.Unix(startTime/1000, 0)
end := time.Unix(endTime/1000, 0)
Expand Down

0 comments on commit 2904a7f

Please sign in to comment.