Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

frontend: update query pruning #10026

Merged
merged 5 commits into from
Dec 3, 2024
Merged

frontend: update query pruning #10026

merged 5 commits into from
Dec 3, 2024

Conversation

krajorama
Copy link
Contributor

@krajorama krajorama commented Nov 26, 2024

What this PR does

The current method of excluding/including sub query results in PromQL by comparing to -Inf or +Inf is no longer valid after prometheus/prometheus#15245 due to comparison of native histograms to a float with < or > result in Jeanette's warning, not empty set.

To ease migration away from the old wrong logic, I'm adding the new logic first.

The new method uses logical AND operation to intersect the sub query with either a const vector() or an empty vector(). E.g.

subquery and on() (vector(1)==1)

which becomes:

subquery

and conversely

subquery and on() (vector(-1)==1)

becomes

(vector(-1)==1)

Note

We cannot just drop (vector(-1)==1) altogether , because of this example:

(other_subquery) and (subquery and on() (vector(-1)==1))

removal of the whole right side would allow results from the other_subquery which is not correct. Solving this is out of scope.

Which issue(s) this PR fixes or relates to

Fixes N/A

Checklist

  • Tests updated.
  • N/A Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
  • N/A about-versioning.md updated with experimental features.

@krajorama krajorama force-pushed the krajo/change-pruneinig branch from 9dcc588 to c70d552 Compare November 26, 2024 10:24
@krajorama krajorama marked this pull request as ready for review November 26, 2024 10:35
@krajorama krajorama requested a review from a team as a code owner November 26, 2024 10:35
@krajorama krajorama marked this pull request as draft November 26, 2024 12:11
@krajorama krajorama force-pushed the krajo/change-pruneinig branch 2 times, most recently from 97a767d to 98d8813 Compare November 26, 2024 12:45
@krajorama krajorama marked this pull request as ready for review November 26, 2024 12:46
Copy link
Contributor

@charleskorn charleskorn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can further prune the expression in some cases:

{
// "and on()" is not on top level, "or" has lower precedence.
`(avg(rate(foo[1m]))) and on() (vector(0) == 1) or avg(rate(bar[1m]))`,
`(vector(0) == 1) or avg(rate(bar[1m]))`,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't we simplify this further, to avg(rate(bar[1m]))?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's more complicated. The main point is to avoid loading chunks and (vector(0)==1) doesn't load chunks, so should be very efficient.
We can add that logic in a new PR, seems to be independent from this optimization, wdyt?
But having two algorithms raises the question of whether they need to be run repeatedly to optimize away everything, so it gets even more complicated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that removing (vector(0)==1) from an OR expression is different from removing it from an AND expression, which is what I mean by more complicated . Outside my scope.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a test for this? We should document that in a test so nobody comes tomorrow and "optimizes further".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

having two algorithms raises the question of whether they need to be run repeatedly to optimize away everything, so it gets even more complicated.

If we optimise the leaf node expressions first and work our way back up the tree, can't we do both and and or in one pass?

But if you want to limit this to and and come back to or later, that's fine by me, please just make this clear with a test or a comment as Oleg suggests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adding both

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, actually I had 3 tests on this already so just improved the comments

{
// "and on()" is not on top level, due to left-right associativity.
`(avg(rate(foo[1m]))) and on() (vector(0) == 1) and avg(rate(bar[1m]))`,
`(vector(0) == 1) and avg(rate(bar[1m]))`,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't we simplify this further, to vector(0) == 1?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

out of scope

},
{ // The const expression is on the wrong side.
`(vector(0) == 1) and on() (avg(rate(foo[1m])))`,
`(vector(0) == 1) and on() (avg(rate(foo[1m])))`,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as below - I think this can be simplified to vector(0) == 1.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of scope, not a useful expression in the first place . Also true is it had vector(1).

},
{ // Matching on labels.
`(avg(rate(foo[1m]))) and on(id) (vector(0) == 1)`,
`(avg(rate(foo[1m]))) and on(id) (vector(0) == 1)`,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't this be simplified to vector(0) == 1? The right side will never produce a result, so we'll never return anything from the left side.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

out of scope

},
{ // Not "on" expression.
`(avg(rate(foo[1m]))) and ignoring() (vector(0) == 1)`,
`(avg(rate(foo[1m]))) and ignoring() (vector(0) == 1)`,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above - the right side will never produce a result.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

out of scope

Copy link
Contributor

@beorn7 beorn7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good at first glance. (I assume you don't want a code-level review from me, just that the idea is correct.)

@krajorama krajorama force-pushed the krajo/change-pruneinig branch 2 times, most recently from a1440c4 to 6901f7d Compare November 29, 2024 08:55
The current method of excluding/including sub query results in PromQL
by comparing to -Inf or +Inf is no longer valid after
prometheus/prometheus#15245
due to comparison of native histograms to a float with < or > result
in Jeanette's warning, not empty set.

The new method uses logical AND operation to intersect the sub query with
either a const vector() or an empty vector(). E.g.

subquery and on() (vector(1)==1)
subquery and on() (vector(-1)==1)

which become:

subquery
(vector(-1)==1)

Note that although in theory (vector(-1)==1) could be dropped in some
cases, it depends on the context and out of scope for this PR.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
CHANGELOG.md Outdated Show resolved Hide resolved
pkg/frontend/querymiddleware/astmapper/pruning.go Outdated Show resolved Hide resolved
Comment on lines 82 to 90
var lIsConst bool
lIsConst, lValue = pruner.isNumber(lhs)
if !lIsConst {
return false, false
}

return expr
rIsVector, rValue := pruner.isConstVector(rhs)
if !rIsVector {
return false, false
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit] Given this is repeating the same check as the if block above, but with the left and right sides swapped, have you considered extracting this to a method like isVectorAndConstant and calling it twice? (eg. return isVectorAndConstant(lhs, rhs) || isVectorAndConstant(rhs, lhs))

This might make it clearer that we're trying to check for the same thing in either order.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated, ptal

pkg/frontend/querymiddleware/prune_test.go Outdated Show resolved Hide resolved
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
@krajorama krajorama merged commit 6ba0346 into main Dec 3, 2024
29 checks passed
@krajorama krajorama deleted the krajo/change-pruneinig branch December 3, 2024 07:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants