Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add on and ignoring clauses in binOpExpr #4391

Merged
merged 6 commits into from
Oct 5, 2021

Conversation

garrettlish
Copy link
Contributor

@garrettlish garrettlish commented Sep 28, 2021

What this PR does / why we need it:
Add on and ignoring clauses in binOpExpr, the on and ignoring modifier allows binary operation against different labelset.

Which issue(s) this PR fixes:
Fixes TODO TODO(owen-d): add (on,ignoring) clauses to binOpExpr

Special notes for your reviewer:

Checklist

  • Documentation added
  • Tests updated

@garrettlish garrettlish requested a review from a team as a code owner September 28, 2021 12:03
@CLAassistant
Copy link

CLAassistant commented Sep 28, 2021

CLA assistant check
All committers have signed the CLA.

@garrettlish
Copy link
Contributor Author

@owen-d this PR is target to resolve the TODO item - TODO(owen-d): add (on,ignoring) clauses to binOpExpr, it could be start of the left/right join support in LogQL as well. Any thoughts, please advise. Thanks!

@cyriltovena
Copy link
Contributor

@owen-d let's make sure querysharding is happy with this.

@@ -675,7 +675,11 @@ func TestMapping(t *testing.T) {
{
in: `1 + sum by (cluster) (rate({foo="bar"}[5m]))`,
expr: &BinOpExpr{
op: OpTypeAdd,
op: OpTypeAdd,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need example with this new operator. The binary operation should never be shardable so I think we're fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @cyriltovena for your comments, make sense, added example for new operator in shardmapper unit test - 9cb1a81

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's currrently implemented like this:

func (e *BinOpExpr) Shardable() bool {
	return shardableOps[e.op] && e.SampleExpr.Shardable() && e.RHS.Shardable()
}

However, I think we should prohibit sharding when we're changing the label groupings such as in on or without as it effectively changes the labels and thus invalidates correctness guarantees in sharded aggregation. @garrettlish can you make the change to the Shardable method to return false if either on or without are specified?

Copy link
Contributor

@cyriltovena cyriltovena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my perspective the addition looks good. It would be great to have a doc addition though.

I let @owen-d give the final word.

@owen-d
Copy link
Member

owen-d commented Sep 29, 2021

Awesome, I'll give this a review soon :)

Copy link
Member

@owen-d owen-d left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @garrettlish, this looks great. Thanks for taking care of my todo for me 🎉 . I've left a few comments I'd like addressed, but then I'm happy to merge it. Really great job uncovering the complexity in here and navigating it well. I expect after this change, we'll need to implement group_{left,right} soon 😁 .

pair := pairs[hash]
if pair[i] != nil {
failStepEvaluator(eval, errors.New("multiple matches for labels"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, was looking for this. It would be great to include a test case that hits this condition. However, instead of trying to mutate the underlying Evaluator by type switching, can we instead use a scoped error that the is set here and checked in the func() error part of the newStepEvaluator (https://github.com/grafana/loki/blob/main/pkg/logql/evaluator.go#L604)?

@@ -1017,3 +1033,17 @@ func absentLabels(expr SampleExpr) labels.Labels {
}
return m
}

// failStepEvaluator marks the step evaluator as failed
func failStepEvaluator(se StepEvaluator, err error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer not to mutate via type switching here, both because mutation can make for a trickier codebase and because type switching is hard to extend well (for instance, if we add newer StepEvaluator implementations).

pkg/logql/expr.y Outdated
@@ -33,7 +33,8 @@ import (
str string
duration time.Duration
LiteralExpr *LiteralExpr
BinOpModifier BinOpOptions
BinOpModifier *BinOpOptions
BoolModifier *BinOpOptions
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: formatting :)

pkg/logql/expr.y Outdated
| boolModifier ON OPEN_PARENTHESIS CLOSE_PARENTHESIS
{
$$ = $1
$$.VectorMatching = &VectorMatching{On: true, Include: []string{}}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I suspect this can be nil here

Suggested change
$$.VectorMatching = &VectorMatching{On: true, Include: []string{}}
$$.VectorMatching = &VectorMatching{On: true, Include: nil}

pkg/logql/expr.y Outdated
| boolModifier IGNORING OPEN_PARENTHESIS CLOSE_PARENTHESIS
{
$$ = $1
$$.VectorMatching = &VectorMatching{On: false, Include: []string{}}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I suspect this can be nil here

Suggested change
$$.VectorMatching = &VectorMatching{On: false, Include: []string{}}
$$.VectorMatching = &VectorMatching{On: false, Include: nil}

@@ -675,7 +675,11 @@ func TestMapping(t *testing.T) {
{
in: `1 + sum by (cluster) (rate({foo="bar"}[5m]))`,
expr: &BinOpExpr{
op: OpTypeAdd,
op: OpTypeAdd,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's currrently implemented like this:

func (e *BinOpExpr) Shardable() bool {
	return shardableOps[e.op] && e.SampleExpr.Shardable() && e.RHS.Shardable()
}

However, I think we should prohibit sharding when we're changing the label groupings such as in on or without as it effectively changes the labels and thus invalidates correctness guarantees in sharded aggregation. @garrettlish can you make the change to the Shardable method to return false if either on or without are specified?

@garrettlish
Copy link
Contributor Author

@owen-d thanks for your good comments, resolved all of the comments via f36957c

Copy link
Member

@owen-d owen-d left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! One little request then I think we're ready to merge :)

Thanks for the great PR!

@@ -998,6 +998,10 @@ func (e *BinOpExpr) String() string {

// impl SampleExpr
func (e *BinOpExpr) Shardable() bool {
if e.opts != nil && e.opts.VectorMatching != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: It'd be great to have a sharding test to ensure this

Suggested change
if e.opts != nil && e.opts.VectorMatching != nil {
// `without ()` is the zero value for VectorMatching and is also equivalent
// to not specifying a VectorMatching at all and thus can
// technically pass to the next expression for sharding concerns.
var zero VectorMatching
if e.opts != nil && e.opts.VectorMatching != nil && *e.opts.VectorMatching != zero {

Copy link
Contributor Author

@garrettlish garrettlish Oct 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@owen-d
We cannot use *e.opts.VectorMatching != zero to check whether VectorMatching struct is empty, it is because VectorMatching has Include []string, it will cause Invalid operation: *e.opts.VectorMatching != zero (the operator != is not defined on VectorMatching).
I thought e.opts != nil && e.opts.VectorMatching != nil should be good enough since e.opts.VectorMatching is a pointer? Am I missing anything? Any thoughts, please advise! Thanks!

type BinOpOptions struct {
	ReturnBool     bool
	VectorMatching *VectorMatching
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added sharding test - 0ab8264 and https://github.com/garrettlish/loki/blob/onOrIgnoring/pkg/logql/shardmapper_test.go#L984, could you please help review it again? Thanks!

Copy link
Member

@owen-d owen-d Oct 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah bummer, slice comparison 😭 . Anyway, e.opts != nil && e.opts.VectorMatching != nil should be fine. I was mainly protecting us from preventing sharding when the expression used without (), which is the same as not specifying without at all. I don't think that should block this PR though. Great work, LGTM!

Copy link
Contributor

@KMiller-Grafana KMiller-Grafana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't get the git suggestion to take my entire suggested reorg of just the docs part of this PR. It wants to reformat some of the suggestion as comment (instead of suggestion).

I'd like you to implement this rewrite. I'd like you to then further add a sentence or two just above each of the examples describing what the example demonstrates. Without that, doc readers really only see an example of the syntax.

@KMiller-Grafana
Copy link
Contributor

My rewrite of the docs addition for file docs/sources/logql/_index.md:

Keywords on and ignoring

The ignoring keyword causes specified labels to be ignored during matching.
The syntax:

<vector expr> <bin-op> ignoring(<labels>) <vector expr>

Example:

sum by(machine) (count_over_time({app="foo"}[1m])) > bool ignoring(machine) sum(count_over_time({app="bar"}[1m]))

The on keyword reduces the set of considered labels to a specified list.
The syntax:

<vector expr> <bin-op> on(<labels>) <vector expr>

Example:

sum by(app,machine) (count_over_time({app="foo"}[1m])) + on(app) sum by (app) (count_over_time({app="bar"}[1m]))

@KMiller-Grafana
Copy link
Contributor

Sorry @garrettlish, github formatted my rewrite. I wanted to give you the source to use.

@garrettlish
Copy link
Contributor Author

@KMiller-Grafana thanks for your comments, reformat the doc based on your suggestion - 0ab8264, please help review it again, thanks!

Copy link
Contributor

@KMiller-Grafana KMiller-Grafana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the doc rewrite. One more item is needed. Please add a sentence or 2 just before each example, describing what the example does. Without that, an example becomes a syntax example. A sentence or two will help docs readers to understand the larger picture of what the example shows, and why these keywords are useful.

@garrettlish
Copy link
Contributor Author

@KMiller-Grafana sorry, I missed your first comment 😭, added description for each example: b837d8e

garrettlish and others added 4 commits October 2, 2021 08:24
* use a scoped error to replace mutate stepEvaluator via type switching
* change VectorMatching Include from empty string array to nil
* prohibit sharding when we're changing the label groupings such as on or ignoring
* add unit test to cover multiple matches for labels error
* fix expr.y format issue
* reformat on and ignoring keywords doc
* add sharding unit test for on and ignoring keywords
@owen-d owen-d dismissed KMiller-Grafana’s stale review October 5, 2021 14:22

will amend docs in a second pass

@owen-d owen-d merged commit 4417340 into grafana:main Oct 5, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants