Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ES|QL: Skip CASE function from InferIsNotNull rule checks #113123

Merged
merged 11 commits into from
Sep 20, 2024

Conversation

astefan
Copy link
Contributor

@astefan astefan commented Sep 18, 2024

Fixes #112704 by excluding case function from the InferIsNotNull local optimizer rule, because of the potential complexity of the case logic conditions.

@astefan astefan requested a review from a team as a code owner September 18, 2024 14:42
@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Sep 18, 2024
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine
Copy link
Collaborator

Hi @astefan, I've created a changelog YAML for you.

Copy link
Contributor

@bpintea bpintea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm

@@ -106,6 +107,6 @@ private boolean doResolve(Expression exp, AttributeMap<Expression> aliases, Set<
}

private static boolean skipExpression(Expression e) {
return e instanceof Coalesce;
return e instanceof Coalesce || e instanceof Case;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: in case any other change is required to the PR, maybe move the comment from the call site as function javadoc/comment.

Comment on lines +487 to +517
public void testIsNotNullOnCoalesce() {
var plan = localPlan("""
from test
| where coalesce(emp_no, salary) is not null
""");

var limit = as(plan, Limit.class);
var filter = as(limit.child(), Filter.class);
var inn = as(filter.condition(), IsNotNull.class);
var coalesce = as(inn.children().get(0), Coalesce.class);
assertThat(Expressions.names(coalesce.children()), contains("emp_no", "salary"));
var source = as(filter.child(), EsRelation.class);
}

public void testIsNotNullOnExpression() {
var plan = localPlan("""
from test
| eval x = emp_no + 1
| where x is not null
""");

var limit = as(plan, Limit.class);
var filter = as(limit.child(), Filter.class);
var inn = as(filter.condition(), IsNotNull.class);
assertThat(Expressions.names(inn.children()), contains("x"));
var eval = as(filter.child(), Eval.class);
filter = as(eval.child(), Filter.class);
inn = as(filter.condition(), IsNotNull.class);
assertThat(Expressions.names(inn.children()), contains("emp_no"));
var source = as(filter.child(), EsRelation.class);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(just checking: these are moved only, right?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, sorry. I forgot to add a comment after I created the PR.

Copy link
Contributor

@luigidellaquila luigidellaquila left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @astefan, it works and it fixes a problem, so it's a 👍 for me, but I'd like to have a bit more context (at least a comment) on the logic behind this functions blacklist

@@ -106,6 +107,6 @@ private boolean doResolve(Expression exp, AttributeMap<Expression> aliases, Set<
}

private static boolean skipExpression(Expression e) {
return e instanceof Coalesce;
return e instanceof Coalesce || e instanceof Case;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to understand why these two functions are special, at least with a comment, and eventually extract a general rule (a flag interface or a method in the Expression class to identify the functions that fall into this category).
The only patterns I can see are:

  • both have Nullability.UNKNOWN (but also BinaryLogic does, yet it's not listed here). The javadoc mentions unknown nullability as well: Simplify IsNotNull targets by resolving the underlying expression to its root fields with unknown nullability...
  • both return the value of one of the expressions passed as params (but other functions do as well, eg. GREATEST())

Can you please elaborate a bit on this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After spending some time on this issue when it was initially reported, my thinking is the following:

  • Nullability has nothing to do with this optimization. Nullability is most useful in the LogicalPlanOptimizer to shortcircuit folding to null expressions. At this point, in the local logical optimizer, the Nullability did its job.
  • the fact that Nullability is the same (and unknown) for these two functions (for coalesce it can also be false) is a fact indeed, but conceptually speaking I think it's not relevant
  • these two functions have special behavior in the sense that they can mess up the is not null predicate when it's used on them. They can accept complex expressions where the fields used can also be checked for null. For example, case (x is null,.....) is not null cannot be transformed in case(....) is not null AND x is not null because there is already an expression that contradicts x is null.
    • I have tried to come up with a smarter way of evaluating the expressions inside a case and determine if the fields there end up as x is null but it proved to be more complicated and I found this not worth the effort now. Maybe we can come back to this and be smarter about case and coalesce but we need a stronger reason to do it. Also, this bug is serious enough that it warrants this simple exclusion and fix.
  • the flag interface or a method in the Expression class at this point in the life of ES|QL (where we welcome any contribution from outside) is, imo, not warranted: we will make the behavior of a function more complex than it needs and it will confuse contributors even more. For now, I prefer to have this logic centralized in an optimization rule. Functions land should be a more approachable one, optimization rules are left for advanced logic and knowledge.

I will add a comment in the PR for this method, @bpintea also mentioned this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these two functions have special behavior in the sense that they can mess up the is not null

I don't completely agree with this statement. These two functions mess up with this rule, not with IsNotNull.

They can accept complex expressions where the fields used can also be checked for null

This is closer to the point IMHO: the real peculiarity is that these functions have a custom behavior with null inputs, ie. even if one input is null, the function can still be not null, so func(input1, input2, ...) IS NOT NULL is not equivalent to input1 IS NOT NULL and func(input1, input2, ...) IS NOT NULL.
I think we can easily express this as a method.

The danger comes from the fact that this is an implementation detail, peculiar to each function.
Or better, it's how our code generator creates evaluators (Coalesce and Case have a custom evaluator, not generated).
This makes the risk pretty low for now, that's why IMHO it's still good as a fix, but IMHO we should make this rule more robust as a follow-up.

the flag interface or a method in the Expression class at this point in the life of ES|QL (where we welcome any contribution from outside) is, imo, not warranted: we will make the behavior of a function more complex than it needs and it will confuse contributors even more

My concern is that a contributor could introduce a new function with a similar behavior, and will unintentionally break this rule.

Also, this bug is serious enough that it warrants this simple exclusion and fix.
💯

Copy link
Contributor

@luigidellaquila luigidellaquila left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, let's move it forward.
See also my comment below, IMHO there is margin for improvements

@@ -106,6 +107,6 @@ private boolean doResolve(Expression exp, AttributeMap<Expression> aliases, Set<
}

private static boolean skipExpression(Expression e) {
return e instanceof Coalesce;
return e instanceof Coalesce || e instanceof Case;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these two functions have special behavior in the sense that they can mess up the is not null

I don't completely agree with this statement. These two functions mess up with this rule, not with IsNotNull.

They can accept complex expressions where the fields used can also be checked for null

This is closer to the point IMHO: the real peculiarity is that these functions have a custom behavior with null inputs, ie. even if one input is null, the function can still be not null, so func(input1, input2, ...) IS NOT NULL is not equivalent to input1 IS NOT NULL and func(input1, input2, ...) IS NOT NULL.
I think we can easily express this as a method.

The danger comes from the fact that this is an implementation detail, peculiar to each function.
Or better, it's how our code generator creates evaluators (Coalesce and Case have a custom evaluator, not generated).
This makes the risk pretty low for now, that's why IMHO it's still good as a fix, but IMHO we should make this rule more robust as a follow-up.

the flag interface or a method in the Expression class at this point in the life of ES|QL (where we welcome any contribution from outside) is, imo, not warranted: we will make the behavior of a function more complex than it needs and it will confuse contributors even more

My concern is that a contributor could introduce a new function with a similar behavior, and will unintentionally break this rule.

Also, this bug is serious enough that it warrants this simple exclusion and fix.
💯

@astefan
Copy link
Contributor Author

astefan commented Sep 19, 2024

@elasticmachine run elasticsearch-ci/part-4

@astefan
Copy link
Contributor Author

astefan commented Sep 19, 2024

@elasticmachine run elasticsearch-ci/bwc-snapshots

@astefan astefan merged commit 2cccf13 into elastic:main Sep 20, 2024
15 checks passed
@elasticsearchmachine
Copy link
Collaborator

💔 Backport failed

Status Branch Result
8.x Commit could not be cherrypicked due to conflicts
8.15 Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 113123

@astefan astefan deleted the 112704_fix branch September 20, 2024 07:22
astefan added a commit to astefan/elasticsearch that referenced this pull request Sep 20, 2024
…3123)

* Skip CASE function from InferIsNotNull rule checks

(cherry picked from commit 2cccf13)
astefan added a commit to astefan/elasticsearch that referenced this pull request Sep 20, 2024
…3123)

* Skip CASE function from InferIsNotNull rule checks

(cherry picked from commit 2cccf13)
elasticsearchmachine pushed a commit that referenced this pull request Sep 20, 2024
…113249)

* Skip CASE function from InferIsNotNull rule checks

(cherry picked from commit 2cccf13)
elasticsearchmachine pushed a commit that referenced this pull request Sep 20, 2024
…113252)

* Skip CASE function from InferIsNotNull rule checks

(cherry picked from commit 2cccf13)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/ES|QL AKA ESQL backport pending >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.15.3 v8.16.0 v9.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ES|QL NULL check in CASE clause breaks later WHERE
4 participants