-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CALCITE-6804 Anti-join with WHERE NOT EXISTS syntax has corrupted condition #4177
base: main
Are you sure you want to change the base?
Conversation
@@ -456,7 +458,10 @@ public Result visit(Filter e) { | |||
builder.context.toSql(null, e.getCondition()))); | |||
return builder.result(); | |||
} else { | |||
final Result x = visitInput(e, 0, Clause.WHERE); | |||
Result x = visitInput(e, 0, Clause.WHERE); | |||
if (e.getCondition().getKind() == NOT || e.getCondition().getKind() == EXISTS) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are these the only conditions that require a reset alias?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect this doesn't cover all cases. e.g.
e2."product_id" = selected."product_id" and e2."product_id" > 10
How about resetting on parseCorrelTable?
+++ b/core/src/main/java/org/apache/calcite/rel/rel2sql/RelToSqlConverter.java
@@ -1406,7 +1406,7 @@ public List<SqlNode> createAsFullOperands(RelDataType rowType, SqlNode leftOpera
private void parseCorrelTable(RelNode relNode, Result x) {
for (CorrelationId id : relNode.getVariablesSet()) {
- correlTableMap.put(id, x.qualifiedContext());
+ correlTableMap.put(id, x.resetAlias().qualifiedContext());
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are these the only conditions that require a reset alias?
My take here is that typically predicates can't access the input via aliases. At least, for many SQL engines user ought to move the input into CTE to refer to it in filtering conditions. And JOINs with EXISTS
/NOT EXISTS
syntax are exceptions in that regard.
I thought about explicit function with name like canAccessInputViaAlias
that processes condition. But it seems too nit-pick'y given the fact that I am sure only about two greenlighting keywords
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@suibianwanwank
Can you please elaborate on what you suspected?
I added the predicate you mentioned:
@Test void testAntiJoinWithComplexInput() {
final String sql = "SELECT * FROM "
+ "(select * from ("
+ "select e1.\"product_id\" FROM \"foodmart\".\"product\" e1 "
+ "LEFT JOIN \"foodmart\".\"product\" e3 "
+ "on e1.\"product_id\" = e3.\"product_id\""
+ ")"
+ ") selected where not exists\n"
+ "(select 1 from \"foodmart\".\"product\" e2 "
+ "where e2.\"product_id\" = selected.\"product_id\" and e2.\"product_id\" > 10)";
final String expected =
"SELECT *\nFROM (SELECT \"product\".\"product_id\"\nFROM \"foodmart\".\"product\"\n"
+ "LEFT JOIN \"foodmart\".\"product\" AS \"product0\" "
+ "ON \"product\".\"product_id\" = \"product0\".\"product_id\") AS \"t\"\n"
+ "WHERE NOT EXISTS ("
+ "SELECT *\nFROM \"foodmart\".\"product\"\n"
+ "WHERE \"product_id\" = \"t\".\"product_id\" AND \"product_id\" > 10"
+ ")";
sql(sql).ok(expected);
}
Without escaping mess:
SELECT * FROM
foodmart.product
WHERE
product_id = t.product_id AND product_id > 10
Seems good to me.
How about resetting on parseCorrelTable?
Feels unsafe to me. Particularly due to fact that such aliases are not legitimate in all cases(why we should represent and maintain state that isn't reflecting possible query).
I can try your suggestion to see what tests think about it :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies, you are right. I misunderstood the meaning here.
Please use a commit message and a PR title that match exactly the issue in JIRA. |
4683dc6
to
18b8feb
Compare
18b8feb
to
ebc7904
Compare
@antonkw Maybe I’m overthinking it, but I still believe that only considering NOT and EXISTS doesn't cover all possible cases for subqueries. Could you try the example with
and see if it works? |
ebc7904
to
52e1fa8
Compare
here we go again :) Added @Test void testFilterWithSubQuery() {
final String sql = "SELECT * FROM "
+ "(select * from ("
+ "select e1.\"product_id\" FROM \"foodmart\".\"product\" e1 "
+ "LEFT JOIN \"foodmart\".\"product\" e3 "
+ "on e1.\"product_id\" = e3.\"product_id\""
+ ")"
+ ") selected where 1 in\n"
+ "(select \"gross_weight\" from \"foodmart\".\"product\" e2 "
+ "where e2.\"product_id\" = selected.\"product_id\" and e2.\"product_id\" > 10)";
final String expected =
"SELECT *\nFROM (SELECT \"product\".\"product_id\"\nFROM \"foodmart\".\"product\"\n"
+ "LEFT JOIN \"foodmart\".\"product\" AS \"product0\" "
+ "ON \"product\".\"product_id\" = \"product0\".\"product_id\") AS \"t\"\n"
+ "WHERE CAST(1 AS DOUBLE) IN ("
+ "SELECT \"gross_weight\"\nFROM \"foodmart\".\"product\"\n"
+ "WHERE \"product_id\" = \"t\".\"product_id\" AND \"product_id\" > 10)";
sql(sql).ok(expected);
} Waiting for CI to see if everything |
52e1fa8
to
418fb83
Compare
|
@suibianwanwank |
I don't expect that it can be merged as is.
In the first order, I want to demonstrate the issue with anti joins: https://issues.apache.org/jira/browse/CALCITE-6804
For any anti-join with input (left-side) more complex than single table scan, rex sub query can't be converted to SQL.
Reason: Filter's node input is not considered as something that can be referenced (via alias) in condition.
But anti-join is that exact case.
Example of logical plan:
$cor0
should eventually refer to whole input aliased ast
.Without proposed change
$cor0
is being converted intoJdbcTableScan(table=[[foodmart, product]])