-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add SQL planner support for Like
, ILike
and SimilarTo
, with optional escape character
#3101
Conversation
Like
a top-level Expr
and add SQL support for ILike
and SimilarTo
Like
a top-level Expr
and add SQL support for ILike
and SimilarTo
Codecov Report
@@ Coverage Diff @@
## master #3101 +/- ##
==========================================
+ Coverage 85.58% 85.64% +0.06%
==========================================
Files 296 296
Lines 54252 54362 +110
==========================================
+ Hits 46432 46561 +129
+ Misses 7820 7801 -19
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
Like
a top-level Expr
and add SQL support for ILike
and SimilarTo
Like
, ILike
and SimilarTo
, with optional escape character
@ayushdg PTAL when you can |
Expr::Like { pattern, .. } | ||
| Expr::ILike { pattern, .. } | ||
| Expr::SimilarTo { pattern, .. } => match pattern.get_type(&self.schema)? { | ||
DataType::Utf8 => Ok(expr.clone()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about LargeUtf8
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is for the regular expression pattern, which is typically a literal string but could, in theory, be a column reference. I'm not sure if it would make sense to support LargeUtf8 here?
} | ||
Ok(Expr::Like { | ||
negated, | ||
expr: Box::new(self.sql_expr_to_logical_expr(*expr, schema, ctes)?), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
}) | ||
} | ||
Expr::Like { pattern, .. } | ||
| Expr::ILike { pattern, .. } | ||
| Expr::SimilarTo { pattern, .. } => match pattern.get_type(&self.schema)? { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like the wrong place to be type checking the argument to Expr::Like
, etc. as the argument types to other exprs are checked so why would we treat Expr::Like
differently?
I would expect that to be done in the SQL planner perhaps? or in the physical_expr conversion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't check the type of the pattern
expression until we have a schema to resolve against (it could be a reference to a column or any other type of expression).
My goal was to do the validation in the logical plan for the benefit of other engines building on DataFusion that don't use the physical plan.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Never mind, we do have the schema in SQL planning ... I will make that change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alamb Ok, this PR just got a whole lot smaller. Maybe this really shouldn't have taken me 30 days to do 🤣
This is pretty neat, by the way 👍 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great to me 😍
@alamb Ok, this PR just got a whole lot smaller. Maybe this really shouldn't have taken me 30 days to do 🤣
Well, I think it had a bunch of dependencies that are all now sorted out, so the final PR "makes it look easy" even though a bunch of effort was needed
} | ||
let pattern = self.sql_expr_to_logical_expr(*pattern, schema, ctes)?; | ||
let pattern_type = pattern.get_type(schema)?; | ||
if pattern_type != DataType::Utf8 && pattern_type != DataType::Null { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Benchmark runs are scheduled for baseline = 73447b5 and contender = c96f03e. c96f03e is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
…ional escape character (apache#3101) * Make Like a top-level Expr * revert some changes * add type validation * Revert physical plan changes and reduce scope of the PR * Revert more changes * Revert more changes * clippy * address feedback * revert change to test * revert more changes
…ional escape character (apache#3101) * Make Like a top-level Expr * revert some changes * add type validation * Revert physical plan changes and reduce scope of the PR * Revert more changes * Revert more changes * clippy * address feedback * revert change to test * revert more changes
Which issue does this PR close?
Closes #3099
Rationale for this change
The Dask SQL project would like to support queries using
LIKE
,ILIKE
,SIMILAR TO
syntax in Postgres, with an optional escape character specified, so we want to add this support to the SQL query planner and the logical plan.What changes are included in this PR?
We originally modeled
LIKE
andNOT LIKE
as binary operations but this cannot support the optional escape character. For that reason, this PR adds new top-level expressions, but the old ones remain for backwards compatibility.There is no support in the physical planner yet for supporting the optional escape character.
Are there any user-facing changes?
Yes. This is an API change.