-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql support for dynamic parameters #6974
Conversation
One of the primary motivators was to provide a mechanism to supply very large literals (serialized bloom filters) to a Druid query without choking the tokenizer with a 10MB+ string literal, as the timings of this test show it's a nice improvement: I meant to mark the slow test as ignore though so it doesn't get run, so will update PR. |
I found my way here via CALCITE-2873 and am receiving the same AvaticaSqlException when attempting to prepare statements against my Druid cluster. Two questions: 1) How broad is the scope of fix in this PR? (i.e., I'm wondering if my use-case will likely be covered by the fix) and 2) Is there a best-guess timeline for when it will be merged into apache:master? Many thanks! |
@peterlittig, I expect this will probably be merged to master within the next couple of weeks, barring any major issues uncovered during review, and should be included in release 0.15 (which is quickly sneaking up on us even though 0.14 isn't quite finalized yet). I'm not sure on your use case, but I hope so? I've attempted this PR to add general purpose support for dynamic parameters to Druid sql queries through both the JSON API and JDBC interfaces. I've added a handful of test cases for a variety of expressions to the new in this PR CalciteParameterQueryTest, have a look to see if the queries you're trying to do look similar to any of those, and if not I'd be happy to add a test case to make sure it works correctly. One thing to note, sometimes dynamic parameters can trip up the type inference, which will result in a query failing to validate before it can plan. As far as I can tell, this is largely on the calcite side, as queries will confuse the
In this example, the first |
Thanks for the response, @clintropolis! The use-cases I currently have in mind are fairly simple/straightforward. I'll definitely take a look at the test class you linked and see I'm covered. Thanks also for the timeline estimate. It would be great to see this go out in 0.15! |
@glasser I have modified the parameters in the http call to no longer have the ordinal property, in favor of the more user friendly approach you suggested of using list order. I have updated the PR description (and docs) to reflect this. |
} | ||
|
||
@Test | ||
public void testLongs() throws Exception |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this test will fail CI in sql compatible mode until #9251 is merged
docs/querying/sql.md
Outdated
@@ -607,7 +641,7 @@ Properties connectionProperties = new Properties(); | |||
try (Connection connection = DriverManager.getConnection(url, connectionProperties)) { | |||
try ( | |||
final Statement statement = connection.createStatement(); | |||
final ResultSet resultSet = statement.executeQuery(query) | |||
final ResultSet resultSet = statement.executeQuery(query); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the original example is better as the semicolon is not needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
@@ -273,10 +275,67 @@ public void testBloomFilters() throws Exception | |||
); | |||
} | |||
|
|||
@Test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per #6974 (comment), it'd be good to ignore this slow test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Marked both as ignored
this.maxRowCount = maxRowCount; | ||
this.query = query; | ||
ArrayList<AvaticaParameter> params = new ArrayList<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: Declare as List
instead of ArrayList
to separate behavior from implementation
return Long.class; | ||
case REAL: | ||
return Float.class; | ||
case FLOAT: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a comment for why this is not Float.class
would be helpful
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added link to docs I used as reference for mappings from JDBC to Java types
} | ||
|
||
@Override | ||
public boolean equals(Object o) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest adding an EqualsVerifier
unit test for this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added
@@ -85,13 +110,14 @@ public boolean equals(final Object o) | |||
return header == sqlQuery.header && | |||
Objects.equals(query, sqlQuery.query) && | |||
resultFormat == sqlQuery.resultFormat && | |||
Objects.equals(context, sqlQuery.context); | |||
Objects.equals(context, sqlQuery.context) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest adding an EqualsVerifier
unit test for this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added
*/ | ||
public class CalciteParameterQueryTest extends BaseCalciteQueryTest | ||
{ | ||
private final boolean useDefault = NullHandling.replaceWithDefault(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be beneficial to test binding a null?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, thanks for asking as the current implementation wasn't accepting null valued parameters because a year ago when I made this branch we didn't really publicly acknowledge that null could exist, and it had a precondition check that value was not in fact null. I've adjusted this to accept null parameters.
However, since there is no null equivalence in sql, it doesn't really seem to be all that useful, and in all cases I could think of you need a different expression to be able to explicitly use a null parameter, making using a parameter to pass null basically pointless afaict unless I'm missing some obvious case.
Additionally, calcite seems to fail to parse stuff like:
SELECT COUNT(*)
FROM table
WHERE column IS ?
producing an error of the form:
org.apache.calcite.sql.parser.SqlParseException: Encountered "IS ?" at line 3, column 10.1
I did add a couple of contrived examples, one using coalesce that optimizes out the parameter if it is null since coalesce returns first non-null, and a bloom filter test which is optimized out because it ends up being a constant, just to test the path of a null parameter getting fed into things.
@@ -275,6 +276,7 @@ public void testBloomFilters() throws Exception | |||
); | |||
} | |||
|
|||
@Ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add an explanation of why it's ignored (the annotation takes a value, which can be used for the comment).
@@ -302,6 +304,7 @@ public void testBloomFilterBigNoParam() throws Exception | |||
); | |||
} | |||
|
|||
@Ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment about adding explanation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Two suggestions and a typo.
docs/querying/sql.md
Outdated
@@ -630,6 +663,17 @@ the necessary stickiness even with a normal non-sticky load balancer. Please see | |||
|
|||
Note that the non-JDBC [JSON over HTTP](#json-over-http) API is stateless and does not require stickiness. | |||
|
|||
### Dynamic Parameters | |||
|
|||
Parameterized queries are supported with JDBC: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I misread this at first as meaning JDBC enables parameterized queries somehow. Is this wording clearer:
"You can use parameterized queries in JDBC code, as in this example;"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed
docs/querying/sql.md
Outdated
|
||
|Property|Type|Description|Required| | ||
|--------|----|-----------|--------| | ||
|`type`|`String` (`SqlType`) | String value of `SqlType` of parameter. [`SqlType`](https://calcite.apache.org/avatica/apidocs/org/apache/calcite/avatica/SqlType.html) is an friendly wrapper around [`java.sql.Types`](https://docs.oracle.com/javase/8/docs/api/java/sql/Types.html?is-external=true)|yes| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: "...is a friendly wrapper..."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
docs/querying/sql.md
Outdated
@@ -55,6 +55,9 @@ like `100` (denoting an integer), `100.0` (denoting a floating point value), or | |||
timestamps can be written like `TIMESTAMP '2000-01-01 00:00:00'`. Literal intervals, used for time arithmetic, can be | |||
written like `INTERVAL '1' HOUR`, `INTERVAL '1 02:03' DAY TO MINUTE`, `INTERVAL '1-2' YEAR TO MONTH`, and so on. | |||
|
|||
Druid SQL supports dynamic parameters using the `?` syntax where parameters are bound to `?` in order. Replace any | |||
literal with a `?` and supply parameters to the query and the values will be bound at execution time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: This description feels like it could be expanded upon just a bit, just to draw out what is meant by "in order," for one thing. How about something like:
"Druid SQL supports dynamic parameters in question mark (?
) syntax, where parameters are bound to the ?
placeholders at execution time. To use dynamic parameters, replace any literal in the query with a ?
character and ensure that corresponding parameter values are provided at execution time. Parameters are bound to the placeholders in the order in which they are passed."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sgtm, changed
this.parameters = parameters; | ||
} | ||
|
||
public PrepareResult prepare(AuthenticationResult authenticationResult) throws ValidationException, RelConversionException, SqlParseException |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: please break the line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
private AvaticaParameter createParameter(RelDataTypeField field, RelDataType type) | ||
{ | ||
return new AvaticaParameter( | ||
false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you add a comment on why this is always false?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added comment
@@ -142,6 +150,11 @@ public DruidStatement( | |||
return columns; | |||
} | |||
|
|||
public void setParameters(List<TypedValue> parameters) | |||
{ | |||
this.parameters = parameters; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that this needs to lock lock
.. Maybe parameters
can be passed to execute()
directly without having this method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch, reworked to make execute
take a list of parameters to avoid messing with the lock
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Thanks for review all 👍 |
This PR adds support for parameterized SQL queries of the form:
Over the HTTP API :
JDBC:
Design
Dynamic parameter binding for Druid queries is performed in a "query optimized" (rather than "planner optimized") manner which substitutes the dynamic parameters (
SqlDynamicParam
andRexDynamicParam
) nodes of the expression with the literals of the value that they are bound to as early as we can. This lets us maximize simplification of the query expression as much as possible prior to execution, which is likely always going to be better for query performance. For JDBC, this pushes the actual planning into the 'execute' call, so re-using a prepared statement with different parameters will result in re-planning the query, which is what I meant by the earlier assertion that this approach is "query optimized" instead of "planner optimized". Since planning is expected to be cheap compared to the actual query, this seems like the correct approach to take here, especially since these unoptimized queries could result in either complex or un-necessary virtual column expressions at execution time.This is done through two layers of expression tree transformation, the first
SqlParametizerShuttle
transforming theSqlNode
expression, and the secondRelParameterizerShuttle
transforming theRelNode
expression after theSqlNode
has been parsed and validated and translated into a relational form. Through experimentation I have got the first transformation to catch all of the test cases I have so far, but the ability to create aSqlLiteral
from arbitrary JDBC types is less rich than the facilities to create aRexLiteral
, so I still have the second transformation in place, just in case. Some additional details are available in a Calcite dev-list thread about how to approach this problem.