
Named parameter support and update Thrift #155

Merged: 2 commits, Aug 29, 2023

Conversation


@rcypher-databricks commented Aug 16, 2023

Updated thrift client code to latest version.
Added support for named query parameters.
Not yet complete: the correct SQL type needs to be set in the TSparkParameter struct, and support for date/time and interval types still needs to be added.

Updated thrift client code to latest version.
Added support for named query parameters.

Signed-off-by: Raymond Cypher <raymond.cypher@databricks.com>

@yunbodeng-db left a comment


To be consistent with ODBC/JDBC (which actually will not have named-param support): do we have positional-param support on the roadmap?
I understand the server-side support is a work in progress, but I think the client-side support should be very similar for the two. We could introduce a switch to turn positional params on once the server work is done.

Another question: do we check server versions? I should find out which server versions support named and positional params. Someone on BIX-internal must know.

@mattdeekay

> To be consistent with odbc/jdbc (which actually will not have named param support), do we have positional param support on the roadmap? I understand the server support is work in progress but i think the client side support should be very similar between two. We can introduce a switch to turn positional param on when the server is done.
>
> Another question is: do we check server versions? I should find out the versions that named param and positional params are supported. Someone on BIX-internal must know.

We do have client-side positional-param support with this change. Go's database/sql interface lets users pass positional OR named parameters like this:
c.ExecContext(ctx, query, args ...any)

In order for customers to use named parameters, they might do this:
c.ExecContext(ctx, "SELECT :param_a, :param_b FROM sample", sql.Named("param_a", "test_value_a"), sql.Named("param_b", "test_value_b"))

But the driver will receive []driver.NamedValue for both positional and named. It's up to the driver to send the correct TSparkParameter to the thrift server.

@@ -337,6 +334,83 @@ func (c *conn) executeStatement(ctx context.Context, query string, args []driver
return resp, err
}

func namedValuesToTSparkParams(args []driver.NamedValue) []*cli_service.TSparkParameter {
var ts []string = []string{"STRING", "DOUBLE", "BOOLEAN"}
Contributor


Should all number values be "DOUBLE"? What about INT, or DECIMAL?

Contributor Author


At the moment TSparkParameterValue only supports bool, double, and string values. I'm waiting for clarification on the question of losing precision when integers are converted to double, and on whether TSparkParameter.Type should correspond to the type in TSparkParameterValue or whether it is used for converting the value.

Contributor


From the code I saw in DBR, TSparkParameter.Type should correspond to the data types listed here. In DBR there is an attempt to cast the value to the correct type, and floats can even be represented as strings, so we could do this (pseudocode):

TSparkParameter{Type: "FLOAT", Value: TSparkParameterValue{stringValue: "1234.125"}, ...}

or

TSparkParameter{Type: "DECIMAL(6, 2)", Value: TSparkParameterValue{stringValue: "1234.12"}, ...}

or

TSparkParameter{Type: "BOOLEAN", Value: TSparkParameterValue{stringValue: "false"}, ...}

I think "ARRAY", "MAP", "STRUCT", and "BINARY" are not supported here, however.

@@ -194,7 +194,7 @@ func WithDefaults() *Config {
DriverName: "godatabrickssqlconnector", // important. Do not change
ThriftProtocol: "binary",
ThriftTransport: "http",
-	ThriftProtocolVersion: cli_service.TProtocolVersion_SPARK_CLI_SERVICE_PROTOCOL_V6,
+	ThriftProtocolVersion: cli_service.TProtocolVersion_SPARK_CLI_SERVICE_PROTOCOL_V7,
Contributor


I think I was able to get named parameters working against e2-dogfood SQL without updating ThriftProtocolVersion. Do we know what changing this would achieve?
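If the driver ever needs to gate parameter support on the server rather than on a client-side default (which the version-check question earlier in this thread also raises), one hedged sketch is to compare against the negotiated protocol version. The constants below are stand-ins for the Thrift-generated `cli_service` values, and gating on V7 is an assumption, not confirmed behavior (the comment above notes named params worked without the bump).

```go
package main

import "fmt"

// Stand-ins for cli_service.TProtocolVersion_SPARK_CLI_SERVICE_PROTOCOL_V6/V7;
// the real constants come from the Thrift-generated code.
const (
	protocolV6 = 6
	protocolV7 = 7
)

// supportsNamedParameters gates the feature on the protocol version
// the server negotiates in its open-session response, instead of
// relying on the client-side default alone. V7 as the threshold is an
// assumption for illustration.
func supportsNamedParameters(serverProtocolVersion int) bool {
	return serverProtocolVersion >= protocolV7
}

func main() {
	fmt.Println(supportsNamedParameters(protocolV6)) // false
	fmt.Println(supportsNamedParameters(protocolV7)) // true
}
```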

@rcypher-databricks marked this pull request as ready for review August 28, 2023 22:04
Signed-off-by: Raymond Cypher <raymond.cypher@databricks.com>

@andrefurlan-db left a comment


looks good. Building blocks :-)

@rcypher-databricks merged commit d1e4dc5 into databricks:main Aug 29, 2023
3 checks passed