Destination databricks: switch to oss jdbc driver #44033
@@ -44,18 +44,23 @@ object DatabricksConnectorClientsFactory {
    // EnableArrow=0 flag is undocumented and disables ArrowBuf when reading data
    // Destinations only reads data for metadata or for comparison of actual data in tests. so
    // we don't need it to be optimized.
    val jdbcUrl =
        "jdbc:databricks://${config.hostname}:${config.port}/${config.database};transportMode=http;httpPath=${config.httpPath};EnableArrow=0"
@gisripa do you remember what the `/${config.database}` thing is supposed to do? It sounds like we're passing in the database (i.e. unity catalog name?), but the oss driver's docs sound like this is actually supposed to be the schema, which sounds super weird. I.e. their url format is `jdbc:databricks://<server-hostname>:<port>/<schema>;`
(I'm seeing some weird test failures with `ConcurrentModificationException` and am very confused about how that could be happening, no idea if this is related)
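To make the question concrete, here is a minimal sketch of the URL shape being discussed. `UrlParts`, `buildJdbcUrl` and `pathSegmentOf` are hypothetical names used only for illustration; the sketch does not call the driver and makes no claim about whether the driver treats the path segment as a catalog or a schema. It only shows which part of the string is in question.

```kotlin
// Hypothetical holder for the config fields referenced in the diff.
data class UrlParts(
    val hostname: String,
    val port: Int,
    // The connector passes `config.database` here; the quoted docs call this slot <schema>.
    val pathSegment: String,
    val httpPath: String,
)

// Builds a URL of the form jdbc:databricks://<host>:<port>/<segment>;key=value;...
fun buildJdbcUrl(parts: UrlParts, extras: Map<String, String> = emptyMap()): String {
    val props = linkedMapOf("httpPath" to parts.httpPath) + extras
    val suffix = props.entries.joinToString(";") { "${it.key}=${it.value}" }
    return "jdbc:databricks://${parts.hostname}:${parts.port}/${parts.pathSegment};$suffix"
}

// Extracts whatever sits between the first "/" after the authority and the first ";".
fun pathSegmentOf(jdbcUrl: String): String =
    jdbcUrl.substringAfter("://").substringAfter("/").substringBefore(";")
```

Printing `pathSegmentOf(...)` for the URL the connector currently builds would show exactly what value ends up in the slot the docs label `<schema>`.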
closes https://github.com/airbytehq/airbyte-internal-issues/issues/9120
also it seems like we don't need the databricks sdk at all?
The new driver has a slightly different interface (you can't directly supply a URL, it forces you to supply individual fields/properties). I tried to port over our existing stuff, but removed the `transportMode=http` and `EnableArrow=0` things to see if they're still needed.

Databricks documentation doesn't even describe how to do oauth, it only says how to do PAT (https://docs.gcp.databricks.com/en/integrations/jdbc/oss.html#authenticate-the-driver). I copied our old stuff to the new interfaces naively, but it doesn't work:
- `DatabricksSQLException: Communication link failure. Failed to connect to server. :https://dbc-6aebf761-f8d6.cloud.databricks.com:443accessToken must be defined`
- `DatabricksSQLException: Communication link failure. Failed to connect to server. :https://dbc-6aebf761-f8d6.cloud.databricks.com:443Cannot invoke "com.databricks.sdk.core.oauth.OpenIDConnectEndpoints.getTokenEndpoint()" because "jsonResponse" is null`

notable changes in the oss driver:
- `.000` precision
- `Inline byte limit exceeded. Statements executed with disposition=INLINE can have a result size of at most 26214400 bytes. Please execute the statement with disposition=EXTERNAL_LINKS if you want to download the full result` which means:
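
As a rough sketch of the interface change described earlier in this description (individual properties instead of a single URL), the `;key=value` tail of the old URL could be turned into a `java.util.Properties` object while skipping the two flags that were dropped. `urlTailToProperties` and `droppedKeys` are hypothetical names, and whether the oss driver accepts these exact property names is an assumption, not something confirmed here.

```kotlin
import java.util.Properties

// The two flags removed from the old URL to see whether they are still needed.
val droppedKeys = setOf("transportMode", "EnableArrow")

// Hypothetical helper: copies each key=value pair after the first ";" into
// Properties, ignoring any key listed in `skip`.
fun urlTailToProperties(oldUrl: String, skip: Set<String> = droppedKeys): Properties {
    val props = Properties()
    oldUrl.substringAfter(";", "")
        .split(";")
        .filter { it.contains("=") }
        .forEach { pair ->
            val key = pair.substringBefore("=")
            val value = pair.substringAfter("=")
            if (key !in skip) props.setProperty(key, value)
        }
    return props
}
```

Applied to the old URL from the diff, this would leave only `httpPath` in the resulting properties; hostname, port, and credentials would still need to be supplied separately.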