Feature/ssl support dbtspark #169
@rahulgoyal2987 Thanks for the PR!
I'm not opposed to the changes you're proposing here; my main concern is with the new dependency (https://github.com/devinstevenson/pure-transport). I see it has a small number of stars/users/contributors. Is this a package you've relied on for other projects?
FWIW it does seem tailor-made to our use case, given the reliance on PyHive for thrift + http connections.
requirements.txt
Outdated
@@ -3,3 +3,4 @@ PyHive[hive]>=0.6.0,<0.7.0
pyodbc>=4.0.30
sqlparams>=3.0.0
thrift>=0.11.0,<0.12.0
pure-transport>=0.2.0
Could we put an upper bound on the version here, since it's version-0 software?
You'll need to add this to the extra setup requirements as well:
https://github.com/fishtown-analytics/dbt-spark/blob/dff1b613ddf87e4e72e8a47475bcfd1d55796a5c/setup.py#L41-L44
I'm inclined to bundle it in with the larger PyHive extra, unless there's a good reason to bundle it separately.
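A minimal sketch of what that request could look like in setup.py, written as plain Python data so it can be read on its own. The extra name "PyHive" and the upper bound <0.3.0 are assumptions for illustration; the PyHive and thrift pins are copied from the requirements.txt hunk above.

```python
# Hypothetical excerpt of the extras passed to setuptools.setup(extras_require=...).
# The "PyHive" key and the "<0.3.0" cap are assumptions, not taken from the PR.
pyhive_extras = [
    "PyHive[hive]>=0.6.0,<0.7.0",
    "thrift>=0.11.0,<0.12.0",
    # Version-0 software: cap at the next minor release to avoid surprise breakage.
    "pure-transport>=0.2.0,<0.3.0",
]

extras_require = {
    "PyHive": pyhive_extras,
}
```

With a layout like this, installing the package with the PyHive extra would pull in the capped dependency alongside PyHive and thrift.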
@jtcohen6 I have added the logic to build the transport object rather than using pure-transport.
@jtcohen6 The main use of the pure-transport dependency (https://github.com/devinstevenson/pure-transport) was its support for TSSLSocket, which PyHive does not provide. I have added logic to build the transport object directly and removed the pure-transport dependency.
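For readers following along, here is a rough sketch of building such a transport with thrift alone. It is not the code from this PR: the function name, parameters, and defaults are hypothetical, and it covers only a plain buffered transport with no SASL wrapping.

```python
import ssl

try:
    from thrift.transport.TSSLSocket import TSSLSocket
    from thrift.transport.TTransport import TBufferedTransport
except ImportError:
    # thrift is optional; callers get a readable error below.
    TSSLSocket = None
    TBufferedTransport = None


def build_ssl_transport(host, port=10000, cert_reqs=ssl.CERT_REQUIRED, ca_certs=None):
    """Hypothetical helper: wrap a TLS-enabled thrift socket in a buffered transport.

    The default port and the certificate settings are assumptions for illustration.
    """
    if TSSLSocket is None:
        raise RuntimeError("thrift is required to build an SSL transport")
    # TSSLSocket reads its TLS options (cert_reqs, ca_certs, ...) from keyword arguments.
    socket = TSSLSocket(host, port, cert_reqs=cert_reqs, ca_certs=ca_certs)
    return TBufferedTransport(socket)
```

The resulting object could then, I believe, be handed to PyHive through the thrift_transport argument of hive.connect, which is the hook that makes an extra package unnecessary.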
@rahulgoyal2987 This is really neat! Have you managed to test this with your own projects, and confirm that it works as expected?
@jtcohen6 I tested with my own project and setup and it works as expected.
@rahulgoyal2987 Thanks for all the work here! This isn't functionality I'm able to test directly, so I appreciate the unit test, and your confirmation that this works for your use case.
My only comments are around package imports: making sure the dependencies are appropriately installed alongside PyHive extras, or handled if missing.
Could you also:
- Add use_ssl to the README as a supported connection parameter
- Add a Changelog entry (under dbt next), and add yourself to the list of contributors
dbt/adapters/spark/connections.py
Outdated
from thrift.transport.TSSLSocket import TSSLSocket
import thrift
import ssl
Let's include these up in the try import (line 11) for requirements that may or may not be installed.
ssl is a net-new dependency, right? We'll need to add it in setup.py, within the PyHive extra.
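The try-import pattern being asked for can be sketched roughly as follows. This is an illustration, not the contents of connections.py: the helper name and the error message are made up, and only the three imports from the hunk above are taken from the PR.

```python
import ssl  # standard library, so it does not need a guard

try:
    import thrift
    from thrift.transport.TSSLSocket import TSSLSocket
except ImportError:
    # Optional dependencies: leave sentinels so the module still imports.
    thrift = None
    TSSLSocket = None


def ensure_thrift_available():
    """Hypothetical check, called before opening a thrift connection."""
    if thrift is None or TSSLSocket is None:
        raise RuntimeError(
            "thrift is not installed; install the PyHive extra to use this connection method"
        )
```

Guarding at the top of the module keeps all optional imports in one place, and users without the extra only see an error when they actually select a thrift connection.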
ssl is part of the Python standard library (https://docs.python.org/3/library/ssl.html), so I don't think it needs to be added.
Sorry, I said ssl here but I was thinking of thrift_sasl. That's a net-new dependency, right? Just kidding, I follow now, those are dependencies of PyHive[hive]. Thanks for the clarification around ssl.
dbt/adapters/spark/connections.py
Outdated
# Defer import so package dependency is optional
import sasl
import thrift_sasl
We have a different way of handling these optional imports. See the try logic at the top of the file.
Done
Thanks! Could you move these up alongside the other imports (lines 29-32)? I'd prefer not to have imports nested so far down in the file.
Done
…987/dbt-spark into feature/ssl-support-dbtspark
LGTM! Thanks for the contribution @rahulgoyal2987
resolves #
Description
Checklist
CHANGELOG.md
and added information about my change to the "dbt next" section.