Hive Ingestion broken over HTTP #8405

Closed
travis-cook-sfdc opened this issue Jul 12, 2023 · 1 comment · Fixed by #8570
Labels
bug Bug report

Comments

@travis-cook-sfdc

Describe the bug

Hive ingestion fails when the scheme is hive+http or hive+https unless you explicitly downgrade Thrift to 0.13.0.

Issues aren't enabled on the acryldata PyHive fork, so I couldn't raise this there: https://github.com/acryldata/PyHive

The fix should be to disallow the buggy versions in setup.py: https://github.com/acryldata/PyHive/blob/master/setup.py#L62

Here's the related pyhive issue (but that project is pretty dead, so maybe it could get fixed here?)
dropbox/PyHive#417

To Reproduce
Steps to reproduce the behavior:

  1. Create a hive.dhub.yml file like this:
source:
    type: hive
    host_port: localhost:10000
    scheme: hive+http
  2. Run datahub ingest run -c hive.dhub.yml
  3. See the error:
File "/usr/local/lib/python3.10/site-packages/TCLIService/TCLIService.py", line 186, in OpenSession
    self.send_OpenSession(req)
File "/usr/local/lib/python3.10/site-packages/TCLIService/TCLIService.py", line 195, in send_OpenSession
    self._oprot.trans.flush()
File "/usr/local/lib/python3.10/site-packages/pyhive/hive.py", line 81, in flush
    super(TCookieHttpClient, self).flush()
File "/usr/local/lib/python3.10/site-packages/thrift/transport/THttpClient.py", line 191, in flush
    self.__http.putheader('Cookie', self.headers['Set-Cookie'])
File "/usr/local/lib/python3.10/http/client.py", line 1244, in putheader
    raise CannotSendHeader()
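
For context on the last frame of the traceback: the standard library's http.client raises CannotSendHeader when putheader is called on a connection that is not in the "request started" state. The sketch below reproduces only that standard-library behavior. It does not use thrift or pyhive, and it makes no claim about why THttpClient reaches this state in the affected versions. The port, the `/cliservice` path, and the cookie value are placeholders chosen for illustration.

```python
import http.client


def putheader_without_request():
    """Call putheader on an idle connection; http.client rejects it."""
    # Constructing the connection does not open a socket.
    conn = http.client.HTTPConnection("localhost", 10000)
    try:
        conn.putheader("Cookie", "session=example")  # placeholder cookie
    except http.client.CannotSendHeader:
        return True
    return False


def putheader_after_putrequest():
    """Call putrequest first, so the header is accepted and buffered."""
    conn = http.client.HTTPConnection("localhost", 10000)
    # "/cliservice" is a hypothetical path used only for this sketch.
    conn.putrequest("POST", "/cliservice")
    conn.putheader("Cookie", "session=example")
    # Nothing is sent: endheaders() is never called, so no socket is opened.
    return True
```

Neither function needs a running HiveServer2, because headers are only buffered until endheaders is called.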

Expected behavior
Hive ingestion should work over hive+http and hive+https.

Workaround
You can get around this behavior by explicitly downgrading thrift to 0.13.0.
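
A minimal sketch for checking whether the environment already matches the workaround, using only the standard library. The helper names are hypothetical. The only version the report confirms as working is 0.13.0, so the check treats any other version as "may be affected" rather than claiming a specific range of buggy releases.

```python
import re
from importlib import metadata

# The only version this report confirms as working over HTTP.
KNOWN_GOOD_THRIFT = (0, 13, 0)


def parse_version(text):
    """Turn a version string such as '0.13.0' into a tuple of ints."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def thrift_matches_workaround(installed):
    """Return True if the given version string is exactly 0.13.0."""
    return parse_version(installed) == KNOWN_GOOD_THRIFT


def check_installed_thrift():
    """Describe the installed thrift version relative to the workaround."""
    try:
        installed = metadata.version("thrift")
    except metadata.PackageNotFoundError:
        return "thrift is not installed"
    if thrift_matches_workaround(installed):
        return f"thrift {installed} matches the workaround version"
    return f"thrift {installed} differs from 0.13.0; hive+http may fail as reported"
```

Running `print(check_installed_thrift())` in the same environment as the datahub CLI shows which case applies.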


Desktop:

  • OS: MacOSX
  • Browser: N/A
  • Version: 0.10.5.1

Additional context
This was raised a few months ago: https://datahubspace.slack.com/archives/C029A3M079U/p1669024089697259?thread_ts=1668698117.723649&cid=C029A3M079U

@hsheth2
Collaborator

hsheth2 commented Aug 3, 2023

@travis-cook-sfdc thanks for the report. I have a PR up that will pin the thrift version.
