This repository has been archived by the owner on Jul 25, 2022. It is now read-only.

Add support for Ballista #37

Closed
andygrove opened this issue Mar 10, 2022 · 8 comments

@andygrove
Contributor

I would like to be able to execute queries against Ballista from Python.

I think this is just a case of adding a new PyBallistaContext class.

This should be an optional feature, disabled by default.

@andygrove self-assigned this Mar 10, 2022
@andygrove
Contributor Author

@jimexist Does it make sense to add Ballista support here, or should we have a separate ballista-python repo that somehow reuses parts of datafusion-python?

@matthewmturner
Contributor

In my mind it makes sense to create a separate ballista-python, as I view datafusion-python as its own standalone system, just as Ballista is. However, I acknowledge there may be significant overlap. That said, there has been a lot of work on Ballista lately, which will likely continue, so this could be a good time to decouple them for the purpose of Python bindings.

@andygrove
Contributor Author

Thanks for the input ... I will close this issue, start a new repo, and copy and paste much of the code from this repo for now.

@nl5887
Contributor

nl5887 commented Jun 4, 2022

@andygrove this is my take on the ballista-python crate. Any suggestions? https://github.com/nl5887/ballista-python

@andygrove
Contributor Author

Hi @nl5887, thanks for working on this! I think you could PR this directly into the arrow-ballista repo under a top-level Python folder. I would love to try this out, but I am not a Python expert, so I may need some help. I followed the instructions in "How to develop" and it looks like everything installed, but I am not sure how to import the project in the Python REPL.

$ python
Python 3.9.5 (default, Jun  4 2021, 12:28:51) 
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> import ballista
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'ballista'

>>> import datafusion
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/andy/git/personal/ballista-python/datafusion/__init__.py", line 29, in <module>
    from ._internal import (
ModuleNotFoundError: No module named 'datafusion._internal'
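
The second traceback shows `datafusion` being picked up from the source checkout while its compiled submodule `datafusion._internal` is missing, which usually suggests the native extension has not been built or installed into the active environment. As a rough diagnostic sketch (the helper name `describe_module` is made up for illustration, and only the standard library is used), one could check where each module would be loaded from:

```python
import importlib.util


def describe_module(name):
    """Report whether `name` can be located, and where it would load from."""
    try:
        # For a dotted name, find_spec imports the parent package first,
        # so a broken or missing parent surfaces here as an ImportError.
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        spec = None
    if spec is None:
        return f"{name}: not found on sys.path"
    return f"{name}: found at {spec.origin}"


# Module names taken from the session above.
for name in ("ballista", "datafusion", "datafusion._internal"):
    print(describe_module(name))
```

If `datafusion` resolves to a path inside the working directory rather than the environment's `site-packages`, the import is most likely hitting the source tree instead of an installed build.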

@nl5887
Contributor

nl5887 commented Jun 4, 2022

It could be that I still had some leftovers from the old Python datafusion project. I'll rename everything to ballista, make sure both the datafusion and ballista modules can coexist, and do some additional cleanup. When it's ready, I will make a PR for the arrow-ballista repo. Thanks!

@nl5887
Contributor

nl5887 commented Jun 4, 2022

@andygrove I just pushed the latest code to my repo (https://github.com/nl5887/ballista-python). This should work for you.

@andygrove
Contributor Author

@nl5887 I just tried it and it works beautifully ❤️

I had also missed a step in my previous attempt ... I can't wait to see this in Ballista! Thanks so much for working on this.
