Getting started guide for new users (who want to use DataFusion in their project) #7014
Comments
If someone wants to help out the DataFusion project, helping with this one would be awesome. A good first step would be to make a skeleton of the topics in the outline in https://github.com/apache/arrow-datafusion/tree/main/docs and leave placeholder text (like "Coming Soon"). Then we can work together on writing the content in a few different PRs.
This sounds great, really excited! We'll either want two user guides or one user guide that's half in Python / half in Rust. I guess that 99% of the users that want to query data via an API will want to do so in SQL / Python. The Python DataFrame user guide is way more important than the Rust one. Users leveraging DataFusion to build tools for other engines (e.g. delta-rs) are much more likely to be using Rust. Perhaps we divide the documentation as follows:
I don't think we should invest in building out the DataFusion Rust DataFrame API docs yet because it's a lower-ROI activity. However, we should build a URL structure that allows for this.
I agree this is more important for "end users" rather than developers who are building with Rust
That sounds great. I filed apache/datafusion-python#432 to track the work for the Python bindings.
Is your feature request related to a problem or challenge?
If we want to have DataFusion used as the core of many new systems, we need it to be as easy as possible for someone to get their idea working on top of DataFusion.
I think the current user guide helps set up the basics of the project and get a "hello world" style program going, but then leaves the reader in a "now what?" situation: https://arrow.apache.org/datafusion/user-guide/example-usage.html
Describe the solution you'd like
I would like a document, perhaps similar in style to the polars user guide: https://pola-rs.github.io/polars-book/user-guide/
Basically I am thinking of something that would have helped @bubbajoe get up to speed
The examples directory holds a bunch of examples: https://github.com/apache/arrow-datafusion/tree/main/datafusion-examples
Potential outline:
TableProviders (in Add library guide for table provider and catalog providers #7287)
ExecutionPlans (#7308)
Describe alternatives you've considered
No response
Additional context
This idea was suggested by @MrPowers