Update main DataFusion README #4903

Merged (9 commits) on Jan 17, 2023

@alamb (Contributor) commented Jan 13, 2023

Which issue does this PR close?

re #3058 and #1814

Rationale for this change

I attended a conference last week (CIDR), and it became clear to me that DataFusion is more widely applicable than its current usage suggests, partly because of a lack of awareness in the broader community. So I wanted to communicate more clearly what DataFusion is and what it is good for.

Also, as we have gathered more users, we can now point to some more concrete examples.

I also hope/plan to write up a "DataFusion architecture guide" soon as an additional way to grow our user base by making it more accessible for new people to see what we have (which is a lot!). Related to #980 from @xudong963

What changes are included in this PR?

  1. Update the main README with more specific information about what DataFusion is and what it has been used for
  2. Add some compare/contrast with pola.rs, DuckDB, and Velox

See the rendered page here: https://github.com/alamb/arrow-datafusion/tree/alamb/improve_user_docs#datafusion

Discussion

How much of this content should go in the user guide (https://arrow.apache.org/datafusion)?

I like the idea of all the non-release-specific content being in the user guide. However, the website isn't updated all that often (updating it is a manual process, as I recall).

The README is what shows up on the main crates.io page (https://crates.io/crates/datafusion) as well as on the GitHub landing page.

@alamb alamb requested a review from andygrove January 13, 2023 21:14
@github-actions github-actions bot added the documentation Improvements or additions to documentation label Jan 13, 2023
@alamb (Contributor Author) commented Jan 13, 2023

@xudong963 or @Dandandan, I wonder if you have any thoughts on this content?

@ozankabak (Contributor) commented

This is much better than what we currently have; thank you for writing this up.

I think having this stuff in the README still makes sense at this time since the website is not updated often (and we don't have a habit/rule for it AFAICT).

To fuel community growth, I think we should avoid all kinds of friction that inhibit advertising/publishing what we have (and what we are working on). Since the README file is easy to update, and it is a common gateway page for many new users, I think it is OK to keep it "rich" during this phase of the project lifecycle.

Co-authored-by: Liang-Chi Hsieh <viirya@gmail.com>
@andygrove (Member) left a comment

LGTM!

@alamb alamb merged commit b756d05 into apache:master Jan 17, 2023
@alamb alamb deleted the alamb/improve_user_docs branch January 17, 2023 21:03
@ursabot commented Jan 17, 2023

Benchmark runs are scheduled for baseline = aa8f139 and contender = b756d05. b756d05 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

8 participants