-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix running examples readme #11225
Fix running examples readme #11225
Conversation
Here is one idea I had -- to move the examples to the user guide: #11217. This ensures that the examples run, However, I don't think this really solves the problem you are after Maybe we could look to make them as self contained as possible (perhaps provide a self contained example of querying from record batches in memory) 🤔 As a relatively new user of DataFusion what would you find most helpful? |
that's a good idea.
Improving the docs is definitely a good idea and having examples in docs tested is so nice.
As a user, i would prefer to have working examples within docs. |
This division makes sense. Maybe we can move the "simple" examples to the user guide (e.g. that are less than 1 page long or something) and then we can leave the more complex eamples as datafusion-examples 🤔 For things like csv reading and dataframe aggregation, once you have the code in the IDE, perhaps those examples would be better directly in the doc as run doc examples |
Might be. I think it's good reasoning to table #11219 for now, I closed that PR. Thanks for feedback! What do you think about this PR? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the suggestion @findepi -- I am sorry it took so long to review this PR. I wanted to verify the commands locally.
I tried the existing commands and they seemed to work -- so maybe I am missing something.
@@ -19,7 +19,6 @@ | |||
|
|||
set -ex | |||
cd datafusion-examples/examples/ | |||
cargo fmt --all -- --check |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why remove this? I think it is intended to ensure that cargo fmt is run on the examples.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question, and i see how this is not obvious. That's why it's separate commit with an explanation:
Remove doubled fmt check from CI
Examples' format is checked in `check-fmt` job, so can be skipped in
`rust_example.sh`.
@@ -35,6 +35,9 @@ cd datafusion | |||
# Download test data | |||
git submodule update --init | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested this locally and the existing command (in the main datafusion directory) also seems to work fine
~/Software/datafusion2$ cargo run --example csv_sql
Running `target/debug/examples/csv_sql`
+----+-----------------------------+-----------------------------+
| c1 | MIN(aggregate_test_100.c12) | MAX(aggregate_test_100.c12) |
+----+-----------------------------+-----------------------------+
| c | 0.2667177795079635 | 0.991517828651004 |
| a | 0.02182578039211991 | 0.9567595541247681 |
| b | 0.04893135681998029 | 0.9185813970744787 |
| e | 0.01479305307777301 | 0.9965400387585364 |
| d | 0.061029375346466685 | 0.9748360509016578 |
+----+-----------------------------+-----------------------------+
+----+------+
| c2 | c3 |
+----+------+
| 1 | -85 |
| 3 | 13 |
| 4 | -38 |
| 4 | -54 |
| 5 | 36 |
| 1 | -25 |
| 5 | -31 |
| 2 | 45 |
| 3 | 13 |
| 3 | 17 |
| 4 | 65 |
| 4 | -101 |
| 2 | -48 |
| 1 | -56 |
| 1 | -5 |
| 3 | 14 |
| 1 | 83 |
| 3 | -12 |
| 3 | -72 |
| 2 | -43 |
| 5 | -101 |
+----+------+
It does also work in datafusion-examples/examples
/ but cd
ing into that directory seems unecessary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @alamb for testing this.
let me quote the commit message, maybe it explains my reasoning better:
...
Some examples are runnable from any place (e.g. `csv_sql`), but some
expect a specific working directory (e.g. `regexp`). Running from
`datafusion-examples/examples` is tested on CI so guaranteed to work,
let's put this path in the README.
...
It does also work in datafusion-examples/examples/ but cding into that directory seems unecessary
i agree.
It happened to be necessary for the first example i tried to run. maybe i was just unlucky
per the commit message
...
As a follow-up, we should look what it would take to make examples
runnable directly from an IDE such as RustRover.
-- I also am against cd-ing into directory as unnecessary and also want the examples to be runnable from an IDE.
If we want this to be the case, this validated by the CI scripts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the improvement and the explanations @findepi
I think this PR makes the code better so let's merge it in and keep iterating to make things better 🙏
There is no conflicted files though, hope rebase solves it @findepi |
Some examples are runnable from any place (e.g. `csv_sql`), but some expect a specific working directory (e.g. `regexp`). Running from `datafusion-examples/examples` is tested on CI so guaranteed to work, let's put this path in the README. As a follow-up, we should look what it would take to make examples runnable directly from an IDE such as RustRover.
Examples' format is checked in `check-fmt` job, so can be skipped in `rust_example.sh`.
7e5a8d0
to
5ddec76
Compare
Thanks again @findepi |
* Fix running examples readme Some examples are runnable from any place (e.g. `csv_sql`), but some expect a specific working directory (e.g. `regexp`). Running from `datafusion-examples/examples` is tested on CI so guaranteed to work, let's put this path in the README. As a follow-up, we should look what it would take to make examples runnable directly from an IDE such as RustRover. * Remove doubled fmt check from CI Examples' format is checked in `check-fmt` job, so can be skipped in `rust_example.sh`.
Some examples are runnable from any place (e.g.
csv_sql
), but some expect a specific working directory (e.g.regexp
). Running fromdatafusion-examples/examples
is tested on CI so guaranteed to work, let's put this path in the README.As a follow-up, we should look what it would take to make examples runnable directly from an IDE such as RustRover.