Implement to_pandas()
#197
This is looking good so far. Thanks @simicd
Can you also update the documentation?
-# collect as list of pyarrow.RecordBatch
-results = df.collect()
-# get first batch
-batch = results[0]
-# convert to Pandas
-df = batch.to_pandas()
# collect as pandas
df = df.to_pandas()
Thanks for already looking into the PR, @andygrove. I implemented your feedback and marked the PR as ready for final review.
LGTM. Thanks @simicd and @krzysztof-kwitt.
I wonder whether this method still works for an empty result (0 rows/batches). @simicd, what do you think?
* changelog (#188)
* Add Python wrapper for LogicalPlan::Sort (#196)
* Add Python wrapper for LogicalPlan::Aggregate (#195)
* Add Python wrapper for LogicalPlan::Limit (#193)
* Add Python wrapper for LogicalPlan::Filter (#192)
  * Add Python wrapper for LogicalPlan::Filter
  * clippy
  * clippy
  * Update src/expr/filter.rs
  Co-authored-by: Liang-Chi Hsieh <viirya@gmail.com>
  ---------
  Co-authored-by: Liang-Chi Hsieh <viirya@gmail.com>
* Add tests for recently added functionality (#199)
* Add experimental support for executing SQL with Polars and Pandas (#190)
* Run `maturin develop` instead of `cargo build` in verification script (#200)
* Implement `to_pandas()` (#197)
  * Implement to_pandas()
  * Update documentation
  * Write unit test
* Add support for cudf as a physical execution engine (#205)
* Update README in preparation for 0.8 release (#206)
* Analyze table bindings (#204)
  * method for getting the internal LogicalPlan instance
  * Add explain plan method
  * Add bindings for analyze table
  * Add to_variant
  * cargo fmt
  * blake and flake formatting
* changelog (#209)
---------
Co-authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Co-authored-by: Dejan Simic <10134699+simicd@users.noreply.github.com>
Co-authored-by: Jeremy Dyer <jdye64@gmail.com>
@krzysztof-kwitt Good catch! It does fail with an error. I opened #234 to track the issue.
Which issue does this PR close?
Closes #139.
Rationale for this change
Convert a DataFusion DataFrame directly to a pandas DataFrame.
What changes are included in this PR?
Implement the to_pandas() method using the pyarrow library.
Are there any user-facing changes?
New to_pandas() method.