Is your feature request related to a problem or challenge? Please describe what you are trying to do.
datafusion-python currently errors when calling select count(*) from t when t is a pyarrow.Dataset.
The resulting pyarrow.RecordBatch contains no rows and has a schema with no columns, but it does have num_rows set to the correct number.
Describe the solution you'd like
Support was added to arrow-rs in #1552 for a RecordBatch with zero columns but a nonzero row count.
I'd like impl FromPyArrow for RecordBatch to use this functionality.
…dBatchOptions when converting a pyarrow RecordBatch) (#6320)
* use RecordBatchOptions when converting a pyarrow RecordBatch
Ref: #6318
* add assertion that num_rows persists through the round trip
* add implementation comment
* nicer creation of empty recordbatch in test_empty_recordbatch_with_row_count
* use len provided by pycapsule interface when available
* update test comment
Relevant code: arrow-rs/arrow/src/pyarrow.rs, lines 334 to 392 in b711f23
Additional Context
datafusion-python issue: apache/datafusion-python#800