[Python] RecordBatchReader constructor from stream object implementing the PyCapsule Protocol #39217
Closed
jorisvandenbossche added a commit to jorisvandenbossche/arrow that referenced this issue on Dec 13, 2023:

…r objects implementing the Arrow PyCapsule protocol
jorisvandenbossche added a commit that referenced this issue on Jan 8, 2024:

…cts implementing the Arrow PyCapsule protocol (#39218)

### Rationale for this change

In contrast to Array, RecordBatch and Schema, for the C Stream (mapping to RecordBatchReader) we haven't an equivalent factory function that can accept any Arrow-compatible object and turn it into a pyarrow object through the PyCapsule Protocol. For that reason, this proposes an explicit constructor class method for this: `RecordBatchReader.from_stream` (this is a quite generic name, so other name suggestions are certainly welcome).

### Are these changes tested?

TODO

* Closes: #39217

Authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Signed-off-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
ion-elgreco added a commit to delta-io/delta-rs that referenced this issue on Jul 18, 2024:

…2534)

# Description

Adds support for the [Arrow PyCapsule interface](https://arrow.apache.org/docs/format/CDataInterface/PyCapsuleInterface.html). Since pyarrow is already a required dependency, this takes the minimal route of converting pycapsule interface objects into pyarrow objects. This requires pyarrow 15 or higher for the stream conversion (apache/arrow#39217).

This doesn't modify the existing hard-coded support for pyarrow and pandas

# Related Issue(s)

- closes #2376

# Documentation

---------

Co-authored-by: Ion Koutsouris <15728914+ion-elgreco@users.noreply.github.com>
In #37797 we added the dunder methods for the Arrow PyCapsule Protocol, and we also added support for detecting objects that implement the protocol in the `pa.array(..)`, `pa.record_batch(..)` and `pa.schema(..)` constructors, so that you can, for example, create a pyarrow array with `pa.array(obj)` given any object `obj` that supports the interface by defining `__arrow_c_array__`.

But for stream objects, we don't have an equivalent factory function that creates a RecordBatchReader. Therefore I think it would be good to add a public RecordBatchReader constructor from stream objects implementing the protocol (so that you don't need to call the private `_import_from_c_capsule` method for this use case). For example, `RecordBatchReader.from_stream`?