feat(python): Add array creation/building from buffers #378
Conversation
This first set of changes looks good!
python/src/nanoarrow/_lib.pyx (outdated diff)
```diff
 cdef Py_ssize_t _element_size_bits
 cdef Py_ssize_t _shape
 cdef Py_ssize_t _strides
 cdef char _format[128]

-def __cinit__(self, object base, uintptr_t addr,
+def __cinit__(self, object base, uintptr_t addr, int64_t size_bytes,
               ArrowBufferType buffer_type, ArrowType buffer_data_type,
```
Not related to this PR, just noticing while reviewing: should the buffer type be part of the buffer? That doesn't necessarily make sense for a buffer itself, only in context of a buffer as part of an array. But if you create a standalone buffer, then it doesn't make sense?
It also seems we currently don't use this attribute, except for the public `type` property.
(I know the same is true for the data type, but at least that one you need to support exporting the buffer to an array / the buffer protocol)
Good call! It's used in the arrayview repr (so I moved it to the ArrayView where it's more relevant).
python/README.md (outdated diff)
If you need to debug or edit any .c files used to build the native extension, you can generate the `compile_commands.json` usable by most IDEs/clang tools using [Bear](https://github.com/rizsotto/Bear).

```shell
bear -- python setup.py build_ext --inplace && mv compile_commands.json build
```
BTW, it's a bit cryptic to me what this does / how it can help, but I am also not familiar with the tool or with compile_commands.
I'll leave it out for now since it's probably just me who has to remember that command. If build/compile_commands.json is present, most editors "just work" with the .c/.cc/.h files without additional configuration.
Codecov Report

```
@@            Coverage Diff             @@
##             main     #378      +/-   ##
==========================================
+ Coverage   87.89%   88.23%   +0.34%
==========================================
  Files          76       76
  Lines       13371    13840     +469
==========================================
+ Hits        11752    12212     +460
- Misses       1619     1628       +9
==========================================
```
python/src/nanoarrow/_lib.pyx (outdated diff)
```diff
 @property
-def element_size_bits(self):
-    return self._element_size_bits
+def item_size(self):
+    self._refresh_view_if_needed()
+    return self._view.item_size
```
The difference between both is bits vs bytes? (if so, we might want to use a more consistent naming scheme? item vs element)
I updated the `CBufferView`, the `CBuffer`, and the `CBufferBuilder` to all use `item`(s) for the definition of item that is used by the buffer protocol/memoryview, and `element` to support the definition we use, which is basically the same except it supports bitmaps. I also added a way to extract bitmaps since we'll need them shortly.
I see. Can you add some docstrings to clarify that difference between "item_.." and "element_.."?
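The item/element distinction the docstrings should capture can be illustrated with a pure-Python sketch (this is not the nanoarrow API, just an illustration): for a bitmap buffer, the buffer-protocol "item" is a whole byte, while the Arrow "element" is a single bit.

```python
# Illustrative only: pack booleans Arrow-style (LSB-first) to show why
# item_size (bytes, buffer protocol) differs from element_size_bits (bits).

def pack_bitmap(bools):
    """Pack an iterable of booleans into Arrow-style LSB-first bytes."""
    data = bytearray((len(bools) + 7) // 8)
    for i, value in enumerate(bools):
        if value:
            data[i // 8] |= 1 << (i % 8)
    return bytes(data)

bitmap = pack_bitmap([True, False, True, True])
item_size_bytes = 1    # buffer protocol view: one uint8 item per byte
element_size_bits = 1  # Arrow view: one boolean element per bit
print(len(bitmap))     # one byte holds all four one-bit elements
```

Four elements, one item: any naming scheme needs to make that asymmetry obvious.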
python/src/nanoarrow/c_lib.py (outdated diff)
```python
# Try import of bare capsule
if _obj_is_capsule(obj, "arrow_array"):
    if requested_schema is None:
        requested_schema_capsule = CSchema.allocate()._capsule
```
How does the import work if you don't have an actual schema? (I suppose the above allocates an "empty" ArrowSchema?)
Yes, it will leave you with an invalid schema. Nothing preventing the print method from working, though! (Or from using `._addr` to safely pass it somewhere else.)

```python
import nanoarrow as na
import pyarrow as pa

na.c_array(pa.array([1, 2, 3]).__arrow_c_array__()[1])
# <nanoarrow.c_lib.CArray [invalid: schema is released]>
# - length: 3
# - offset: 0
# - null_count: 0
# - buffers: (0, 6021410128064)
# - dictionary: NULL
# - children[0]:
```
```cython
c_buffer[0].data = <uint8_t*>buffer.buf
c_buffer[0].size_bytes = <int64_t>buffer.len
c_buffer[0].capacity_bytes = 0
c_buffer[0].allocator = c_pybuffer_deallocator(&buffer)
```
Ah, I see, you need to call `PyBuffer_Release`, and I assume Python will ensure it increases the ref on the object / keeps it alive until that release is called (so essentially the same as with the C Data Interface release callback).
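That pinning behaviour is observable from pure Python, which may help make the `PyBuffer_Release` contract concrete (this is stdlib behaviour, not nanoarrow code): an exporter cannot be resized while a buffer view on it is alive, and `memoryview.release()` plays the role of `PyBuffer_Release`.

```python
# The CPython buffer protocol pins the exporter while a view is live.
ba = bytearray(b"abc")
view = memoryview(ba)

try:
    ba.append(0)  # resizing while exported is forbidden
except BufferError:
    print("buffer is pinned while exported")

view.release()    # the Python-level equivalent of PyBuffer_Release
ba.append(0)      # resizing works again once the borrow ends
print(len(ba))    # 4
```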
Another batch of comments ;)
Most are minor details, but you will notice that the bulk of the comments has to do with CBuffer vs CBufferBuilder, and CBuffer being a "mutable" class with those set_.. methods. My initial reaction was that it might be cleaner to separate those two more clearly (eg CBuffer doesn't need those set_ methods, and the builder can maybe not subclass CBuffer / CBufferBuilder.finish can return a CBuffer instead of itself).
(Not having those set_.. methods on CBuffer might also avoid the need for the `_assert_buffer_count_zero` / `_refresh_view_if_needed` checks?)

Although while writing this now, I also realize that in Arrow C++ we have the concept of a Buffer vs a Mutable/ResizableBuffer as well.
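The separation being suggested can be sketched in a few lines (class and method names here are illustrative, not the actual nanoarrow classes): the builder is the only mutable object, and `finish()` hands back an immutable buffer with no `set_` methods.

```python
# Hypothetical sketch of separating an immutable buffer from its builder.

class ImmutableBuffer:
    def __init__(self, data: bytes):
        self._data = bytes(data)  # defensive copy; no set_ methods exist

    def __len__(self):
        return len(self._data)

    def to_bytes(self):
        return self._data

class BufferBuilder:
    def __init__(self):
        self._scratch = bytearray()

    def write(self, data: bytes):
        self._scratch += data
        return self

    def finish(self) -> ImmutableBuffer:
        # finish() returns a standalone buffer instead of the builder itself
        out = ImmutableBuffer(self._scratch)
        self._scratch = bytearray()
        return out

buf = BufferBuilder().write(b"12").write(b"34").finish()
```

With this split, nothing like `_refresh_view_if_needed` is needed on the buffer side, because a finished buffer can never change underneath a view.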
python/src/nanoarrow/_lib.pyx (outdated diff)
```cython
    self._reset_view()
    return self

def set_pybuffer(self, obj):
```
Instead of having this "mutable" interface, essentially this function just creates a CBuffer from a Python buffer, right? So it could also be a class method, and instead of using `CBuffer().set_pybuffer(obj)` one would do `CBuffer.from_pybuffer(obj)`?
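The classmethod-constructor style being suggested looks roughly like this (a minimal stand-in, not the real nanoarrow `CBuffer`): construction and population happen in one step, so no half-initialized mutable object is ever visible.

```python
# Minimal sketch of a classmethod constructor replacing a set_ method.

class CBuffer:
    def __init__(self):
        self._data = b""

    @classmethod
    def from_pybuffer(cls, obj) -> "CBuffer":
        self = cls()
        with memoryview(obj) as view:    # borrow via the buffer protocol
            self._data = view.tobytes()  # the borrow ends on exit
        return self

buf = CBuffer.from_pybuffer(b"abc")
```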
What I think I find confusing about those `set_` methods is that they make CBuffer look like a builder class, while there is a separate BufferBuilder class?
I removed them! The `CBuffer` should now be immutable (except in a few limited internal cases).
python/src/nanoarrow/c_lib.py (outdated diff)
```python
# Set buffers. This moves ownership of the buffers as well (i.e., the objects
# in the input buffers are replaced with an empty ArrowBuffer)
```
This was a bit confusing (I ran into this while trying out the method):
```python
In [74]: buf = na.c_buffer(b"0123")

In [75]: na.c_array_from_buffers(na.int32(), 1, [None, buf])
Out[75]:
<nanoarrow.c_lib.CArray int32>
- length: 1
- offset: 0
- null_count: 0
- buffers: (0, 139688822562528)
- dictionary: NULL
- children[0]:

In [76]: na.c_array_from_buffers(na.int32(), 1, [None, buf])
---------------------------------------------------------------------------
NanoarrowException                        Traceback (most recent call last)
Cell In [76], line 1
----> 1 na.c_array_from_buffers(na.int32(), 1, [None, buf])

File ~/scipy/repos/arrow-nanoarrow/python/src/nanoarrow/c_lib.py:287, in c_array_from_buffers(schema, length, buffers, null_count, offset, children, validation_level)
    284 builder.resolve_null_count()
    286 # Validate + finish
--> 287 return builder.finish(validation_level=validation_level)

File src/nanoarrow/_lib.pyx:1818, in nanoarrow._lib.CArrayBuilder.finish()

File src/nanoarrow/_lib.pyx:422, in nanoarrow._lib.Error.raise_message_not_ok()

File src/nanoarrow/_lib.pyx:417, in nanoarrow._lib.Error.raise_message()

NanoarrowException: ArrowArrayFinishBuildingDefault() failed (22): Expected int32 array buffer 1 to have size >= 4 bytes but found buffer with 0 bytes

In [77]: buf
Out[77]: CBuffer(uint8[0 b] )
```
When passing a Python object through the buffer protocol, the original object keeps owning the buffer content. Can't we do something similar for CBuffer? (And is that because the CArray doesn't actually keep track of CBuffer objects, but only of the C ArrowArray that has a pointer to the buffer?)
And where does this actually happen? I don't see anything in `CArrayBuilder.set_buffer` that would invalidate the buffer?
> And where does this actually happen? I don't see anything in `CArrayBuilder.set_buffer` that would invalidate the buffer?

Ah, I suppose it is the `ArrowBufferMove` in the C code that does this?
I made the `ArrowBufferMove()` explicit (and made the "move + invalidate previous" behaviour opt-in, since it's definitely confusing if you did not expect it to happen).
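The move semantics being discussed can be modeled in a few lines of Python (purely illustrative; the real move happens in C via `ArrowBufferMove`, and these class/method names are hypothetical): moving transfers the bytes and leaves the source buffer empty, which is exactly why reusing a moved-from buffer fails in the transcript above.

```python
# Toy model of opt-in move semantics for a buffer-owning object.

class OwnedBuffer:
    def __init__(self, data: bytes = b""):
        self.data = bytearray(data)

    def set_from(self, other: "OwnedBuffer", move: bool = False):
        if move:
            self.data = other.data
            other.data = bytearray()           # source is invalidated
        else:
            self.data = bytearray(other.data)  # copy; source untouched

src = OwnedBuffer(b"0123")
dst = OwnedBuffer()
dst.set_from(src, move=True)
print(len(src.data))  # 0: src no longer owns the bytes
```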
Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
python/src/nanoarrow/_lib.pyx (outdated diff)
```python
if requested_schema is not None:
    raise NotImplementedError("requested_schema")
```
Is there a reason this was removed? (because it's still not supported I think, the keyword is just ignored otherwise)
I'll add it back...I had thought requesting a schema was best-effort and not cast-or-error, but I just re-read the final language.

If you're a consumer passing a `requested_schema` through the protocol, it would be a very bad idea not to check that the schema you received was the one you requested (because if the producer got it wrong, you will crash). For producers like this one that don't support casting, it forces that check to happen twice (if you want to support the case where `schema` is already the correct type). If you can do casting but only support some of the conversions (e.g., you can return stringviews but not listviews), the current language of the spec forces you to error.

(I know, a discussion for another place.)
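For producers that cannot cast, the behaviour being restored is simply to reject any `requested_schema` rather than silently ignore it. A minimal sketch (class name and constructor are hypothetical stand-ins, not nanoarrow's real code; only the `__arrow_c_array__` signature comes from the Arrow PyCapsule protocol):

```python
# Producer-side sketch: no casting support, so any requested_schema errors.

class ArrayExporter:
    def __init__(self, schema_capsule, array_capsule):
        self._schema = schema_capsule
        self._array = array_capsule

    def __arrow_c_array__(self, requested_schema=None):
        if requested_schema is not None:
            # Error rather than hand back a schema the consumer didn't ask for
            raise NotImplementedError("requested_schema")
        return self._schema, self._array
```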
Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
```cython
    self._ptr.null_count = self._ptr.length - count
    return self

def set_buffer(self, int64_t i, CBuffer buffer, move=False):
```
Can you add a bit of a comment/docstring explaining what move=False vs True exactly does?
I see below that the docstring of the public function `c_array_from_buffers` has an explanation: "Use `True` to move ownership of any input buffers or children to the output array." I would repeat that here for when looking at the code. And it's also not super clear to me what the consequence of doing that is, exactly. Or, why would you want to set it to True?
I added a docstring and also set it to `True` in the appropriate places in `c_lib`. It's to avoid the situation where you have an `ArrowBuffer` wrapping a `Py_buffer` wrapping an `ArrowBuffer` wrapping a `Py_buffer` (which is what would have happened before the last commit if `move=False` and `buffers=(None, <some numpy array>)`).
Thanks, that expanded docstring explanation helps a lot!
python/src/nanoarrow/_lib.pyx (outdated diff)
```cython
# Flush the buffer address from the buffer into the ArrowArray struct
self._ptr.buffers[i] = ArrowArrayBuffer(self._ptr, i).data
```
What does this exactly do?
If you have `ArrowArray array`, `array->buffers` is a simple array of `const void*` and doesn't know how to find the `ArrowBuffer`'s `data` member. `ArrowArraySetBuffer()` and `ArrowArrayFinishBuilding()` both take care of this bit of synchronization before the array is ready for outside consumption.
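The "flush" can be modeled with a toy Python sketch (illustrative only; the real synchronization copies each `ArrowBuffer`'s `data` pointer into `array->buffers[i]` in C, and these class names are hypothetical): the array keeps a plain list of raw addresses that must be re-synced from the rich buffer objects before export.

```python
# Toy model: raw pointer slots re-synced from owned buffer objects.

class ToyBuffer:
    def __init__(self):
        self.data = bytearray()

    def address(self):
        return id(self.data)  # stand-in for the C data pointer

class ToyArray:
    def __init__(self, n_buffers):
        self.buffers = [None] * n_buffers           # like array->buffers
        self._owned = [ToyBuffer() for _ in range(n_buffers)]

    def flush_buffer(self, i):
        # Mirror the owned buffer's current address into the raw slot
        self.buffers[i] = self._owned[i].address()

array = ToyArray(2)
array._owned[1].data += b"abcd"  # buffer (re)allocated by the builder
array.flush_buffer(1)            # raw pointer must be refreshed afterward
```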
The gist of this PR is that I'd like the ability to create arrays for testing without pyarrow so that nanoarrow's tests can run in more places. Other than building/running in odd corner-case environments, nanoarrow in R has been great at prototyping and/or creating test data (e.g., an array with a non-zero offset, an array with a rarely-used type). This is useful both for nanoarrow to test itself and perhaps for others who might want to use nanoarrow in a similar way in Python.

This is a bit big...I did need to put all of it in one place to figure out what the end point was; however, I'm happy to split it into smaller self-contained bits now that I know where I'm headed.

After this PR, we can create an array out-of-the-box from anything that supports the buffer protocol. Importantly, this includes numpy arrays, so that you can do things like generate arrays with `n` random numbers.

```python
import nanoarrow as na
import numpy as np
```

```python
na.c_array_view(b"12345")
```

```
<nanoarrow.c_lib.CArrayView>
- storage_type: 'uint8'
- length: 5
- offset: 0
- null_count: 0
- buffers[2]:
  - validity <bool[0 b] >
  - data <uint8[5 b] 49 50 51 52 53>
- dictionary: NULL
- children[0]:
```

```python
na.c_array_view(np.array([1, 2, 3], np.int32))
```

```
<nanoarrow.c_lib.CArrayView>
- storage_type: 'int32'
- length: 3
- offset: 0
- null_count: 0
- buffers[2]:
  - validity <bool[0 b] >
  - data <int32[12 b] 1 2 3>
- dictionary: NULL
- children[0]:
```

While not built in to the main `c_array()` constructor, we can also now assemble an array from buffers. This has been very useful in R and ensures that we can construct just about any array if we need to.

```python
array = na.c_array_from_buffers(
    na.struct([na.int32()]),
    length=3,
    buffers=[None],
    children=[
        na.c_array_from_buffers(
            na.int32(),
            length=3,
            buffers=[None, na.c_buffer([1, 2, 3], na.int32())],
        )
    ],
)

na.c_array_view(array)
```

```
<nanoarrow.c_lib.CArrayView>
- storage_type: 'struct'
- length: 3
- offset: 0
- null_count: 0
- buffers[1]:
  - validity <bool[0 b] >
- dictionary: NULL
- children[1]:
  - <nanoarrow.c_lib.CArrayView>
    - storage_type: 'int32'
    - length: 3
    - offset: 0
    - null_count: 0
    - buffers[2]:
      - validity <bool[0 b] >
      - data <int32[12 b] 1 2 3>
    - dictionary: NULL
    - children[0]:
```

I also added the ability to construct a buffer from an iterable and wired that into the `c_array()` constructor, although this is probably not all that fast. It does, however, make it much easier to write tests (because many of them currently start with `na.c_array(pa.array([1, 2, 3]))`).

```python
na.c_array_view([1, 2, 3], na.int32())
```

```
<nanoarrow.c_lib.CArrayView>
- storage_type: 'int32'
- length: 3
- offset: 0
- null_count: 0
- buffers[2]:
  - validity <bool[0 b] >
  - data <int32[12 b] 1 2 3>
- dictionary: NULL
- children[0]:
```

This allows creating a buffer from anything supported by the `struct` module, which means we can create some of the less frequently used types.

```python
na.c_buffer([1, 2, 3], na.float16())
```

```
CBuffer(half_float[6 b] 1.0 2.0 3.0)
```

```python
na.c_buffer([(1, 2), (3, 4), (5, 6)], na.interval_day_time())
```

```
CBuffer(interval_day_time[24 b] (1, 2) (3, 4) (5, 6))
```

Because it's mentally exhausting to bitpack buffers in my head and because Arrow uses them all the time, I also think it's mission-critical to be able to create bitmaps:

```python
na.c_buffer([True, False, True, True], na.bool())
```

```
CBuffer(bool[1 b] 10110000)
```

This involved fixing some issues with the existing buffer view:

- The buffer view only ever saved a pointer to the device. This is a bit of a problem because even though the CPU device is static and lives forever, CUDA "device" objects will probably keep a CUDA context alive. Thus, we need a strong reference to the `CDevice` Python object (which ensures the underlying nanoarrow `Device*` remains valid).
- The buffer view only handled `BufferView` input, where technically all it needs is a pointer and a length. This opens it up to representing other types of buffers than just something from nanoarrow (e.g., imported from dlpack or the buffer protocol).

Implementing the buffer protocol as a consumer was done by wrapping the `ArrowBuffer` with a "deallocator" that holds the `Py_buffer` and ensures it is released. I still need to do some testing to ensure that it's actually released and that we're not leaking memory. This is how I do it in R and in geoarrow-c (Python) as well. Using the `ArrowBuffer` is helpful because the C-level array builder uses them to manage the memory and ensures they're all released when the array is released.

Implementing the build-from-iterable involved a few more things...notably, completing the "python struct format string" <-> "arrow data type" conversion. This allows the use of `struct.pack()`, which takes care of things like half-float conversion and tuples of day, month, nano conversion.

I'm aware this could use a bit better documentation of the added classes/methods...I am assuming these will be internal for the time being, but they definitely need a bit more than is currently there.

---------

Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
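The description above leans on `struct.pack()` for the type conversions; this stdlib behaviour can be verified standalone (the format strings below are plain Python `struct` codes, shown as an illustration rather than nanoarrow's exact format-string mapping):

```python
import struct

# IEEE 754 half precision: struct format "e" is 2 bytes per value,
# matching the 6-byte CBuffer(half_float[6 b] ...) shown above.
half = struct.pack("<3e", 1.0, 2.0, 3.0)
print(len(half))  # 6

# A (days, milliseconds) interval pair packs as two int32 values ("ii"),
# i.e., 8 bytes per element (3 elements -> the 24-byte buffer above).
days_ms = struct.pack("<2i", 1, 2)
print(len(days_ms))  # 8
```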