@owenowenisme
Member

@owenowenisme owenowenisme commented Oct 17, 2025

Description

https://arrow.apache.org/docs/python/generated/pyarrow.Array.html#pyarrow.Array.to_numpy
Screenshot 2025-10-18 at 3 14 36 PM

zero_copy_only actually defaults to True in pyarrow.Array.to_numpy, so we should explicitly pass False for pyarrow versions < 13.0.0.

def to_numpy(self, zero_copy_only: bool = False) -> np.ndarray:
    # NOTE: Pyarrow < 13.0.0 does not support ``zero_copy_only``
    if get_pyarrow_version() < _MIN_PYARROW_VERSION_TO_NUMPY_ZERO_COPY_ONLY:
        return self._column.to_numpy()
    return self._column.to_numpy(zero_copy_only=zero_copy_only)

Related issues

Closes #57819


Signed-off-by: You-Cheng Lin (Owen) <mses010108@gmail.com>
Signed-off-by: You-Cheng Lin (Owen) <mses010108@gmail.com>
@owenowenisme owenowenisme added the "go" label (add ONLY when ready to merge, run all tests) Oct 17, 2025
You-Cheng Lin and others added 2 commits October 18, 2025 15:15
Signed-off-by: You-Cheng Lin <youchenglin@youchenglin-L3DPGF50JG.local>
Signed-off-by: You-Cheng Lin <106612301+owenowenisme@users.noreply.github.com>
@owenowenisme owenowenisme marked this pull request as ready for review October 18, 2025 08:55
@owenowenisme owenowenisme requested a review from a team as a code owner October 18, 2025 08:55
@ray-gardener ray-gardener bot added the "data" label (Ray Data-related issues) Oct 18, 2025
Member

@bveeramani bveeramani left a comment

@owenowenisme I don't think ChunkedArray supports zero_copy_only for PyArrow 9, and I think self._column can be either an Array or ChunkedArray: https://arrow.apache.org/docs/9.0/python/generated/pyarrow.ChunkedArray.html#pyarrow.ChunkedArray.to_numpy.

Also, how does zero_copy_only=False fix the test failure, or is that change unrelated?

Signed-off-by: You-Cheng Lin <mses010108@gmail.com>
@owenowenisme
Member Author

owenowenisme commented Oct 23, 2025

Changed to not pass zero_copy_only when the instance is a ChunkedArray on PyArrow versions prior to 13.0.0.
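
A rough sketch of the decision described above, written without a pyarrow dependency so the branching is easy to check. The helper name `to_numpy_kwargs`, the constant name, and the tuple-based version comparison are hypothetical and are not the code merged in this PR; the 13.0.0 threshold comes from the wording in this thread:

```python
from typing import Any, Dict, Tuple

# Assumed threshold, taken from the "< 13.0.0" wording in this thread.
MIN_VERSION_CHUNKED_ZERO_COPY_ONLY: Tuple[int, ...] = (13, 0, 0)


def to_numpy_kwargs(
    pyarrow_version: Tuple[int, ...],
    is_chunked: bool,
    zero_copy_only: bool = False,
) -> Dict[str, Any]:
    """Return the keyword arguments to forward to ``column.to_numpy``."""
    if is_chunked and pyarrow_version < MIN_VERSION_CHUNKED_ZERO_COPY_ONLY:
        # Older ChunkedArray.to_numpy does not accept zero_copy_only, so omit it.
        return {}
    # Array.to_numpy defaults zero_copy_only to True, so always pass it explicitly.
    return {"zero_copy_only": zero_copy_only}


print(to_numpy_kwargs((9, 0, 0), is_chunked=True))
print(to_numpy_kwargs((9, 0, 0), is_chunked=False))
```

A caller could then forward the result as `column.to_numpy(**kwargs)`, with `is_chunked` derived from an `isinstance` check against `pyarrow.ChunkedArray`.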


def to_numpy(self, zero_copy_only: bool = False) -> np.ndarray:
    # NOTE: Pyarrow < 13.0.0 does not support ``zero_copy_only``
    if get_pyarrow_version() < _MIN_PYARROW_VERSION_TO_NUMPY_ZERO_COPY_ONLY:
Member

Nit: Might be worth renaming _MIN_PYARROW_VERSION_TO_NUMPY_ZERO_COPY_ONLY to clarify that it only applies to ChunkedArray, but I guess the name would get pretty verbose.

Will defer to you

Member Author

_MIN_PYARROW_VERSION_TO_NUMPY_ZERO_COPY_ONLY is only used in one place at this time, so I think we can leave it as is for now.

@bveeramani bveeramani merged commit b6e8467 into ray-project:master Oct 23, 2025
6 checks passed
xinyuangui2 pushed a commit to xinyuangui2/ray that referenced this pull request Oct 27, 2025
## Description

https://arrow.apache.org/docs/python/generated/pyarrow.Array.html#pyarrow.Array.to_numpy
<img width="772" height="270" alt="Screenshot 2025-10-18 at 3 14 36 PM"
src="https://github.com/user-attachments/assets/d9cbf986-4271-41e6-9c4c-96201d32d1c6"
/>

`zero_copy_only` actually defaults to True in `pyarrow.Array.to_numpy`, so we
should explicitly pass False for pyarrow versions < 13.0.0.

https://github.com/ray-project/ray/blob/1e38c9408caa92c675f0aa3e8bb60409c2d9159f/python/ray/data/_internal/arrow_block.py#L540-L546

## Related issues
Closes ray-project#57819


---------

Signed-off-by: You-Cheng Lin (Owen) <mses010108@gmail.com>
Signed-off-by: You-Cheng Lin <youchenglin@youchenglin-L3DPGF50JG.local>
Signed-off-by: You-Cheng Lin <106612301+owenowenisme@users.noreply.github.com>
Signed-off-by: You-Cheng Lin <mses010108@gmail.com>
Co-authored-by: You-Cheng Lin <youchenglin@youchenglin-L3DPGF50JG.local>
Signed-off-by: xgui <xgui@anyscale.com>
landscapepainter pushed a commit to landscapepainter/ray that referenced this pull request Nov 17, 2025
Aydin-ab pushed a commit to Aydin-ab/ray-aydin that referenced this pull request Nov 19, 2025
Signed-off-by: Aydin Abiar <aydin@anyscale.com>
Labels

data (Ray Data-related issues), go (add ONLY when ready to merge, run all tests)

Development

Successfully merging this pull request may close these issues.

[Data] test_block.py is broken - not tested in CI