[Data][CI] Add fix_block.py back to CI #57841
Signed-off-by: You-Cheng Lin <106612301+owenowenisme@users.noreply.github.com>
bveeramani
left a comment
@owenowenisme I don't think ChunkedArray supports zero_copy_only for PyArrow 9, and I think self._column can be either an Array or ChunkedArray: https://arrow.apache.org/docs/9.0/python/generated/pyarrow.ChunkedArray.html#pyarrow.ChunkedArray.to_numpy.
Also, how does zero_copy_only=False fix the test failure, or is that change unrelated?
Changed to not pass zero_copy_only when the instance is a ChunkedArray on PyArrow versions prior to 13.0.0.
def to_numpy(self, zero_copy_only: bool = False) -> np.ndarray:
    # NOTE: Pyarrow < 13.0.0 does not support ``zero_copy_only``
    if get_pyarrow_version() < _MIN_PYARROW_VERSION_TO_NUMPY_ZERO_COPY_ONLY:
Nit: Might be worth renaming _MIN_PYARROW_VERSION_TO_NUMPY_ZERO_COPY_ONLY to clarify that it only applies to ChunkedArray, but I guess the name would get pretty verbose.
Will defer to you
_MIN_PYARROW_VERSION_TO_NUMPY_ZERO_COPY_ONLY is only used in one place at this time, so I think we can leave it as it is for now.
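The version gate discussed in this thread can be sketched roughly as follows. This is not the code from the PR: `parse_version`, `to_numpy_kwargs`, and `MIN_CHUNKED_ZERO_COPY_ONLY_VERSION` are hypothetical names introduced for illustration, and the 13.0.0 cut-off is taken from the comments above rather than verified independently. The sketch only decides which keyword arguments to forward, so it runs without PyArrow installed.

```python
from typing import Any, Dict, Tuple

# Hypothetical constant: per the review thread, ChunkedArray.to_numpy()
# is assumed to accept ``zero_copy_only`` only from PyArrow 13.0.0 onward.
MIN_CHUNKED_ZERO_COPY_ONLY_VERSION: Tuple[int, int, int] = (13, 0, 0)


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '12.0.1' into (12, 0, 1), stopping at the first non-numeric piece."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    # Pad short versions such as '13.0' so tuple comparison behaves as expected.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts[:3])


def to_numpy_kwargs(
    is_chunked: bool, pyarrow_version: str, zero_copy_only: bool = False
) -> Dict[str, Any]:
    """Build the keyword arguments to forward to ``column.to_numpy(**kwargs)``."""
    too_old = parse_version(pyarrow_version) < MIN_CHUNKED_ZERO_COPY_ONLY_VERSION
    if is_chunked and too_old:
        # Older ChunkedArray.to_numpy() is assumed not to accept the keyword,
        # so omit it entirely.
        return {}
    # Array.to_numpy() defaults to zero_copy_only=True, so pass the value explicitly.
    return {"zero_copy_only": zero_copy_only}
```

With PyArrow imported as `pa`, a call site might then look like `column.to_numpy(**to_numpy_kwargs(isinstance(column, pa.ChunkedArray), pa.__version__))`. Keeping the decision in a small pure function makes the ChunkedArray-only scope of the version check explicit, which is the concern raised about the constant's name.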
## Description
https://arrow.apache.org/docs/python/generated/pyarrow.Array.html#pyarrow.Array.to_numpy
<img width="772" height="270" alt="Screenshot 2025-10-18 at 3 14 36 PM" src="https://github.com/user-attachments/assets/d9cbf986-4271-41e6-9c4c-96201d32d1c6" />
`zero_copy_only` actually defaults to True, so we should explicitly pass False for pyarrow versions < 13.0.0.
https://github.com/ray-project/ray/blob/1e38c9408caa92c675f0aa3e8bb60409c2d9159f/python/ray/data/_internal/arrow_block.py#L540-L546
## Related issues
Closes ray-project#57819
## Additional information
> Optional: Add implementation details, API changes, usage examples, screenshots, etc.
---------
Signed-off-by: You-Cheng Lin (Owen) <mses010108@gmail.com>
Signed-off-by: You-Cheng Lin <youchenglin@youchenglin-L3DPGF50JG.local>
Signed-off-by: You-Cheng Lin <106612301+owenowenisme@users.noreply.github.com>
Signed-off-by: You-Cheng Lin <mses010108@gmail.com>
Co-authored-by: You-Cheng Lin <youchenglin@youchenglin-L3DPGF50JG.local>
Signed-off-by: xgui <xgui@anyscale.com>
Description
https://arrow.apache.org/docs/python/generated/pyarrow.Array.html#pyarrow.Array.to_numpy

zero_copy_only actually defaults to True, so we should explicitly pass False for PyArrow versions < 13.0.0:
ray/python/ray/data/_internal/arrow_block.py
Lines 540 to 546 in 1e38c94
Related issues
Closes #57819
Additional information