Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add numpy2 support #2449

Open
wants to merge 48 commits into
base: master
Choose a base branch
from
Open

Add numpy2 support #2449

wants to merge 48 commits into from

Conversation

2maz
Copy link

@2maz 2maz commented Jan 16, 2025

This pull request intends to enable the use of numpy2 (#2446 #2425) with vaex.
In addition it applies some other fixes.

An (incomplete) summary:

Note, that I encountered some broken test(s) in cmodule.py test_interpolate and those related to Webserver:

  • this looked like it was in a broken state already, so left it for now
    Some format changes from tab to space indent have been applied.

Please let me know if (further) changes are required - only validated on Linux system right now.
I ran tests in vaex-core/vaex/test - any more instructions on how to run you full test-suite (apart from ci)?

Thx.

@maartenbreddels @ddelange

@2maz
Copy link
Author

2maz commented Jan 17, 2025

@2maz
Copy link
Author

2maz commented Jan 20, 2025

Great progress! ❤️ Is the plan to also maintain numpy v1 runtime support?

For now, yes, should still run with v1, and so far it does considering ci and 3.9 (uses numpy==1.26.4)

2maz added 4 commits January 21, 2025 15:08
…find_edges

percentile_approx_test fails with nan in the result array, caused by vaex.vaexfast.grid_find_edges returning
edges_floor / edges_ceil with [from, to] where from == to
@2maz
Copy link
Author

2maz commented Jan 23, 2025

All outstanding issues related to numpy v2 (and some not) are resolved - only catboost dependent functionality is conditionally disabled.

All pipelines run fine - even some previously skipped tests.
Any comments: @ddelange @maartenbreddels @JovanVeljanoski

@ddelange
Copy link
Contributor

awesome work! ❤️

@maartenbreddels if we get lucky, the pybind submodule update in this PR will also solve #2442

except ValueError as e:
import re
if re.match(r'numpy.dtype size changed', str(e)):
pytestmark = pytest.mark.skip(f"all tests skipped due to: numpy2 incompatibility -- {e}")
Copy link
Contributor

@ddelange ddelange Jan 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if catboost currently pins numpy v1, and our CI installs vaex-ml[all], then CI should be running with numpy v1. then why is this change necessary?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

catboost does not support v2, yet they do not pin to v1 - it can be installed along with numpy v2

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah right. they do have an upper cap as of 1.2.6 (catboost/catboost@4b1b8ae), but we don't have a lower pin on catboost in vaex-ml, so CI is now happily installing catboost 1.2.5 alongside numpy v2.

this should resolve itself once catboost releases numpy v2 support. can you add the following inline comment?

# this try-except can be removed once catboost releases numpy v2 support
# ref https://github.com/catboost/catboost/issues/2671
# ref https://github.com/vaexio/vaex/pull/2449

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to be precise the latest catboost version 1.2.7 comes with install_requires: 'numpy (>=1.16.0, <2.0)', while catboost 1.2.5 (installed in ci) comes with install_requires: 'numpy (>=1.16.0)'

Comment on lines 88 to 91
- name: Install libpcre3-dev
if: matrix.os == 'ubuntu-latest'
run: |
sudo apt update & sudo apt install -y libpcre3-dev
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have bin/install_pcre.sh already just above:

- name: Setup PCRE (Nix-only)
if: matrix.os != 'windows-latest'
run: sudo -E bash bin/install_pcre.sh

is it broken?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see https://github.com/vaexio/vaex/actions/runs/12819691425/job/35747856845

E ImportError: libpcrecpp.so.0: cannot open shared object file: No such file or directory

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. The wheels depend on that script:

CIBW_BEFORE_BUILD: ${{ startswith(matrix.os, 'ubuntu') && 'bash bin/install_pcre.sh' || startswith(matrix.os, 'macos') && 'sudo -E bash bin/install_pcre.sh' || '' }}

So maybe the wheels are broken and this smoke test doesn't catch it?

CIBW_TEST_COMMAND: python -c "import vaex; print(vaex.from_arrays(x=[1,2]))"

What do you think about adding the apt-get to the install script? Maybe similar to this one, preferring the precompiled one over the source install? https://github.com/ahupp/python-magic/blob/65fb61c9c9aa6348bb95d1dd71b685720c2a8a23/add_libmagic.sh#L69

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added it to the script.
The source-built libraries are in /usr/local/lib. So LD_LIBRARY_PATH or runtime search path for the extension could be set otherwise.

Copy link
Contributor

@ddelange ddelange left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome, thanks for the quick action!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants