Drop NumPy build dependency #751

Merged
merged 18 commits into from
Jul 31, 2024


jakirkham
Member

@jakirkham jakirkham commented Jul 31, 2024

Partially addresses issue: rapidsai/build-planning#82
Partially addresses issue: rapidsai/build-planning#41

Even though cuCIM currently #includes <pybind11/numpy.h>, the actual C++ code appears not to use NumPy. So this attempts to drop the header and the NumPy build dependency.

@jakirkham jakirkham requested review from a team as code owners July 31, 2024 08:01
@jakirkham jakirkham added improvement Improves an existing functionality non-breaking Introduces a non-breaking change labels Jul 31, 2024
@jakirkham
Member Author

jakirkham commented Jul 31, 2024

Looks like this code might need some tweaks if we proceed further

auto arr = pybind11::array_t<int64_t, py::array::c_style | py::array::forcecast>::ensure(location);

Edit: Changes below rewrite this to use a memoryview instead
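
For illustration only, here is a minimal Python sketch of the idea behind that rewrite: reading 64-bit integers from an object through a memoryview rather than through a NumPy array. The helper name `read_int64_values` is hypothetical and this is not the code merged in this PR. It also assumes the caller already supplies 8-byte signed integers in a native format, whereas `py::array::forcecast` would have converted other dtypes.

```python
import array


def read_int64_values(obj):
    """Return the values of a buffer-exporting object holding native int64 items.

    Hypothetical helper: it only inspects the buffer, it never converts dtypes.
    """
    # memoryview() raises TypeError if obj does not support the buffer protocol.
    view = memoryview(obj)
    # "q" is always 8 bytes; "l" is 8 bytes on most 64-bit Unix platforms.
    if view.itemsize != 8 or view.format.lstrip("@") not in ("q", "l"):
        raise TypeError(f"expected native int64 items, got format {view.format!r}")
    return view.tolist()


# array.array exports the buffer protocol, so no NumPy import is needed here.
location = array.array("q", [100, 200])
values = read_int64_values(location)
```

Any object exporting a matching buffer (including a NumPy int64 array) could be passed the same way, which is the property the memoryview approach relies on.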

Contributor

@gigony gigony left a comment


Thanks @jakirkham ! It looks good to me.

I have a question: does pybind11/numpy.h depend on the NumPy headers or on a NumPy library?

If this change is about handling NumPy 2,
would upgrading the pybind11 library to the latest version help?
It looks like pybind11 handles the NumPy 2 case: https://github.com/pybind/pybind11/blob/master/include/pybind11/numpy.h#L187

@jakirkham
Member Author

I have a question: does pybind11/numpy.h depend on the NumPy headers or on a NumPy library?

A few things to unpack here

NumPy is atypical in its setup. When building against NumPy, one only #includes the NumPy header. There is not a NumPy library that one links against in the typical sense

However the symbols that the NumPy header names live in the Python shared objects (extension modules) that the NumPy package ships. Those symbols get loaded when calling import numpy (NumPy supplies a similar operation, import_array(), for use from the C API). This is how the symbols get resolved at runtime

Regardless, from a developer's perspective, building against NumPy always means using the header and the libraries. There isn't a way to pick just one or the other

Using pybind11 for NumPy support is not unique in this regard

If this change is about handling NumPy 2, would upgrading the pybind11 library to the latest version help? It looks like pybind11 handles the NumPy 2 case: https://github.com/pybind/pybind11/blob/master/include/pybind11/numpy.h#L187

Yes, it is true that pybind11 2.12.0 ships with NumPy 2 support. Building against that would be sufficient for NumPy 1 & 2 support (without other changes)

That said, there are relatively few cases where the NumPy API is strictly needed, especially since the introduction of the Python Buffer Protocol. Many use cases (ours in cuCIM is one of them) simply need a way to access the underlying memory buffer of Python objects (NumPy arrays or otherwise). In these cases it is better to use the Python Buffer Protocol directly (as this code change does), which works not only with NumPy arrays but with any object that supports the protocol. As a result this simplifies our dependencies. Plus this approach is more flexible and interoperable with other libraries
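
As a rough illustration of that interoperability point, the sketch below uses only the Python standard library to inspect several unrelated buffer-exporting types through one code path. The function name `describe_buffer` is made up for this example and does not come from cuCIM.

```python
import array


def describe_buffer(obj):
    """Summarize the memory layout that obj exposes via the buffer protocol."""
    # The with-block releases the exported buffer as soon as we are done.
    with memoryview(obj) as view:
        return {
            "format": view.format,      # struct-style item type code
            "itemsize": view.itemsize,  # bytes per item
            "shape": view.shape,        # dimensions, as a tuple
            "nbytes": view.nbytes,      # total size in bytes
            "readonly": view.readonly,  # whether writes are permitted
        }


# Three different exporters, none of which involve NumPy.
for sample in (b"abcd", bytearray(4), array.array("d", [1.0, 2.0])):
    print(type(sample).__name__, describe_buffer(sample))

# The view shares memory with its exporter: writes go straight through.
backing = bytearray(4)
view = memoryview(backing)
view[0] = 255
```

A NumPy array passed to the same function would be described in the same way, without the function itself depending on NumPy.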

@gigony
Contributor

gigony commented Jul 31, 2024

Thanks @jakirkham for the comprehensive explanation!
It makes sense, and thank you for the update! 🙂

@jakirkham
Member Author

/merge

@rapids-bot rapids-bot bot merged commit 80c27d6 into rapidsai:branch-24.08 Jul 31, 2024
46 checks passed
@jakirkham jakirkham deleted the drop_np_pin_compat branch July 31, 2024 22:23
@jakirkham
Member Author

Thanks all! 🙏
