Contribute zero-copy sparse matrices to pybind11? #35

Open
matthewwardrop opened this issue Nov 16, 2018 · 1 comment


@matthewwardrop

Hi @fwilliams !

This project looks really cool, and it looks like it has (or will have?) a feature that I've been looking for: zero-copy binding of SciPy sparse matrices to Eigen's SparseMatrix. (Dense matrix mapping is already implemented in pybind11.)

I was wondering whether it makes sense for you to contribute this upstream to pybind11, where it seems they would accept contributions for this (pybind/pybind11#1022).

Or perhaps I'm completely missing the point of the project :).

In any case, nice work so far!

@fwilliams
Owner

Hey Matt, thanks for the interest in NumpyEigen! At present, NumpyEigen has built-in support for zero-copy binding of SciPy sparse types. There are some differences between NumpyEigen and pybind11 that make general zero-copy binding a challenge (in fact, dense matrices in pybind11 are very often copied under the hood). NumpyEigen sidesteps the limitations of pybind11 by introducing an extra compilation step. Here are some details:

What problem is NumpyEigen solving?

NumpyEigen was built to resolve the incompatibility between runtime and compile time type information:

In C++, the user specifies the scalar type and memory layout of a matrix at compile time. This type information is used by Eigen's expression template system to choose the fastest numerical algorithms for operations on matrices. In contrast, with NumPy/SciPy we can only know the scalar type and memory layout at runtime.

To pass data between C++ and Python with zero copies, I need to know exactly how the memory of a NumPy array is laid out, and what the corresponding layout of the Eigen type is. If the Eigen matrix type is not compatible with the input array, then zero-copy binding is impossible. For example, if your Eigen function expects a MatrixXd and you pass a NumPy array of float32s, then we need to cast the float array to double, which means making a copy in which each scalar is twice as wide.
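The float32 versus MatrixXd case can be sketched without Eigen or NumPy. The block below is only an illustration under stated assumptions: `DType`, `ArrayBuffer`, `DoubleView`, and `as_double` are hypothetical names made up for this example, not part of NumpyEigen or pybind11. It shows that a buffer of doubles can be used in place, while a buffer of floats forces a converting copy.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical runtime tag standing in for a NumPy dtype.
enum class DType { Float32, Float64 };

// Hypothetical stand-in for a NumPy array: untyped memory plus runtime metadata.
struct ArrayBuffer {
    const void* data;
    std::size_t size;
    DType dtype;
};

// What a function expecting doubles ends up working with.
struct DoubleView {
    const double* alias;          // set when the input memory is used in place
    std::vector<double> storage;  // filled only when a converting copy is needed
    std::size_t size;

    const double* data() const { return alias ? alias : storage.data(); }
    bool copied() const { return alias == nullptr; }
};

// Bind a runtime-typed buffer to code that requires doubles.
DoubleView as_double(const ArrayBuffer& in) {
    assert(in.data != nullptr);
    if (in.dtype == DType::Float64) {
        // Types match: alias the caller's memory, no copy.
        return DoubleView{static_cast<const double*>(in.data), {}, in.size};
    }
    // Types differ: widen every float to a double in newly allocated storage.
    const float* src = static_cast<const float*>(in.data);
    return DoubleView{nullptr, std::vector<double>(src, src + in.size), in.size};
}
```

In this sketch, `copied()` returns false only when the runtime type already matches the compile-time type, which is the situation a zero-copy binding needs.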

Since pybind11 is a plain C++ library, it will do zero-copy binding from C++ to Python (not the other way) when the input types in Python and C++ exactly match, and copy otherwise. There are mechanisms in pybind11 (e.g. templating an array input as py::array_t<T>) to force the user to pass in a specific type, but these are restrictive and don't allow overloads.

How NumpyEigen works in a nutshell

NumpyEigen gets around the limitations of pybind11 via the introduction of a compilation layer.

In NumpyEigen, the user specifies a range of allowable scalar types for an input array. Using these types, the NumpyEigen compiler generates a switch statement over all valid scalar types and layouts. In each branch of the switch, you have access to macros defining the required type information for Eigen at compile time. This allows you to write generic Eigen code which is callable from Python and for which the memory layout can be determined at runtime.
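The idea of a switch over allowed scalar types can be sketched in plain C++. This is not NumpyEigen's generated code; `ScalarKind`, `RuntimeArray`, `sum_impl`, and `sum_any` are hypothetical names chosen for illustration, and layouts are left out to keep it short. Each branch instantiates the same template with a concrete type, so the body is compiled with full type information even though the type is only known at runtime.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical runtime tag for the scalar type of an incoming array.
enum class ScalarKind { Float32, Float64, Int32 };

// Hypothetical stand-in for a NumPy array: untyped memory plus runtime metadata.
struct RuntimeArray {
    const void* data;
    std::size_t size;
    ScalarKind kind;
};

// Generic kernel: Scalar is a compile-time type inside this function.
template <typename Scalar>
double sum_impl(const RuntimeArray& a) {
    const Scalar* p = static_cast<const Scalar*>(a.data);
    Scalar acc = 0;
    for (std::size_t i = 0; i < a.size; ++i) acc += p[i];
    return static_cast<double>(acc);
}

// Runtime dispatch: one branch per allowed scalar type.
// Here only float and double are allowed; anything else is rejected.
double sum_any(const RuntimeArray& a) {
    assert(a.data != nullptr || a.size == 0);
    switch (a.kind) {
        case ScalarKind::Float32: return sum_impl<float>(a);
        case ScalarKind::Float64: return sum_impl<double>(a);
        default: throw std::invalid_argument("unsupported scalar type");
    }
}
```

Writing such a switch by hand for every function, and for every combination of scalar type and layout, gets tedious quickly, which is presumably why it is left to a code generation step.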

Therefore, with NumpyEigen, you get the best of both worlds: fast Eigen expression templates, as well as overloading and dynamic typing. Furthermore, the npe::move macro lets you automatically return Eigen types to Python with zero copies.

What could be ported to pybind11

pybind11 could have an interface for sparse matrices that is the same as for dense matrices, i.e. best-effort zero-copy binding plus templated types to restrict the inputs. I may do this if I have time.
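To make "zero-copy" concrete for the sparse case: a SciPy csr_matrix stores its contents in three flat arrays (data, indices, indptr), and Eigen can wrap existing compressed arrays through Eigen::Map over a SparseMatrix type. The sketch below avoids both libraries so that it stays self-contained; `CsrView` and `spmv` are hypothetical names for illustration, not a proposed pybind11 interface. It shows a non-owning view over the three arrays, templated on scalar and index type, and a matrix-vector product that reads them in place.

```cpp
#include <cassert>
#include <cstddef>

// Non-owning view over CSR storage. The pointers refer to memory owned
// elsewhere (e.g. by the Python side), so constructing the view copies nothing.
template <typename Scalar, typename Index>
struct CsrView {
    std::size_t rows;
    std::size_t cols;
    const Index* indptr;   // rows + 1 entries: start offset of each row
    const Index* indices;  // nnz entries: column index of each stored value
    const Scalar* data;    // nnz entries: the stored values
};

// y = A * x, reading the CSR arrays directly through the view.
template <typename Scalar, typename Index>
void spmv(const CsrView<Scalar, Index>& A, const Scalar* x, Scalar* y) {
    assert(A.indptr != nullptr);
    for (std::size_t r = 0; r < A.rows; ++r) {
        Scalar acc = 0;
        for (Index k = A.indptr[r]; k < A.indptr[r + 1]; ++k) {
            acc += A.data[k] * x[A.indices[k]];
        }
        y[r] = acc;
    }
}
```

The template parameters play the role of the "templated types to restrict the inputs": a view is only valid when the scalar and index types of the incoming arrays match `Scalar` and `Index` exactly, and any mismatch would require a converting copy first.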

Where is NumpyEigen used?

Right now, NumpyEigen is used primarily for the new Python bindings of libIGL. I have also been using it to write quick wrappers for C++ libraries that I need in Python (e.g. [py-sample-mesh](https://github.com/fwilliams/py-sample-mesh)). It's proven to be a really effective tool for me, even though it's still in beta. If you're interested in using it for your project, I'd be more than happy to help you get started.
