LAIRLAB/libpyarr
C++ interface to numpy arrays, as well as Boost.Python utilities
See INSTALL for installation instructions.

libpyarr makes it embarrassingly easy to:
- write some Python code that uses numpy arrays
- discover that part of it is slow
- rewrite that part in C++
- call the new C++ function, passing in numpy arrays and/or returning them, without excessive copying.

It is sufficiently convenient to do this that users of this library tend to completely cease writing top-level executables in C++.

So, for example, if you would like to pass a numpy array of int32's into a C++ function, you can have a C++ module like the following (see example.cc and example.py for this exact example in working code):

==============================
#include <boost_common.h>

void foo(pyarr<int> myarr) {
    for (int i = 0; i < myarr.dims[0]; i++) {
        for (int j = 0; j < myarr.dims[1]; j++) {
            myarr[ind(i, j)] = 0;
        }
    }
}
==============================

The ind(i,j) syntax works for arrays of up to 4 dimensions; above that you can access the data pointer directly with myarr.data.

To create a new uninitialized numpy array in C++ and return it:

==============================
pyarr<int> bar() {
    long int dims[] = {10, 20, 30};
    pyarr<int> ret(3, dims);
    return ret;
}
==============================

To expose these functions to Python:

==============================
BOOST_PYTHON_MODULE(libpyarr_example) {
    import_array();  /* THESE ARE IMPORTANT */
    boost_common();  /* YOU WILL INEXPLICABLY SEGFAULT WITHOUT THESE */

    def("foo", foo);
    def("bar", bar);
}
==============================

Then in your CMakeLists.txt:

==============================
include(common.cmake)
include(boost_util.cmake)

add_library(pyarr_example example.cc)
set_target_properties(pyarr_example PROPERTIES SUFFIX ".so")
# need the above or else it will be called a .dylib on OS X and python won't import it
target_link_libraries(pyarr_example boost_common ${Boost_LIBRARIES} ${PYTHON_LIBRARIES})
==============================

You can use other build systems if you like; just look in common.cmake and boost_util.cmake to see which directories to include and link against and which libraries to link to.

If you do use cmake, your new library will end up in lib/; you can then add an __init__.py file to your lib/ directory, copy the library somewhere else, or add your lib/ directory to your $PYTHONPATH. Import it with 'import libpyarr_example'. You will see a bunch of harmless warnings; to suppress them, do

==============================
import warnings
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    import libpyarr_example
==============================

IMPORTANT NOTES:

- When you pass a pyarr by value from anywhere to anywhere else, the array data is NOT COPIED; the Python reference count is simply managed in a sensible way by the constructors and destructor. If you would like a copy, you can say mypyarr.copy(). For this reason, it is recommended that you never pass pyarrs by reference or by pointer: there is no benefit to doing so, and you risk messing up the refcounts.

- You cannot create a view into a numpy array (e.g. with myarr[::-1, ...]) and pass it into a C++ function expecting a pyarr. If you have a view that you would like to pass, make a copy of it with .copy() in Python and pass the copy in.

- You have to have the import_array(); call somewhere in EACH separate compilation unit (usually each .cpp or .cc file). If you don't, you will inexplicably segfault. Common practice is to have each .cpp file be its own Python module, but that part isn't required (see the sketch below).
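For concreteness, here is a sketch of what a second compilation unit built as its own Python module might look like; the file name, module name, and function here are made up for illustration and are not part of the library:

==============================
/* second_module.cc -- hypothetical second compilation unit */
#include <boost_common.h>

/* count how many entries of a 2-D int32 array are zero */
int count_zeros(pyarr<int> myarr) {
    int n = 0;
    for (int i = 0; i < myarr.dims[0]; i++)
        for (int j = 0; j < myarr.dims[1]; j++)
            if (myarr[ind(i, j)] == 0) n++;
    return n;
}

BOOST_PYTHON_MODULE(libpyarr_example2) {
    import_array();  /* must be repeated in this compilation unit */
    boost_common();

    def("count_zeros", count_zeros);
}
==============================

Built the same way as the example above (add_library(pyarr_example2 second_module.cc), and so on), it imports as its own module.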
(This import_array() requirement is terrible, and a long time was spent trying to get around it, so if you fix it, please tell me how and/or let me pull from you.)

At this point, you may be wondering, as others have, why not use something like this instead:

https://github.com/personalrobotics/Boost.NumPy

The answer is that, after some investigation, it turns out to be terribly inconvenient to use, and is in fact intended that way. From the Boost.NumPy README:

"This is also not intended to be a high-level C++ array library; it would be more accurate to consider it a C++ NumPy API, with the intent of making the NumPy C-API available in a form that is safer and more convenient for C++ users (and particularly those using Boost.Python, of course)."

And if you look at the second example in

https://github.com/personalrobotics/Boost.NumPy/blob/master/libs/numpy/example/wrap.cpp

you will see that it is basically identical in functionality to the first of the two libpyarr examples above (not counting the CMakeLists.txt), but takes 34 lines to do what libpyarr does in 14. The difference is almost entirely crazy boilerplate, which you have to pay again and again every time you pass a numpy array to a C++ function.

The main point of libpyarr is to eliminate that boilerplate (in significant part by generating it automatically and shoving it into the converter, where you can't see it and never have to write it yourself), so that the overhead of taking a tiny bit of Python code and dropping it into C++ is as low as possible. This makes the code easier to think about even if you end up writing things the same way, but in practice it also means you do things very differently. If you know you are going to have to write a ton of boilerplate in order to speed things up, you are likely to either a) accept the code being slow, or, more likely, b) write far too much in C++ preemptively so as to minimize the overhead, making your code bigger, harder to change, and harder to reason about. Conversely, if you know it is easy, you can blithely write your code however you like in Python and convert small bits to C++ on the fly.

I should also say that, at least in that example, Boost.NumPy appears to be more or less equivalent to (although slightly more obfuscated than) passing a PyObject*, which Boost.Python already allows you to do, and then using the raw numpy C API, which is what I used to do before libpyarr. You can certainly write code that way, but it's not pleasant, and it's not recommended.
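For comparison, here is a rough sketch of that pre-libpyarr style; this is not code from this repository or from Boost.NumPy, just an illustration of the kind of checking-and-casting boilerplate that libpyarr's converter hides. The function receives a generic Python object, digs the PyObject* out of it, and validates it by hand through the raw numpy C API before it can touch the data:

==============================
/* Illustrative only; the module init would still need import_array(). */
#include <boost/python.hpp>
#include <numpy/arrayobject.h>
#include <stdexcept>

void foo_raw(boost::python::object obj) {
    PyObject* p = obj.ptr();
    if (!PyArray_Check(p))
        throw std::runtime_error("expected a numpy array");
    PyArrayObject* arr = (PyArrayObject*)p;
    if (PyArray_TYPE(arr) != NPY_INT32 || PyArray_NDIM(arr) != 2
            || !PyArray_ISCARRAY(arr))
        throw std::runtime_error("expected a 2-D contiguous int32 array");

    int* data = (int*)PyArray_DATA(arr);
    npy_intp rows = PyArray_DIM(arr, 0);
    npy_intp cols = PyArray_DIM(arr, 1);
    for (npy_intp i = 0; i < rows; i++)
        for (npy_intp j = 0; j < cols; j++)
            data[i * cols + j] = 0;   /* same job as foo() above */
}
==============================

Every exported function that touches an array repeats some variant of this; with libpyarr, the pyarr<int> signature is the whole story.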