Compiling two modules with different compilers leads to segfault in class.h #1262

Closed
goldsborough opened this issue Jan 26, 2018 · 13 comments

@goldsborough

goldsborough commented Jan 26, 2018

We have a library that uses pybind11 to wrap its internal C++ code. We now also want to allow external extension modules to be usable with the library. However, we are noticing that when our library is built with one compiler and an extension module with another, there is a segfault within pybind11 upon import.

I am able to reproduce the bug with a small example:

a.cpp (imagine our library)

#include <pybind11/pybind11.h>

namespace py = pybind11;

struct A {
  explicit A(int y) : _y(y) {}
  int f(int x) { return x + _y; }
  int _y;
};

PYBIND11_MODULE(a, m) {
  py::class_<A>(m, "A").def(py::init<int>()).def("f", &A::f);
}

b.cpp (imagine an extension)

#include <pybind11/pybind11.h>

namespace py = pybind11;

struct B {
  explicit B(int y) : _y(y) {}
  int f(int x) { return x + _y; }
  int _y;
};

PYBIND11_MODULE(b, m) {
  py::class_<B>(m, "B").def(py::init<int>()).def("f", &B::f);
}

setup_a.py

from setuptools import setup, Extension
from setuptools.command.build_ext import build_ext

ext_modules = [
    Extension('a', ['a.cpp'], include_dirs=['../include'], language='c++'),
]


class BuildExtension(build_ext):
    """A custom build extension for adding compiler-specific options."""

    def build_extensions(self):
        for extension in self.extensions:
            extension.extra_compile_args = ['-g', '-std=c++11']
        build_ext.build_extensions(self)


setup(
    name='a', ext_modules=ext_modules, cmdclass={
        'build_ext': BuildExtension
    })

setup_b.py

from setuptools import setup, Extension
from setuptools.command.build_ext import build_ext

ext_modules = [
    Extension('b', ['b.cpp'], include_dirs=['../include'], language='c++'),
]


class BuildExtension(build_ext):
    """A custom build extension for adding compiler-specific options."""

    def build_extensions(self):
        for extension in self.extensions:
            extension.extra_compile_args = ['-g', '-std=c++11']
        build_ext.build_extensions(self)


setup(
    name='b',
    ext_modules=ext_modules,
    cmdclass={
        'build_ext': BuildExtension
    })

Then:

  1. CXX=clang++ CC=clang python setup_a.py install
  2. CXX=g++-7 CC=gcc-7 python setup_b.py install

Then:

$ lldb python
(lldb) target create "python"
Current executable set to 'python' (x86_64).
(lldb) run
Process 61507 launched: '/Users/psag/home/play/x/pybind11/env/bin/python' (x86_64)
Python 3.5.1 (default, Jan 24 2016, 13:26:48)
[GCC 4.2.1 Compatible Apple LLVM 7.0.2 (clang-700.1.81)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import a
>>> import b
Process 63580 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x130)
    frame #0: 0x0000000102ac58ea b.cpython-35m-darwin.so`pybind11::detail::make_new_python_type(rec=0x00007fff5fbfdf60) at class.h:564
   561 	    auto metaclass = rec.metaclass.ptr() ? (PyTypeObject *) rec.metaclass.ptr()
   562 	                                         : internals.default_metaclass;
   563
-> 564 	    auto heap_type = (PyHeapTypeObject *) metaclass->tp_alloc(metaclass, 0);
   565 	    if (!heap_type)
   566 	        pybind11_fail(std::string(rec.name) + ": Unable to create type object!");
   567
Target 0: (python) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x5)
  * frame #0: 0x0000000000000005
    frame #1: 0x000000010236f3b2 b.cpython-35m-darwin.so`pybind11::detail::make_new_python_type(rec=0x00007fff5fbfe000) at class.h:564
    frame #2: 0x0000000102375ac6 b.cpython-35m-darwin.so`pybind11::detail::generic_type::initialize(this=0x00007fff5fbfdff8, rec=0x00007fff5fbfe000) at pybind11.h:887
    frame #3: 0x00000001023762e1 b.cpython-35m-darwin.so`::PyInit_b() [inlined] _ZN8pybind116class_I1BJEEC4IJNS_9metaclassEEEENS_6handleEPKcDpRKT_((null)=<unavailable>, name=<unavailable>, scope=handle @ 0x00007fde97dcf910, this=0x00007fff5fbfdff8) at pybind11.h:1065
    frame #4: 0x0000000102376260 b.cpython-35m-darwin.so`::PyInit_b() [inlined] pybind11_init_b(m=<unavailable>)
    frame #5: 0x0000000102376260 b.cpython-35m-darwin.so`::PyInit_b()
    frame #6: 0x000000010014b844 Python`_PyImport_LoadDynamicModuleWithSpec + 489
    frame #7: 0x000000010014b3f6 Python`_imp_create_dynamic + 252
    frame #8: 0x00000001000d19d5 Python`PyCFunction_Call + 273
    frame #9: 0x00000001001359bc Python`PyEval_EvalFrameEx + 24272
    frame #10: 0x00000001001386f0 Python`_PyEval_EvalCodeWithName + 1884
    frame #11: 0x000000010013902f Python`fast_function + 341
    frame #12: 0x0000000100135100 Python`PyEval_EvalFrameEx + 22036
    frame #13: 0x0000000100138faf Python`fast_function + 213
    frame #14: 0x0000000100135100 Python`PyEval_EvalFrameEx + 22036
    frame #15: 0x0000000100138faf Python`fast_function + 213
    frame #16: 0x0000000100135100 Python`PyEval_EvalFrameEx + 22036
    frame #17: 0x0000000100138faf Python`fast_function + 213
    frame #18: 0x0000000100135100 Python`PyEval_EvalFrameEx + 22036
    frame #19: 0x0000000100138faf Python`fast_function + 213
    frame #20: 0x0000000100135100 Python`PyEval_EvalFrameEx + 22036
    frame #21: 0x00000001001386f0 Python`_PyEval_EvalCodeWithName + 1884
    frame #22: 0x000000010012fad7 Python`PyEval_EvalCodeEx + 78
    frame #23: 0x00000001000babb0 Python`function_call + 377
    frame #24: 0x000000010009905e Python`PyObject_Call + 97
    frame #25: 0x00000001000998b7 Python`_PyObject_CallMethodIdObjArgs + 197
    frame #26: 0x000000010014a8b0 Python`PyImport_ImportModuleLevelObject + 1780
    frame #27: 0x000000010012cbc8 Python`builtin___import__ + 135
    frame #28: 0x00000001000d1900 Python`PyCFunction_Call + 60
    frame #29: 0x000000010009905e Python`PyObject_Call + 97
    frame #30: 0x0000000100137f38 Python`PyEval_CallObjectWithKeywords + 165
    frame #31: 0x0000000100133a10 Python`PyEval_EvalFrameEx + 16164
    frame #32: 0x00000001001386f0 Python`_PyEval_EvalCodeWithName + 1884
    frame #33: 0x000000010012fa83 Python`PyEval_EvalCode + 81
    frame #34: 0x0000000100155461 Python`run_mod + 58
    frame #35: 0x000000010015522e Python`PyRun_InteractiveOneObject + 569
    frame #36: 0x0000000100154b88 Python`PyRun_InteractiveLoopFlags + 209
    frame #37: 0x0000000100154a84 Python`PyRun_AnyFileExFlags + 60
    frame #38: 0x0000000100168d72 Python`Py_Main + 3430
    frame #39: 0x0000000100001e27 python`___lldb_unnamed_symbol1$$python + 224
    frame #40: 0x00007fffa5a38235 libdyld.dylib`start + 1
    frame #41: 0x00007fffa5a38235 libdyld.dylib`start + 1

It seems the metaclass variable is nullptr in this case. This can be confirmed by putting an assertion into that location in class.h.

This is on macOS Sierra, but we see the same on Linux. We also observe this for certain combinations of different GCC versions. In the example above I use Python 3.5, but the same is observable for Python 3.6 and Python 2.7.

@JoelStienlet

In general you want everything to be built with the same compiler, and the same version of that compiler; see:
https://stackoverflow.com/questions/23895081/can-you-mix-c-compiled-with-different-versions-of-the-same-compiler

@wjakob
Member

wjakob commented Jan 26, 2018

STL internal data structures are not guaranteed to be compatible across compiler major versions, and definitely not across entirely different compilers. Pybind11 uses STL data structures to organize its internal state, hence it is important that extension modules are also compiled with the same compiler (otherwise, all sorts of corruption can occur).
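
To make the point about incompatible internal data structures concrete, here is a minimal sketch that records how large a few standard containers are for whichever compiler and standard library build it. The `StlLayout` struct and both helper functions are hypothetical names made up for this illustration; they are not part of pybind11. Matching sizes are only a necessary condition for two builds to share an object, never a guarantee.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper type (not part of pybind11): the in-memory sizes of a
// few standard containers as seen by the toolchain that compiled this file.
struct StlLayout {
    std::size_t string_size;
    std::size_t vector_size;
    std::size_t unordered_map_size;
};

// Report the sizes for the current compiler + standard library.
inline StlLayout stl_layout() {
    return StlLayout{
        sizeof(std::string),
        sizeof(std::vector<int>),
        sizeof(std::unordered_map<int, int>),
    };
}

// Two modules can only pass these containers to each other by pointer or
// reference if they agree on the layout. Equal sizes are necessary but not
// sufficient: the member order and meaning may still differ.
inline bool same_layout(const StlLayout &a, const StlLayout &b) {
    return a.string_size == b.string_size &&
           a.vector_size == b.vector_size &&
           a.unordered_map_size == b.unordered_map_size;
}
```

Building this once with each toolchain and comparing the values returned by `stl_layout()` is a quick way to see whether two builds even agree on object sizes.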

@wjakob wjakob closed this as completed Jan 26, 2018
@zdevito
Contributor

zdevito commented Jan 26, 2018

How do you recommend shipping binary Python extension modules that use pybind11 if you don't know what other extensions a user might have installed that may also use pybind11 and were built with another compiler version?

Is there a way to completely isolate the pybind11 state across these extensions?

@jagerman
Member

The compiler version isn't quite as critical as the STL (and its version). STL versions are usually, but not always, backwards-compatible with previous versions of the same STL. For instance, you're usually fine mixing modules built with gcc-5/gcc-6/gcc-7/gcc-8/clang-* on Linux, since they all use gcc's libstdc++. Mixing any of those with clang using libc++ (which is the default when using clang under macOS, but not on Linux) is asking for trouble. Very rarely the STL breaks backwards compatibility. IIRC, the last time for libstdc++ was when gcc 5 came out (and was related to C++11 compatibility), so crossing the pre-5 and post-5 gcc boundary is likely another no-no.

What you're getting when trying to load one .so built with g++/libstdc++ and another built with clang++/libc++ at the same time in the same binary is just something that can't work in any C++ code making use of the STL.

The only way around it is really to isolate the software: keep all your g++/libstdc++-compiled code separate from your clang++/libc++-compiled code. And while that is a nuisance, it's not something that pybind can realistically do anything about.
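
As a rough way to check which side of that divide a given build is on, the sketch below reports the standard library a translation unit was compiled against. It relies on `_LIBCPP_VERSION` (defined by libc++ headers), `__GLIBCXX__` (defined by libstdc++ headers) and `_GLIBCXX_USE_CXX11_ABI` (the libstdc++ switch introduced with gcc 5). The function names are hypothetical and chosen only for this illustration.

```cpp
#include <string>  // any standard header pulls in the library's config macros

// Hypothetical helper: name of the C++ standard library this file was
// compiled against, based on the library's own identification macros.
inline std::string stdlib_name() {
#if defined(_LIBCPP_VERSION)
    return "libc++";
#elif defined(__GLIBCXX__)
    return "libstdc++";
#else
    return "unknown";
#endif
}

// Hypothetical helper: 1 if libstdc++ uses its C++11-conforming ABI
// (the default since gcc 5), 0 if it uses the old ABI, -1 if the macro is
// not defined (for example when building against libc++).
inline int libstdcxx_cxx11_abi() {
#if defined(_GLIBCXX_USE_CXX11_ABI)
    return _GLIBCXX_USE_CXX11_ABI;
#else
    return -1;
#endif
}
```

Printing both values from each extension's build makes it easier to spot a libc++/libstdc++ mix, or a module built on the other side of the gcc 5 boundary, before the modules end up in the same interpreter.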

@wjakob
Member

wjakob commented Sep 19, 2019

Dear all,

I've realized that this has become a bit of a painful problem, particularly when installing external packages where one may not have control over what compiler is being used.

The following commit, currently on master, namespaces pybind11's internal data structures based on the value of the __GXX_ABI_VERSION macro, if present.

bdf1a2c

My hope is that this should avoid this kind of breakage in the future. For those of you who are affected, could you let me know if this addresses the problem? My plan then would be to push this into a patch release of pybind11.

Best,
Wenzel
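
The idea of namespacing shared state by ABI version can be sketched roughly as follows. This is not the code from the commit: every `SKETCH_*` macro, the `Internals` struct, and the `Registry` map (a stand-in for wherever the shared state is looked up at import time) are assumptions made for illustration. The only detail taken from the comment above is that the key depends on `__GXX_ABI_VERSION` when that macro is defined.

```cpp
#include <map>
#include <string>

// Two-step stringification so that the macro's value, not its name, is used.
#define SKETCH_STRINGIFY_IMPL(x) #x
#define SKETCH_STRINGIFY(x) SKETCH_STRINGIFY_IMPL(x)

// Hypothetical ABI tag: empty when the compiler does not define the macro.
#if defined(__GXX_ABI_VERSION)
#  define SKETCH_ABI_TAG "_cxxabi" SKETCH_STRINGIFY(__GXX_ABI_VERSION)
#else
#  define SKETCH_ABI_TAG ""
#endif

// Key under which a module built by *this* compiler looks up shared state.
#define SKETCH_INTERNALS_ID "__sketch_internals" SKETCH_ABI_TAG "__"

// Same key construction at run time, so other tags can be simulated.
inline std::string internals_key(const std::string &abi_tag) {
    return "__sketch_internals" + abi_tag + "__";
}

// Stand-in for the per-ABI shared state.
struct Internals {
    int registered_types = 0;
};

// Stand-in for the process-wide place where the shared state is stored.
using Registry = std::map<std::string, Internals>;

// Modules with the same key share one Internals object; modules with a
// different key get a separate one and never touch each other's containers.
inline Internals &get_internals(Registry &registry, const std::string &key) {
    return registry[key];
}
```

With this scheme, two modules whose tags match still share a single `Internals`, while a module built with a different ABI version ends up with its own isolated copy instead of reinterpreting memory laid out by another standard library.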

@wjakob wjakob reopened this Sep 19, 2019
@snnn
Contributor

snnn commented Sep 19, 2019

I'll try it today. Thanks for your help

@snnn
Contributor

snnn commented Sep 19, 2019

It solved my problem. Thanks!

@snnn
Contributor

snnn commented Sep 19, 2019

Hi @wjakob , do you have a schedule for the patch release?

@wjakob
Member

wjakob commented Sep 19, 2019

I've added another commit that provides an even stricter separation: c9f5a46

@wjakob
Member

wjakob commented Sep 19, 2019

Released in v2.4.0 now :)

@wjakob
Member

wjakob commented Sep 19, 2019

(not a patch release after all, because there are also some minor new features)

guillaumekln added a commit to OpenNMT/CTranslate2 that referenced this issue Sep 23, 2019
The open source package pyonmttok was recently updated to use
pybind11. However, compiling 2 pybind11 packages with different
toolchains caused a segmentation fault when used in the same
project. See for example
pybind/pybind11#1262

For now, we revert this internal package back to Boost.Python.
rgommers added a commit to rgommers/scipy-wheels that referenced this issue Dec 17, 2019
See scipy/scipy#11237 and
pybind/pybind11#1262 for details.

tl;dr pybind11 is the first version where its symbol names are prefixed
with the GCC (or Clang) version it was compiled with.
tylerjereddy pushed a commit to tylerjereddy/scipy-wheels that referenced this issue Dec 17, 2019
rgommers pushed a commit to MacPython/scipy-wheels that referenced this issue Dec 17, 2019
@lyskov
Contributor

lyskov commented Feb 14, 2020

Thank you for fixing this @wjakob ! I can confirm that this patch worked for us as well.

@bstaletic
Collaborator

This seems resolved. If more stuff needs to be done in this regard, please open a new issue.
