[BUG]: invalid shared ptr conversion leads to crash? #4365
Comments
This is a known bug with multiple virtual bases unfortunately. See #3514
@Skylion007 thanks for the pointer, I left a comment there.
That's a great analysis that matches what I found some time ago: I can only look more closely later. The smart_holder code does not use that
Then recompile and run your reproducer again. Does it work?
@rwgk thanks for the reply! I checked out the

```cpp
#include <iostream>
#include <memory>
#include <vector>

#include <pybind11/pybind11.h>
#include <pybind11/smart_holder.h>

struct Base0 {
    virtual ~Base0() {}
};
using Base0Ptr = std::shared_ptr<Base0>;

struct Base1 {
    virtual ~Base1() {}
    std::vector<int> vec = {1, 2, 3, 4, 5};
};
using Base1Ptr = std::shared_ptr<Base1>;

struct Derived : Base1, Base0 {
    virtual ~Derived() {}
};
using DerivedPtr = std::shared_ptr<Derived>;

PYBIND11_SMART_HOLDER_TYPE_CASTERS(Base0)
PYBIND11_SMART_HOLDER_TYPE_CASTERS(Base1)
PYBIND11_SMART_HOLDER_TYPE_CASTERS(Derived)

PYBIND11_MODULE(example, m) {
    pybind11::classh<Base0> bs0(m, "Base0");
    pybind11::classh<Base1> bs1(m, "Base1");
    pybind11::classh<Derived, Base0, Base1>(m, "Derived").def(pybind11::init<>());

    m.def("make_object", [](int value) -> Base0Ptr {
        auto ret_der = std::make_shared<Derived>();
        std::cout << "ret der ptr: " << ret_der.get() << std::endl;
        auto ret = Base0Ptr(ret_der);
        std::cout << "ret base ptr: " << ret.get() << std::endl;
        return ret;
    });

    m.def("print_object", [](const DerivedPtr &object) {
        std::cout << "der ptr: " << object.get() << std::endl;
        std::cout << object->vec.size() << std::endl;
    });
}
```

Unfortunately, the behaviour has not changed and I still see the same problem. Am I doing something wrong?
I need to find a block of time to look at this carefully. Your reproducer looks valid at first sight. It would give me a head start if you could transfer it to a pull request:
Then `git push`, follow the link shown by `git push` to create the PR. That will trigger the CI.
Thanks for pointing out @Skylion007, this completely fell off my radar. I forgot what I was thinking when I was looking back then, maybe it was that the reproducer is not easy to understand. What's now under #4347 looks like an easier starting point for debugging.
Required prerequisites
What version (or hash if on master) of pybind11 are you using?
2.10.1
Problem description
We are experiencing a crash in the exposition of a C++ class hierarchy which seems to be related to conversions between `std::shared_ptr`s within the class hierarchy. The attached minimal code is enough to reproduce the issue. Let me give a brief description.

On the C++ side, we have two base classes, `Base0` and `Base1`, and a derived class `Derived` which inherits from the two bases. All classes are exposed via pybind11 using `std::shared_ptr` holder types, and the exposition of `Derived` signals that `Base0` and `Base1` are its bases.

Then, we expose a factory function `make_object()` which internally creates a shared ptr to `Derived`, converts it to `std::shared_ptr< Base0 >`, and then returns it. In the implementation of `make_object()`, we print to screen the memory addresses of the base and derived objects (more on that later).

Finally, we expose a `print_object()` function that takes as input a shared pointer to `Derived` and tries to print to screen the size of its vector data member. It also prints to screen the pointer contained in the input shared pointer.

Now for the Python code:
So there's something obviously going wrong here, since:

- `23501544953261` elements, and
- `0x557ad38c7a50` is a pointer to `Base0` in `make_object()`, but it has magically been converted into a pointer to `Derived` in `print_object(o)`, even though casting between `Base0` and `Derived` should change the memory address in this hierarchy (as evidenced in the pointer values printed from `make_object()`).

In order to confirm that something fishy is going on, we can recompile the module with the address sanitizer on:
Then we need to take care of invoking Python with the address sanitizer preloaded:

(Note: this is from a conda installation, the `libasan.so` path should be adjusted as needed)

We can then rerun the code:
Thus it definitely looks like a wrong memory access.
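The address shift mentioned above can be checked without pybind11 at all. The following is a minimal standalone sketch that reuses the class layout from the reproducer; the helper names `base0_offset` and `downcast_round_trips` are made up for illustration and are not part of pybind11 or of the original report. The exact offset depends on the ABI, so the sketch only relies on it being non-zero, which is what common compilers produce when `Base1` is listed first.

```cpp
#include <cstddef>
#include <vector>

struct Base0 {
    virtual ~Base0() {}
};

struct Base1 {
    virtual ~Base1() {}
    std::vector<int> vec = {1, 2, 3, 4, 5};
};

// Same base order as the reproducer: Base1 comes first, so the Base0
// subobject typically lives at a non-zero offset inside Derived.
struct Derived : Base1, Base0 {
    virtual ~Derived() {}
};

// Hypothetical helper: byte offset of the Base0 subobject inside a Derived.
std::ptrdiff_t base0_offset() {
    Derived d;
    Derived *as_derived = &d;
    Base0 *as_base0 = as_derived;  // implicit upcast adjusts the address
    return reinterpret_cast<const char *>(as_base0)
           - reinterpret_cast<const char *>(as_derived);
}

// Hypothetical helper: a checked downcast undoes the adjustment and
// yields the original Derived address again.
bool downcast_round_trips() {
    Derived d;
    Base0 *as_base0 = &d;
    return dynamic_cast<Derived *>(as_base0) == &d;
}
```

Calling `base0_offset()` from a small `main` and printing the result should show a non-zero value on typical toolchains, which is consistent with the two different addresses printed by `make_object()`.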
I have tried to delve into the pybind11 source with a debugger to see what might be going on. I can't say I have fully understood the logic, but it seems to me that:

- `print_object()` is of type `Derived` (I think this happens here: https://github.com/pybind/pybind11/blob/master/include/pybind11/detail/type_caster_base.h#L693), and thus
- `reinterpret_cast` a `shared_ptr<Base0>` to `shared_ptr<Derived>`, which would be undefined behaviour in any case but works in practice if the pointers to `Base0` and `Derived` coincide (which, in this case, they do NOT).

Please let me know if I can assist with more testing/triaging.
Reproducible example code
Is this a regression? Put the last known working version here if it is.
Not a regression