Added test case for visibility of common symbols across shared libraries #5700

Open: petersteneteg wants to merge 12 commits into base: master

@petersteneteg petersteneteg commented May 27, 2025

Description

See issue #5696

Suggested changelog entry:

  • Placeholder.

📚 Documentation preview 📚: https://pybind11--5700.org.readthedocs.build/

@petersteneteg petersteneteg requested a review from henryiii as a code owner May 27, 2025 13:18
@henryiii
Collaborator

henryiii commented May 27, 2025

Should this be passing? Maybe you need to add it to a few of the ci jobs?

if("${PYTHON_MODULE_EXTENSION}" MATCHES "pypy"
OR "${Python_INTERPRETER_ID}" STREQUAL "PyPy"
OR "${PYTHON_MODULE_EXTENSION}" MATCHES "graalpy")
message(STATUS "Skipping embed test on PyPy or GraalPy")
Collaborator

Suggested change
- message(STATUS "Skipping embed test on PyPy or GraalPy")
+ message(STATUS "Skipping visibility test on PyPy or GraalPy")

@petersteneteg
Author

Hi, yes, it should not pass at the moment. I assumed that everything "check" depends on would run automatically.
But I see now that you add those individually to the CI? I guess it should be added basically everywhere pptest or test_cmake_build appears in ci.yml?

@henryiii
Collaborator

Perfect, now can you add the fix?

@rwgk
Collaborator

rwgk commented May 27, 2025

It only fails under macOS, interesting.

I'm still pretty worried about exporting on all platforms, but just exporting under macOS seems fine.

@henryiii
Collaborator

If it needs exporting, I'd export it on all platforms. The settings might be different, but if a symbol is needed, I think it is best to ensure it is always available.

@rwgk
Collaborator

rwgk commented May 27, 2025

If it needs exporting, I'd export it on all platforms. The settings might be different, but if a symbol is needed, I think it is best to ensure it is always available.

I'd be very uncomfortable doing that on all platforms, in particular on Windows, without fully understanding what exactly happens if the modules are not ABI compatible. — Developing that full understanding is probably hard (could take days of effort). Linux, macOS, Windows are all different in both fundamental and subtle ways.

To the best of my understanding: Exporting symbols is akin to piercing a hole into the ABI isolation layers.

@henryiii
Collaborator

henryiii commented May 27, 2025

Could you try setting CMAKE_CXX_VISIBILITY_PRESET to hidden (or CXX_VISIBILITY_PRESET on the target)? We set it automatically for extensions, but I bet this test is missing that. I have a feeling everything is visible by default on the other platforms.

(Also, while you are at it, feel free to replace "ON" with "hidden" in tools/pybind11Config.cmake.in:97 - you don't have to, but that was a mistake in the docs!)

@petersteneteg
Author

@rwgk I was also surprised that it didn't fail with the other clang builds.

After some digging in libc++, it seems there are three different implementations of type_info equality.

https://github.com/llvm/llvm-project/blob/a9b64bb3180dab6d28bf800a641f9a9ad54d2c0c/libcxx/include/typeinfo#L271-L276

The same file explains the differences
https://github.com/llvm/llvm-project/blob/a9b64bb3180dab6d28bf800a641f9a9ad54d2c0c/libcxx/include/typeinfo#L120-L187

Apple arm64 uses the third option. And of note here is:

types declared with hidden visibility are always considered to have
a unique RTTI: the RTTI is emitted with linkonce_odr linkage and is assumed
to be deduplicated by the linker within the linked image. Across linked image
boundaries, such types are thus considered different types.

Hence guarded_delete will get a unique, different identity in each module.

Which aligns with what we have seen here.
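
To make the consequence concrete, here is a minimal single-program sketch of why type identity matters for `std::get_deleter`. The names `deleter_a`, `deleter_b`, `has_deleter`, and `make_with_deleter_a` are hypothetical and not part of pybind11. Within one binary, two distinct types are needed to provoke the mismatch; in the situation described above, the same `guarded_delete` type would presumably be treated like two types because each linked image carries its own hidden RTTI.

```cpp
#include <cassert>
#include <memory>

namespace example {

// Hypothetical stand-ins for "the deleter as seen by module A / module B".
struct deleter_a {
    void operator()(int *p) const { delete p; }
};
struct deleter_b {
    void operator()(int *p) const { delete p; }
};

// True if the shared_ptr's stored deleter is recognized as type D.
// std::get_deleter returns nullptr when the stored deleter type does not match D.
template <typename D>
bool has_deleter(const std::shared_ptr<int> &sp) {
    return std::get_deleter<D>(sp) != nullptr;
}

inline std::shared_ptr<int> make_with_deleter_a() {
    return std::shared_ptr<int>(new int(42), deleter_a{});
}

inline void demo_deleter_lookup() {
    auto sp = make_with_deleter_a();
    assert(has_deleter<deleter_a>(sp));  // matching type: deleter is found
    assert(!has_deleter<deleter_b>(sp)); // non-matching type: lookup yields nullptr
}

} // namespace example
```

A caller that relies on the lookup succeeding (as the smart-holder code does with `guarded_delete`) would see the second outcome if the type identities disagree.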

@petersteneteg
Author

@rwgk I think you might be overly careful here. We only need to export guarded_delete, and as long as that class stays ABI compatible I don't think that in itself would be a problem.

What could perhaps be a problem is mixing a version that exports guarded_delete with one that does not.
That makes the case for adding such a fix in 3.0 rather than later...

@henryiii
About the fix, I noticed that if I include "common.h" where PYBIND11_EXPORT is defined into "struct_smart_holder.h" I run into include issues in "wrap_include_python_h.h", complaining that "Python.h" file not found.
Do you know of an easy fix for this?

@henryiii
Collaborator

Maybe we should pull out the macro into a new detail/macros.h or similar? Just the parts that are not dependent on Python. That's what pybind11_namespace_macros.h was, probably.
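
As a rough illustration of that idea, the sketch below defines an export macro using only compiler and platform checks, so it needs no `Python.h`. `EXAMPLE_EXPORT`, `exported_deleter`, and `make_exported` are hypothetical names chosen for this sketch; the macro body only mirrors the two `PYBIND11_EXPORT` definitions visible in `common.h`, and the real header layout may differ.

```cpp
#include <cassert>
#include <memory>

// Hypothetical, Python-independent export macro (assumption: a GCC/Clang-style
// attribute is available on non-Windows platforms).
#if !defined(EXAMPLE_EXPORT)
#    if defined(WIN32) || defined(_WIN32)
#        define EXAMPLE_EXPORT __declspec(dllexport)
#    else
#        define EXAMPLE_EXPORT __attribute__((visibility("default")))
#    endif
#endif

namespace example {

// A class type given default visibility, so its RTTI is not hidden
// even when the translation unit is compiled with -fvisibility=hidden.
struct EXAMPLE_EXPORT exported_deleter {
    void operator()(int *p) const { delete p; }
};

inline std::shared_ptr<int> make_exported() {
    return std::shared_ptr<int>(new int(1), exported_deleter{});
}

inline void demo_exported() {
    auto sp = make_exported();
    assert(std::get_deleter<exported_deleter>(sp) != nullptr);
}

} // namespace example
```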

@rwgk
Collaborator

rwgk commented May 27, 2025

Up to you guys, but I've generally been faring best with: If it's not broken (Linux, Windows), don't "fix" it. — You're definitely playing with ABI compatibility fire. Will it burn someone or not, idk.

@henryiii
Collaborator

AFAICT, this is only "working" in other places due to an implementation detail. If an implementation of type_info decides to care (and it can), then this breaks unless it's exported.

@rwgk
Collaborator

rwgk commented May 27, 2025

This is what we have:

$ git grep PYBIND11_EXPORT include/
include/pybind11/detail/common.h:#if !defined(PYBIND11_EXPORT)
include/pybind11/detail/common.h:#        define PYBIND11_EXPORT __declspec(dllexport)
include/pybind11/detail/common.h:#        define PYBIND11_EXPORT __attribute__((visibility("default")))
include/pybind11/detail/common.h:#if !defined(PYBIND11_EXPORT_EXCEPTION)
include/pybind11/detail/common.h:#        define PYBIND11_EXPORT_EXCEPTION PYBIND11_EXPORT
include/pybind11/detail/common.h:#        define PYBIND11_EXPORT_EXCEPTION
include/pybind11/detail/common.h:    extern "C" PYBIND11_MAYBE_UNUSED PYBIND11_EXPORT PyObject *PyInit_##name();
include/pybind11/detail/common.h:    extern "C" PYBIND11_EXPORT PyObject *PyInit_##name()
include/pybind11/detail/common.h:class PYBIND11_EXPORT_EXCEPTION builtin_exception : public std::runtime_error {
include/pybind11/detail/common.h:    class PYBIND11_EXPORT_EXCEPTION name : public builtin_exception {                             \
include/pybind11/pytypes.h:class PYBIND11_EXPORT_EXCEPTION error_already_set : public std::exception {

Note:

  • Currently PYBIND11_EXPORT is only used for PyInit_##name().

  • There is no existing PYBIND11_EXPORT for a struct or class.

We only have this for struct or class:

// For libc++, the exceptions should be exported,
// otherwise, the exception translation would be incorrect.
// IMPORTANT: This code block must stay BELOW the #include <exception> above (see PR #5390).
#if !defined(PYBIND11_EXPORT_EXCEPTION)
#    if defined(_LIBCPP_EXCEPTION)
#        define PYBIND11_EXPORT_EXCEPTION PYBIND11_EXPORT
#    else
#        define PYBIND11_EXPORT_EXCEPTION
#    endif
#endif

It seems to be exactly what's needed to fix the failing tests.

To not add a new twist, I recommend sticking to that approach.

Please give me a moment for trying out a prototype fix here.

@@ -58,6 +58,18 @@ High-level aspects:
#include <typeinfo>
#include <utility>

// IMPORTANT: This code block must stay BELOW the #include <stdexcept> above.
#if !defined(PYBIND11_EXPORT_GUARDED_DELETE)
# if defined(_LIBCPP_EXCEPTION)
Collaborator

What does this have to do with exceptions? Copy-paste error?

Collaborator

No, it's intentional at the moment, for the purpose of experimenting, mimicking this existing logic:

// For libc++, the exceptions should be exported,
// otherwise, the exception translation would be incorrect.
// IMPORTANT: This code block must stay BELOW the #include <exception> above (see PR #5390).
#if !defined(PYBIND11_EXPORT_EXCEPTION)
#    if defined(_LIBCPP_EXCEPTION)
#        define PYBIND11_EXPORT_EXCEPTION PYBIND11_EXPORT
#    else
#        define PYBIND11_EXPORT_EXCEPTION
#    endif
#endif

Collaborator

Tests pass.

I believe this is the right direction: conservative / in keeping with existing code.

Now, ideally we'd have a centralized decision that applies to both PYBIND11_EXPORT_EXCEPTION and PYBIND11_EXPORT_GUARDED_DELETE. I don't know what I'd call the central macro though, in large part because I only have very sketchy ideas what's going on under the hood. Something along the lines of: libc++ doesn't handle cross-dynamic-library type_info very well?

It's also unfortunate that we need to #include <exception> or #include <stdexcept> in order to make that decision. Which we shouldn't do before #include <pybind11/conduit/wrap_include_python_h.h> ... UGH.

Collaborator

@henryiii henryiii May 28, 2025

IMO, it's still an implementation detail - libc++ has three implementations; one of them requires this exported, so I think the correct choice is to export it, rather than relying on this implementation detail. What happens if libstdc++ decides to switch to a similar implementation? What happens if you use libc++ on Windows?

Tests pass because this is exactly the same as before: the condition is always true when we compile against libc++, and we do not support disabling exceptions in pybind11. We don't have a Windows libc++ build. All our macOS builds use libc++, and (I'm pretty sure) all our Linux builds use libstdc++.

We can't include anything from the stdlib before Python.h, that is completely unsupported by CPython. But this isn't changing the includes, though?

And _LIBCPP_VERSION would be the standard way to detect libc++, I think.
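
For reference, a minimal sketch of detecting the standard library by its own version macro instead of an exception-header include guard. It assumes the commonly documented macros `_LIBCPP_VERSION` (libc++), `__GLIBCXX__` (libstdc++), and `_MSVC_STL_VERSION` (MSVC STL); the function names `standard_library_name` and `demo_detection` are hypothetical. Some standard header has to be included first so the library's configuration macros are defined.

```cpp
#include <cassert>
#include <cstddef> // any standard header pulls in the library's configuration macros
#include <string>

namespace example {

// Reports which C++ standard library this translation unit was compiled against.
inline std::string standard_library_name() {
#if defined(_LIBCPP_VERSION)
    return "libc++";
#elif defined(__GLIBCXX__)
    return "libstdc++";
#elif defined(_MSVC_STL_VERSION)
    return "MSVC STL";
#else
    return "unknown";
#endif
}

inline void demo_detection() {
    // Every branch above returns a non-empty name.
    assert(!standard_library_name().empty());
}

} // namespace example
```

Note that this only identifies the library; it does not reveal which of the `type_info` comparison strategies a given libc++ build selects.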

@henryiii
Collaborator

henryiii commented May 27, 2025

test_run_in_process_multiple_threads_parallel[test_cross_module_gil_acquired] great, looks like I was too careful in selecting GIL tests to disable on free-threaded. Edit: no, it's just not triggering. Maybe pytest enables the GIL in collection?

@rwgk
Collaborator

rwgk commented May 27, 2025

test_run_in_process_multiple_threads_parallel[test_cross_module_gil_acquired] great, looks like I was too careful in selecting GIL tests to disable on free-threaded. Edit: no, it's just not triggering. Maybe pytest enables the GIL in collection?

See here:

https://chatgpt.com/share/683636ac-48dc-8008-9cf4-0860d29a4f44

Be careful though, I've seen ChatGPT go wrong even when it referenced source code: in that case it misunderstood the source code. But of course, possibly it is correct here?

@henryiii
Collaborator

That's what it seems like. I'm surprised pytest would enable the GIL to collect tests, but it's reporting as enabled during collection (not during the actual test runs).

@henryiii henryiii requested a review from Copilot May 28, 2025 05:17
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR introduces a new test suite to verify that shared libraries export common symbols correctly, ensuring shared_ptr deleters match across modules.

  • Adds a standalone visibility test under tests/test_visibility with CMake integration and CI hooks
  • Exports the guarded_delete struct in struct_smart_holder.h to give it default visibility
  • Updates CI workflows to invoke the new test_visibility target on all build configurations

Reviewed Changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.

Summary per file (file: description):

  • tests/test_visibility/test_visibility.cpp: New Catch-based tests for simple vs. alias-backed Python bindings
  • tests/test_visibility/lib.h: Header defining lib::Base and lib::Foo with export macros
  • tests/test_visibility/lib.cpp: Implementation of Base and Foo
  • tests/test_visibility/bindings.cpp: Pybind11 bindings (typo in class registration)
  • tests/test_visibility/catch.cpp: Custom Catch runner with embedded Python interpreter
  • tests/test_visibility/CMakeLists.txt: CMake rules to build and link the visibility test suite
  • tests/CMakeLists.txt: Added test_visibility subdirectory to main test tree
  • include/pybind11/detail/struct_smart_holder.h: Added PYBIND11_EXPORT_GUARDED_DELETE to guarded_delete for visibility
  • .github/workflows/ci.yml: Added test_visibility invocations across all CI job steps

Comments suppressed due to low confidence (3)

tests/test_visibility/test_visibility.cpp:27

  • [nitpick] The variable name holder is ambiguous—consider renaming it to foo_holder or foo_instance for clarity.
auto holder = bindings.attr("get_foo")(1, 2);

tests/test_visibility/test_visibility.cpp:40

  • [nitpick] The variable name holder2 is ambiguous—consider renaming it to bar_holder or bar_instance for clarity.
auto holder2 = main.attr("get_bar")(1, 2);

tests/test_visibility/bindings.cpp:12

  • Typo in class registration: pybind11::classh should be pybind11::class_ to compile correctly.
pybind11::classh<lib::Base, BaseTrampoline>(m, "Base")

@petersteneteg
Author

@rwgk about

libc++ doesn't handle cross-dynamic-library type_info very well?
From what I have learned, the C++ standard does not mandate anything when it comes to shared libraries. According to the C++ abstract machine there are no shared libraries. A fully static build is probably closest to what the abstract machine standardizes. How to deal with shared libraries (if at all) is defined by the vendor. For a long time that meant mostly the Itanium ABI and the Windows ABI, which made it feel almost like a standard. Now, with various Arm implementations etc., there are many more alternatives to consider.
But the point is that how libc++ handles cross-dynamic-library type_info is to a very large degree up to them, as are the semantics of the "export" pragmas. There is not much we can do about that; we just have to follow what they decide.

I think that from a conceptual point of view any class that is part of the public interface should generally be exported.
If you want to ensure ABI compatibility, you should not have any such classes in the public API.
All the template classes are basically "exported" automatically, but in most cases that will probably be fine, since they are instantiated for types that are unique to each module. But if one were to instantiate them for a std type, for example, we would have the exact same issue.

Considering what is standard practice for exporting symbols, it seems a bit excessive to use PYBIND11_EXPORT_GUARDED_DELETE and PYBIND11_EXPORT_EXCEPTION when PYBIND11_EXPORT would be fine and in line with what is generally used.

When reading PYBIND11_EXPORT_GUARDED_DELETE and PYBIND11_EXPORT_EXCEPTION, I assume that there is something more complicated going on than just a regular PYBIND11_EXPORT, which I find misleading and confusing.

// IMPORTANT: This code block must stay BELOW the #include <stdexcept> above.
#if !defined(PYBIND11_EXPORT_GUARDED_DELETE)
# if defined(_LIBCPP_EXCEPTION)
# if defined(WIN32) || defined(_WIN32)
Author

What if one uses libc++ on windows?

@rwgk
Collaborator

rwgk commented May 28, 2025

Here is a long chat specific to this PR:

TL;DR: I believe the "right thing" here is to move these into pybind11::detail::internals:

include/pybind11/detail/struct_smart_holder.h:        auto *vptr_del_ptr = std::get_deleter<guarded_delete>(vptr);
include/pybind11/detail/struct_smart_holder.h:        const auto *gd = std::get_deleter<guarded_delete>(vptr);
include/pybind11/detail/type_caster_base.h:        auto *vptr_gd_ptr = std::get_deleter<memory::guarded_delete>(holder().vptr);
include/pybind11/detail/type_caster_base.h:            auto *vptr_gd_ptr = std::get_deleter<memory::guarded_delete>(hld.vptr);
include/pybind11/detail/type_caster_base.h:            auto *sptsls_ptr = std::get_deleter<shared_ptr_trampoline_self_life_support>(hld.vptr);

@petersteneteg
Author

petersteneteg commented May 28, 2025

I mean, if this were not a header-only library (why is it header-only?), I would do all of that in one TU and it would be fine.

Also I don't think that guarded_delete is really abi safe without the export either. Unless it has the exact same definition in all TUs, it will break the One Definition Rule anyway. To deal with that you would need to version the symbol, for example by putting it in a v1 namespace and updating that on every change to the class.

Edit: My thinking about guarded_delete here was with respect to the C++ abstract machine spec. But that is not very relevant here, since the semantics are all vendor-defined and not specified in the C++ standard.
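
The versioning idea mentioned above could look roughly like the following sketch, which uses an inline namespace so callers keep spelling the unversioned name while the symbol itself carries the version. `guarded_delete_like`, `armed_flag`, and the `v1` namespace are hypothetical illustrations and do not reflect how pybind11 lays out its types.

```cpp
#include <cassert>
#include <type_traits>

namespace example {
// Bump to v2, v3, ... whenever the layout or semantics of the class change,
// so incompatible definitions never share one symbol name.
inline namespace v1 {
struct guarded_delete_like {
    bool armed_flag = true;
};
} // namespace v1
} // namespace example

// The unversioned spelling and the versioned spelling name the same type.
static_assert(
    std::is_same<example::guarded_delete_like, example::v1::guarded_delete_like>::value,
    "inline namespace makes the versioned type reachable by its short name");

namespace example {
inline bool default_is_armed() { return guarded_delete_like{}.armed_flag; }

inline void demo_versioned() { assert(default_is_armed()); }
} // namespace example
```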

@henryiii
Collaborator

It is header-only for simplicity of building. Today I'd much rather have an optional pre-compile step (similar to CLI11 and Catch2 these days); nanobind has a required one and people haven't had issues with it AFAIK. Someone almost contributed an option like that, but a lot of other development was going on at the same time and they left their job before it was ready to put in.

@rwgk
Collaborator

rwgk commented May 28, 2025

Also I don't think that guarded_delete is really abi safe without the export either.

@petersteneteg — I believe this statement should be revised or removed, as it gives a misleading impression.

pybind11 deliberately uses multiple layers of ABI isolation — including namespace visibility attributes, -fvisibility=hidden, and RTLD_LOCAL — to ensure that internals remain confined to each extension unless explicitly shared. These mechanisms work together across platforms to prevent unintended symbol or RTTI sharing.
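
As an illustration of the first layer only, here is a minimal sketch of a namespace carrying a visibility attribute. `EXAMPLE_NS_HIDDEN`, `example_hidden`, `internal_state`, and `read_value` are hypothetical names; the attribute syntax is the GCC/Clang one and is compiled out elsewhere. The other layers (`-fvisibility=hidden` and `RTLD_LOCAL`) are build and loader settings and are not shown.

```cpp
#include <cassert>

// Assumption: GCC-compatible compilers accept a visibility attribute on a namespace.
#if defined(__GNUG__)
#    define EXAMPLE_NS_HIDDEN __attribute__((visibility("hidden")))
#else
#    define EXAMPLE_NS_HIDDEN
#endif

// Everything declared in this namespace block gets hidden visibility by default,
// so its symbols and RTTI stay local to the shared object that defines them.
namespace example_hidden EXAMPLE_NS_HIDDEN {

struct internal_state {
    int value = 7;
};

inline int read_value() { return internal_state{}.value; }

inline void demo_hidden() { assert(read_value() == 7); }

} // namespace example_hidden
```

Marking an individual class inside such a namespace with default visibility is what re-opens that one type to cross-module sharing.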

Carefully controlled sharing is only enabled via the pybind11::detail::internals system. This design is intentional and robust, and it’s the result of years of work. I’m making an effort here to provide a fix that aligns with those goals. Comments suggesting otherwise, even indirectly, risk undermining confidence in the architecture.

@petersteneteg
Author

@rwgk: You are most likely right, and I have a very limited understanding of how the internals handle this. I am fine with either solution.

@petersteneteg
Author

@rwgk: Thinking more about the fact that a lot of this behavior is implementation-defined, I think it makes sense to be conservative about where to apply the fix, as you suggested.

@henryiii
Collaborator

TL;DR: I believe the "right thing" here is to move these into pybind11::detail::internals:

Once it's ready, we can try to get it in. I'll probably go ahead and make an RC without it, though.

@rwgk
Collaborator

rwgk commented May 29, 2025

TL;DR: I believe the "right thing" here is to move these into pybind11::detail::internals:

Once it's ready, we can try to get it in. I'll probably go ahead and make an RC without it, though.

I don't think I'll get to it before the weekend. This could take me a couple of weekends to complete.

It's not a regression, so making the 3.0.0 release sounds fine to me. The main trouble is that we'll have to bump the internals version again. The conduit feature will help, but I think a fraction of use cases will still be affected by the internals version bump (what @rhaschke found).

@henryiii
Collaborator

It's not a regression

So is the plan to do a minimal fix like this for 3.0, then move things to internals and bump the internals for 3.1? I was rather hoping after a forced internals bump in 3.0 we'd not have to bump it for a while, but we can do that.

@rwgk
Collaborator

rwgk commented May 29, 2025

So is the plan to do a minimal fix like this for 3.0

That would seem OK to me.

I was rather hoping after a forced internals bump in 3.0 we'd not have to bump it for a while

Me too.

Unfortunately I cannot jump on this immediately.

The options I see:

  • Minimal conservative fix (similar to what we have right now in this PR), internals version bump in 3.1.

  • Wait with the 3.0 release for another couple weeks until I get a chance to implement the internals integration, to avoid the internals version bump later.

  • Someone else doing the internals integration work? — It's a pretty narrowly defined task.

In the back of my mind, details for this PR:

  • test_visibility is a very generic (uninformative) name; can we find a more specific one? "cross-module RTTI" seems more to the point from a high-level perspective; the visibility aspect is more of an implementation detail in comparison.

  • Maybe it'll be a little more straightforward to use extending for the test (not embedding)? For easier long-term maintenance.
