xcode 16 + python + multiple shared libraries + dynamic_cast ==> fail #22204

Open
rpoyner-tri opened this issue Nov 16, 2024 · 4 comments
Labels
component: build system Bazel, CMake, dependencies, memory checkers, linters configuration: mac type: bug

Comments

@rpoyner-tri
Contributor

What happened?

On macOS with Xcode 16:

$ bazel test //bindings/pydrake/systems:py/custom_test

fails with a std::bad_cast exception. The following tests fail similarly:

  • //examples/acrobot:py/spong_sim_lib_py_test
  • //examples/acrobot:py/spong_sim_main_py_test
  • //bindings/pydrake/examples:py/acrobot_test

A full CI build log: https://drake-jenkins.csail.mit.edu/view/Mac%20Sequoia%20Unprovisioned/job/mac-arm-sequoia-unprovisioned-clang-bazel-experimental-release/13/consoleFull

Version

master circa 1.35

What operating system are you using?

macOS 14 (Sonoma)

What installation option are you using?

compiled from source code using Bazel

Relevant log output

No response

@rpoyner-tri
Contributor Author

On my dev branch, with extra instrumentation, we can see that the same type descriptor exists at two different addresses:

ricopoyner@TRI-X9DWTVD9TR drake % bazel test //bindings/pydrake/systems:py/custom_test
INFO: Analyzed target //bindings/pydrake/systems:py/custom_test (1 packages loaded, 16 targets configured).
INFO: From Linking bindings/pydrake/systems/test/test_util.cpython-312-darwin.so:
ld: warning: duplicate -rpath '/opt/homebrew/Cellar/fmt/11.0.2/lib' ignored
FAIL: //bindings/pydrake/systems:py/custom_test (see /private/var/tmp/_bazel_ricopoyner/27b47a6d9b400570878eb2115555e985/execroot/drake/bazel-out/darwin_arm64-opt/testlogs/bindings/pydrake/systems/py/custom_test/test.log)
INFO: From Testing //bindings/pydrake/systems:py/custom_test:
==================== Test output for //bindings/pydrake/systems:py/custom_test:

Running tests...
----------------------------------------------------------------------
....E.............
======================================================================
ERROR [0.004s]: test_all_leaf_system_overrides (custom_test.TestCustom.test_all_leaf_system_overrides)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/private/var/tmp/_bazel_ricopoyner/27b47a6d9b400570878eb2115555e985/sandbox/darwin-sandbox/194/execroot/drake/bazel-out/darwin_arm64-opt/bin/bindings/pydrake/systems/py/custom_test.runfiles/drake/bindings/pydrake/systems/test/custom_test.py", line 584, in test_all_leaf_system_overrides
    results = call_leaf_system_overrides(system)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: is_dynamic_castable<drake::systems::LeafEventCollection<drake::systems::PublishEvent<double>>@0x109bd91c0>(drake::systems::EventCollection<drake::systems::PublishEvent<double>>* ptr) failed because ptr is of dynamic type drake::systems::LeafEventCollection<drake::systems::PublishEvent<double>>@0x1039d09e0.

----------------------------------------------------------------------
Ran 18 tests in 0.030s

FAILED (errors=1)

Generating XML reports...
================================================================================
INFO: Found 1 test target...
Target //bindings/pydrake/systems:py/custom_test up-to-date:
  bazel-bin/bindings/pydrake/systems/py/custom_test
INFO: Elapsed time: 1.899s, Critical Path: 1.54s
INFO: 4 processes: 2 internal, 2 darwin-sandbox.
INFO: Build completed, 1 test FAILED, 4 total actions
//bindings/pydrake/systems:py/custom_test                                FAILED in 0.7s
  /private/var/tmp/_bazel_ricopoyner/27b47a6d9b400570878eb2115555e985/execroot/drake/bazel-out/darwin_arm64-opt/testlogs/bindings/pydrake/systems/py/custom_test/test.log

Executed 1 out of 1 test: 1 fails locally.

The two descriptors come from two shared libraries: libdrake.so and bindings/pydrake/systems/test/test_util.cpython-312-darwin.so. The duplication itself is nothing new, but with Xcode 15 the tests passed. I believe that older implementations of dynamic_cast would use (or fall back to) type-name string comparison if the addresses did not match. That no longer appears to be the case.
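To make the hypothesis concrete, here is a minimal sketch of the two comparison strategies a runtime could use when matching type descriptors. The helper names `SameTypeByAddress` and `SameTypeByName` are hypothetical and exist only for illustration; they are not part of Drake or of any C++ runtime library.

```cpp
#include <cstring>
#include <typeinfo>

// Hypothetical helpers for illustration only.

// Strict comparison: two descriptors match only if they are the same object.
// This fails when each shared library carries its own copy of the descriptor.
inline bool SameTypeByAddress(const std::type_info& a,
                              const std::type_info& b) {
  return &a == &b;
}

// Lenient comparison: descriptors also match if their implementation-defined
// names are equal, which tolerates duplicate copies across shared libraries.
inline bool SameTypeByName(const std::type_info& a,
                           const std::type_info& b) {
  return &a == &b || std::strcmp(a.name(), b.name()) == 0;
}
```

Inside a single binary both helpers normally agree. They would only diverge in a situation like the one in the log above, where two copies of the same descriptor live at different addresses.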

I've tried a lot of voodoo recommended by the interwebs (RTLD_GLOBAL, clang type_visibility attribute, ld -flat_namespace, etc.) to no avail. I suspect our choices boil down to:

  • avoid/replace/reimplement dynamic_cast
  • re-architect .so linking to avoid duplicate symbols
  • something else?

@rpoyner-tri
Contributor Author

Along the lines of #22205, I'm working to identify and patch the relatively few dynamic_cast invocations that actually cause failures in the current Xcode 16 build. I'll turn up with a PR when things are passing.

@rpoyner-tri
Contributor Author

Nope. Nah. Nevermind. Removing dynamic_casts is neither correct nor sustainable.

I did some more reading of llvm-project changes. It turns out we probably instead want --copt=-fno-assume-unique-vtables.
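For context, a minimal sketch of the kind of cast this flag is understood to affect. It assumes the failing cast targets a class declared `final`, and the `Collection` / `LeafCollection` types and the `AsLeaf` helper are hypothetical stand-ins, not Drake code. With the default assumption that every polymorphic class has exactly one vtable, Clang may reduce a dynamic_cast to a final class to a vtable-pointer comparison.

```cpp
// Hypothetical stand-ins for a polymorphic base and a final leaf class.
struct Collection {
  virtual ~Collection() = default;
};

struct LeafCollection final : Collection {
  int count = 0;
};

// Because LeafCollection is final, the compiler may implement this cast as a
// single vtable-pointer comparison instead of a call into the C++ runtime.
// That shortcut is only sound if the class has one vtable in the process.
inline LeafCollection* AsLeaf(Collection* ptr) {
  return dynamic_cast<LeafCollection*>(ptr);
}
```

In a single binary, `AsLeaf` returns the object for a `LeafCollection` and `nullptr` for a plain `Collection`. If two shared libraries each emit their own vtable for the class, the pointer comparison can fail even though the dynamic type is correct; compiling with -fno-assume-unique-vtables asks Clang not to rely on that uniqueness.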

@rpoyner-tri
Contributor Author

#22227

Projects
Status: In Progress