[question] Shared building issues when linking on Linux (due to runpath/LD_LIBRARY_PATH issue)? #23421
Comments
Hi @irieger - thanks for reporting this issue. There are two aspects of this - what happens at link time (where cmake passes the I've been unable to reproduce the link-time issues:
I'm surprised by this, because the
Even using it like this, I'm unable to reproduce the link time issue, that is:
With a minimal CMakeLists like:
CMake is able to correctly build and link this, even when minizip (and libiconv) are shared libraries - that should be the case regardless, as far as I can see. Running the executable poses some issues finding libiconv, but that's a different (and relatively easy to solve) story. Do you have a reproducible example of being unable to link when building the executable? Bear in mind that That's why you're seeing things like:
being resolved correctly (likely a direct dependency of your executable, where the library is found in one of the but you're also getting:
these should come from conan, and NOT the system as is your case or this
this is because the The reason this is happening is that the linker is embedding the rpath (as provided by CMake) in the I'd suggest trying to launch your executable again after activating Alternatively, you may try adding |
An example is here: https://github.com/irieger/cpp-playground/tree/main/ocio-conan-dynamic-linking |
In all these tests, I care about the build/link stage, not execution. For execution I either load the env or package everything and run patchelf to set the path accordingly. Execution works fine once the build stage has been hacked to work. (For one project I currently activate the runenv for building, so that with LD_LIBRARY_PATH set the linker doesn't fail on the cascaded dependencies.) |
Thanks! |
Yeah, it is somewhat hacked. As the sample script running the docker shows, I only called |
Thanks @irieger - I've been able to reproduce this with your repository. I have a question: this only fails with the changes you have made to the note that this mirrors what is actually intended by the maintainers of |
Oh, interesting point. I tried to find a hint as to why minizip needs to be static, since the recipe doesn't clearly say why, but I didn't see or connect this part. On my side it has compiled & linked so far, but the way I currently use OpenImageIO, I don't think I hit the code paths that would use the library. (It compiled & linked after removing the static restriction and setting the linking to public so that the library path is also given to the tools.) Also on Linux (at least on archlinux) the official OpenColorIO links dynamically to I'd like to understand whether enforcing static is really needed & will also confirm with the maintainer, who I'm chatting with in the OCIO slack channel. One of the main problems is that this means no shared CI build for this and especially for other packages depending on OCIO, for example OpenImageIO, which has the same linking issues. So CI will miss all linking problems in everything using this, as the shared builds are skipped. With both OpenColorIO & OpenImageIO fixed to build fully shared, I have so far not run into any runtime issue in the parts I use. |
So I got feedback that enforcing static isn't intended, and I was also told that brew normally links dynamically too. |
I reproduced a similar issue (https://github.com/EstebanDugueperoux2/openimageio_shared_issue) and also got errors with your PR. Regards. |
Thanks for taking a look.
With that I got OpenColorIO building with everything shared. Not OpenImageIO. That needed a comparable change. I have a branch where I added #23112 basically as well as changing more or less all Any idea how we could properly improve that or what the underlying problem might be? Why does the linker even care what a .so that is linked into my consumer consumes itself? I always assumed the linker just tries to resolve the external symbols of "my" code, so why does it need to resolve things that library has already been resolved against? I'd assume this work was already done. I'd really like to learn a bit more, and obviously I'd also like to have it properly solved upstream, if that makes sense, to reduce my hacks. |
When linking an executable (interestingly, this doesn't happen when creating shared libraries), the BFD linker (the default linker) will try to locate the dependencies of your dependencies, sort of replicating the behaviour of the runtime loader. This only happens with the BFD linker as far as I'm aware. On macOS and Windows, the behaviour is what you describe. On Linux, even other linkers don't perform this search, if I remember correctly. So chances are if you use the gold linker, According to the The relevant parts of the
What happens usually in CMake projects is that if CMake has enough information about transitive dependencies of your dependencies, it will pass If you want to try and work around this:
note that in this scenario, if CMake doesn't embed the RPATH into the executables, then the only way to run these executables is by using the But you are absolutely right, we need to locate the source of the problem :) - which I suspect is that CMake does not have the right information at the right time to pass the correct -rpath/-rpath-link to the linker. |
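To make the search described above a bit more concrete, here is a rough Python model of how a BFD-like linker might look for the dependencies of dependencies. It is only a sketch under simplifying assumptions: the search order is reduced to -rpath-link, -rpath, LD_LIBRARY_PATH and the RUNPATH of the library that needs the file, and the names `SharedLib`, `unresolved_needed` and the `/pkg/...` directories are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SharedLib:
    """A shared library as the linker sees it: its location and dynamic section."""
    directory: str
    needed: list = field(default_factory=list)   # DT_NEEDED entries
    runpath: list = field(default_factory=list)  # DT_RUNPATH entries


def unresolved_needed(direct, libs, rpath_link=(), rpath=(), ld_library_path=()):
    """Return the transitive DT_NEEDED names a BFD-like linker cannot locate.

    `direct` lists the libraries named on the link line; `libs` maps a
    soname to its SharedLib. Simplified search order (an assumption of this
    sketch): -rpath-link, -rpath, LD_LIBRARY_PATH, then the RUNPATH of the
    library that needs the file.
    """
    missing, seen, queue = [], set(direct), list(direct)
    while queue:
        parent = libs[queue.pop(0)]
        for name in parent.needed:
            if name in seen:
                continue
            seen.add(name)
            search = [*rpath_link, *rpath, *ld_library_path, *parent.runpath]
            if name in libs and libs[name].directory in search:
                queue.append(name)   # found: its own NEEDED entries are checked next
            else:
                missing.append(name)
    return missing


# Hypothetical layout mirroring the report: liba.so carries a RUNPATH,
# libminizip-ng.so.4 had its RUNPATH stripped.
libs = {
    "liba.so": SharedLib("/pkg/a/lib", ["libminizip-ng.so.4"],
                         ["/pkg/minizip/lib", "/pkg/iconv/lib"]),
    "libminizip-ng.so.4": SharedLib("/pkg/minizip/lib", ["libiconv.so.2"]),
    "libiconv.so.2": SharedLib("/pkg/iconv/lib"),
}

without_hint = unresolved_needed(["liba.so"], libs)
with_hint = unresolved_needed(["liba.so"], libs, rpath_link=["/pkg/iconv/lib"])
print(without_hint)
print(with_hint)
```

In this model, liba.so's RUNPATH is only consulted for liba.so's own NEEDED entries, so the iconv directory listed there does not help libminizip-ng.so.4. Passing the directory through `rpath_link` or `ld_library_path` makes the lookup succeed, which matches the workarounds mentioned in this thread.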
Hi @jcar87, thanks for your analysis and your proposed workaround. ` tools.build:sharedlinkflags = ["-fuse-ld=gold"] Regards. |
Thanks @EstebanDugueperoux2 - glad to hear this works, as it confirms the issue. For what it's worth, I suspect the linker will still produce exactly the same files (assuming the gold linker is able to). Will keep this issue open as this needs to be investigated properly on the Conan side. |
So I thought I'd try patching OpenColorIO with a new MR reduced to the minimum of needed changes, leaving the old one as is. Funnily, linking everything works with Conan 1 but not with Conan 2. Maybe that helps. I hadn't looked that closely before, but was confused to get a CI pass for Conan 1: |
Just a general note: As I think I had mentioned a while ago in Slack, I followed @EstebanDugueperoux2's advice and run with mold as the linker in my profile, and my stuff has been working perfectly well with my local build pipeline since sometime in May. So at least that solved it for me, although I'd prefer to run as much as possible on upstream rather than having to rebase my local changes to recipes. |
What is your question?
Hey,
Some time ago I ran into a problem building several shared libraries of larger projects that build more than one artifact. After hours of debugging, I think I have condensed the problem down to a core issue that I'm trying to understand.
So the question is basically: How do shared linking and LD_LIBRARY_PATH / CMAKE_LIBRARY_PATH work?
To describe the problem with a concrete example, I can reference this PR: #23112
Here is a brief example CMakeLists.txt excerpt that I think roughly describes the problem:
Running conan create ... with this CMake results in a problem when linking target exe. Here is the corresponding output:
Doing some analysis with ldd, readelf -d etc. I found that liba.so (in the real case libOpenColorIO.so.2.3.2) finds all its dependencies but libiconv.so.2. Output of ldd:
So the RUNPATH contains the directory for iconv (/root/.conan2/p/b/libic16d33c41345b9/p/lib). But as it isn't a direct dependency, that obviously doesn't help. And /root/.conan2/p/b/miniz3c27d6cd524ad/p/lib/libminizip-ng.so.4 - which is from a package folder - has the runpath stripped.
Now when we link exe against liba.so, liba.so can find libminizip-ng.so.4 thanks to the runpath, but libminizip-ng.so.4 can't resolve its calls to libiconv.so.
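For anyone repeating this kind of inspection, a small helper along the following lines can pull the NEEDED and RUNPATH entries out of readelf -d output. It is a sketch only: the function names are made up, and the sample text is an invented stand-in for the real output (which is not reproduced here), showing a library with NEEDED entries but no RUNPATH.

```python
import re
import subprocess

# Matches dynamic-section lines such as
#  0x0000000000000001 (NEEDED)   Shared library: [libfoo.so.1]
#  0x000000000000001d (RUNPATH)  Library runpath: [/some/dir:/other/dir]
_ENTRY = re.compile(r"\((NEEDED|RUNPATH|RPATH)\)\s+[^\[\n]*\[([^\]]*)\]")


def parse_dynamic_section(readelf_output):
    """Collect NEEDED names and RUNPATH/RPATH directories from `readelf -d` text."""
    info = {"NEEDED": [], "RUNPATH": [], "RPATH": []}
    for tag, value in _ENTRY.findall(readelf_output):
        if tag == "NEEDED":
            info["NEEDED"].append(value)
        else:
            # RUNPATH/RPATH values are colon-separated directory lists.
            info[tag].extend(p for p in value.split(":") if p)
    return info


def read_dynamic_section(path):
    """Run `readelf -d` on a file and parse the result (requires binutils)."""
    result = subprocess.run(["readelf", "-d", path],
                            capture_output=True, text=True, check=True)
    return parse_dynamic_section(result.stdout)


# Invented sample input for illustration, not output from the real package.
sample = """\
 0x0000000000000001 (NEEDED)             Shared library: [libiconv.so.2]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000e (SONAME)             Library soname: [libminizip-ng.so.4]
"""
info = parse_dynamic_section(sample)
print(info["NEEDED"], info["RUNPATH"])
```

An empty RUNPATH list next to a non-empty NEEDED list is the situation described above: the library names what it needs but gives no hint where to find it.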
How can this be solved properly? I can get it to work in a hacky way by making either one of the following two changes to the CMakeLists.txt:
- target_link_libraries(a PRIVATE minizip-ng) to target_link_libraries(a PUBLIC minizip-ng)
- target_link_libraries(exe PRIVATE a) to target_link_libraries(exe PRIVATE a minizip-ng)
This will result in the package building correctly.
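The effect of the first change can be sketched with a toy Python model of how CMake propagates link dependencies between shared libraries. This is an assumption-laden simplification (shared libraries only, no generator expressions, no static-library handling), and `link_interface` and `link_line` are made-up helper names, not CMake APIs.

```python
def link_interface(lib, links):
    """Libraries that consumers of `lib` also link: its PUBLIC/INTERFACE deps, transitively."""
    result = []
    for scope, dep in links.get(lib, []):
        if scope in ("PUBLIC", "INTERFACE"):
            for name in [dep, *link_interface(dep, links)]:
                if name not in result:
                    result.append(name)
    return result


def link_line(target, links):
    """Libraries named on the link line of `target` (shared libraries only)."""
    result = []
    for scope, dep in links.get(target, []):
        if scope in ("PRIVATE", "PUBLIC"):
            for name in [dep, *link_interface(dep, links)]:
                if name not in result:
                    result.append(name)
    return result


# target_link_libraries(a PRIVATE minizip-ng)
private_links = {"exe": [("PRIVATE", "a")], "a": [("PRIVATE", "minizip-ng")]}
# target_link_libraries(a PUBLIC minizip-ng)
public_links = {"exe": [("PRIVATE", "a")], "a": [("PUBLIC", "minizip-ng")]}

print(link_line("exe", private_links))
print(link_line("exe", public_links))
```

With PRIVATE, minizip-ng never reaches the link line of exe, so the linker has to find it (and whatever it needs) on its own. With PUBLIC, or with minizip-ng listed explicitly on exe, it becomes a direct input of the link step. The sketch only shows which names reach the link line; it does not model the -rpath or -rpath-link flags CMake may add.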
I made similar changes in a branch to OpenColorIO & OpenImageIO - both of which I consume - and run my local build with this branch instead of CCI-master. With this change both projects build correctly, which they didn't before. But I don't think that is the actual solution, is it?
The conan_toolchain.cmake (for conan create recipes/opencolorio/all --version..) would look like this, so I'd assume that the search path is handed to the linker, but it isn't:
P.S. I'm playing around with this in Docker. Here is some sample code I use; I currently have a large set of local changes for debugging, but the change in the repo should show the actual problem. It needs a modified opencolorio recipe that actually allows shared building.
https://github.com/irieger/cpp-playground/tree/main/ocio-conan-dynamic-linking