Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

manual_composition does not search for components in its own directory #424

Open
daixtrose opened this issue Jan 12, 2020 · 19 comments
Open
Labels
help wanted Extra attention is needed question Further information is requested

Comments

@daixtrose
Copy link

daixtrose commented Jan 12, 2020

Bug report

Required Info:

  • Operating System:
    • Ubuntu 18.04
    • gcc -v
      Using built-in specs.
      COLLECT_GCC=gcc
      COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/9/lto-wrapper
      OFFLOAD_TARGET_NAMES=nvptx-none:hsa
      OFFLOAD_TARGET_DEFAULT=1
      Target: x86_64-linux-gnu
      Configured with: ../src/configure -v --with-pkgversion='Ubuntu 9.2.1-17ubuntu1~18.04.1' --with-bugurl=file:///usr/share/doc/gcc-9/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-9 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none,hsa --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
      Thread model: posix
      gcc version 9.2.1 20191102 (Ubuntu 9.2.1-17ubuntu1~18.04.1)
  • Installation type:
    • from source
  • Version or commit hash:
  • DDS implementation:
    • unknown
  • Client library (if applicable):
    • N/A

Steps to reproduce issue

git clone https://github.com/ros2/demos
cd demos/composition
mkdir build
cd build
cmake -DCMAKE_PREFIX_PATH=/opt/ros/eloquent ..
make

Expected behavior

An executable manual_compositionwhich links to the freshly compiled component libraries.

Actual behavior

  • Output from ldd manual_composition
    linux-vdso.so.1 (0x00007ffd6a55f000)
    libtalker_component.so => /opt/ros/eloquent/lib/libtalker_component.so (0x00007fef69538000)
    liblistener_component.so => /opt/ros/eloquent/lib/liblistener_component.so (0x00007fef69311000)
    libserver_component.so => /opt/ros/eloquent/lib/libserver_component.so (0x00007fef69100000)
    libclient_component.so => /opt/ros/eloquent/lib/libclient_component.so (0x00007fef68ee9000)
    librclcpp.so => /opt/ros/eloquent/lib/librclcpp.so (0x00007fef68b8c000)
    librcutils.so => /opt/ros/eloquent/lib/librcutils.so (0x00007fef6897a000)
    [ ... many lines missing ... ]
    
  • Output from strace -o gaga -f ./manual_composition
    [ ... many lines missing ... ]
    70074 openat(AT_FDCWD, "/opt/ros/eloquent/lib/libtalker_component.so", O_RDONLY|O_CLOEXEC) = 3
    70074 read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\300\330\0\0\0\0\0\0"..., 832) = 832
    [ ... many lines missing ... ]
    

Additional information

None.

@daixtrose
Copy link
Author

IMHO the BUILD_RPATH needs a fix, but I am not 100% sure how to achieve this.

@daixtrose
Copy link
Author

Additional information from my investigation: It's not the RPATH. CMake should have reasonable defaults set. The output of chrpath -l manual_composition yields

manual_composition: RUNPATH=/my/path/to/demos/composition/build:/opt/ros/eloquent/lib:

strace output shows that this RUNPATH is used as search path for all other libraries, but not for the components. Not yet clear to me why.

@daixtrose
Copy link
Author

daixtrose commented Jan 13, 2020

I assumed that cmake was a first-class citizen among the build tools. Maybe I am wrong. The docs remain unclear to me concerning that question. With colcon things are OK, because a setup script in the install folder sets LD_LIBRARY_PATH, but IMHO the whole magic should work without setting environment variables at all.

@dirk-thomas dirk-thomas added the question Further information is requested label Jan 13, 2020
@dirk-thomas
Copy link
Member

Artifacts installed by CMake do not have an RPATH (it is being stripped during installation) but they rely on the environment variables like LD_LIBRARY_PATH to find dynamic libraries. Otherwise the results would e.g. not be relocatable.

You don't need to use colcon to setup environment variables for you. Any ament_cmake package already generates setup scripts (under share/<pkgname>) which extend the necessary environment variables.

If you choose a plain CMake build type you opt-out of those helpers provided by ament_cmake and if you also don't want to use colcon you need to provide similar functionality on your own.

@daixtrose
Copy link
Author

daixtrose commented Jan 14, 2020

AFAIK Artifacts installed by CMake do have an RPATH when set accordingly (keyword relocatable binaries). The problem is that the ament packages seem to overwrite additional RPATH/RUNPATH settings added to the CMake file. I am still investigating to find the culprit. I agree with you that without additional settings the RPATH is stripped off the executable during install. This can be checked in colcon's output in the install/composition/lib/composition directory. But without installing the RPATH is normally set such that the executable finds its libraries first. The output of chrpath -l manual_composition in directory build/composition yields
manual_composition: RUNPATH=[.. cut ...]/composition/build/composition:/opt/ros/eloquent/lib:
so I do not understand why the libs are not found, when running the executable from that directory.

I understand that I am on my own if I want to use plain CMake, but I fear some colcon or ament magic inhibits the things that worked for me in other context. But maybe I am blind or made a gross mistake.

Please note that opting out of colcon or ament is not for fun, but requiring correct environment variables set before program start is quite a problem in our safety-relevant context. I would really appreciate both approaches could live side by side.

@daixtrose
Copy link
Author

Thanks to Craig Scott I tend to think I have an explanation: The build system (cmake) does not set RPATH anymore, but RUNPATH, which in contrast to RPATH is only searched after the environment variable LD_LIBRARY_PATH. The fact that it is searched only after LD_LIBRARY_PATH was new to me.

Calling source /opt/ros/eloquent/setup.bash (which is required to have ament cmake macros to be found) yields 'LD_LIBRARY_PATHto be set to/opt/ros/eloquent/opt/yaml_cpp_vendor/lib:/opt/ros/eloquent/opt/rviz_ogre_vendor/lib:/opt/ros/eloquent/lib` which then wins during program startup.

A reset via export LD_LIBRARY_PATH= resolves the issue.

As long as ament macros are used in the CMakeLists.txt, a pure CMake approach is only achived if one extracts from setup.bash the parts that make CMake find the ament macros (which I could not figure out immediately) without setting other environment variables. Is there a way to set the paths to ament CMake macros without further shell environment settings?

@dirk-thomas
Copy link
Member

The problem is that the ament packages seem to overwrite additional RPATH/RUNPATH settings added to the CMake file.

I doubt that there is any logic in ament which does that. Please point to it when you find such a thing.

I fear some colcon or ament magic inhibits the things that worked for me in other context.

Again, I don't think there is any magic involved. You can easily introspect the command which are being invoked for every package in the log/latest_build/<pkgname>/commands.log files. It simply boils down to cmake --build / cmake --install with some arguments.

Please note that opting out of colcon or ament is not for fun, but requiring correct environment variables set before program start is quite a problem in our safety-relevant context. I would really appreciate both approaches could live side by side.

I don't see how the environment variables set by the setup files (which are necessary to use the artifacts) would interfere with your safety-relevant context. Maybe you can elaborate on that a bit more.

Calling source /opt/ros/eloquent/setup.bash (which is required to have ament cmake macros to be found) yields 'LD_LIBRARY_PATHto be set to/opt/ros/eloquent/opt/yaml_cpp_vendor/lib:/opt/ros/eloquent/opt/rviz_ogre_vendor/lib:/opt/ros/eloquent/lib` which then wins during program startup.

If your package provides a shared library installed into <prefix>/lib your package should also add that path to the LD_LIBRARY_PATH. Doing that I don't see a case where it shouldn't work as expected.

Is there a way to set the paths to ament CMake macros without further shell environment settings?

Can you clarify what environment variables you consider to be "ament CMake macros" and what "further shell environment settings"? The answer is probably no, the setup files set a bunch of environment variables and you can't cherry-pick (simply because there was never a need for it).

@daixtrose
Copy link
Author

The further shell environment settings that enable the cmake process to work properly on a fresh shell are

export AMENT_CURRENT_PREFIX=/opt/ros/eloquent
export AMENT_PREFIX_PATH=/opt/ros/eloquent
export PYTHONPATH=/opt/ros/eloquent/lib/python3.6/site-packages

A subsequent call to

mkdir build
cd build
cmake -DCMAKE_PREFIX_PATH=/opt/ros/eloquent ..
make
export LD_LIBRARY_PATH=/opt/ros/eloquent/lib
./manual_composition

succeeds.

  • I am especially surprised that I have to set a python path for a cmake file. Yet another dependency.
  • Instead of setting LD_LIBRARY_PATH I would prefer a solution that sets the search paths in the RPATH/RUNPATH via cmake. Unfortunately I could not achieve this quickly. I tried
    set_target_properties(manual_composition 
    PROPERTIES
    CMAKE_INSTALL_RPATH "$ORIGIN:$ORIGIN/../lib:/opt/ros/eloquent/lib"
    CMAKE_BUILD_RPATH "$ORIGIN:$ORIGIN/../lib:/opt/ros/eloquent/lib"
    )
    but for reasons I haven't figured out the libraries are not searched in /opt/ros/eloquent/lib.
  • Note that ls /usr/bin | wc -l yields 2149 and none of these binaries makes use of LD_LIBRARY_PATH which could give you a hint why I expect the same from ROS executables

@daixtrose
Copy link
Author

daixtrose commented Jan 19, 2020

I don't see how the environment variables set by the setup files (which are necessary to use the artifacts) would interfere with your safety-relevant context. Maybe you can elaborate on that a bit more.

I want a single ROS executable to run as a daemon. The idea that I have to run it via a shell script is possible, but I would like to avoid any of these in favor of simplicity, and simplicity is a key factor here. In an ideal world I could copy a static executable onto a minimal linux and run it at startup. I fear ROS2 might stand in my way here.

Also I would like to point to https://www.hpc.dtu.dk/?page_id=1180, which states

Remember that the directories specified in LD_LIBRARY_PATH get searched before(!) the standard locations? In that way, a nasty person could get your application to load a version of a shared library that contains malicious code! That’s one reason why setuid/setgid executables do neglect that variable!

@daixtrose
Copy link
Author

daixtrose commented Jan 20, 2020

* but for reasons I haven't figured out the libraries are not searched in `/opt/ros/eloquent/lib`.

This is due to to another change in cmake: one has to repeat the following in every component:

set_target_properties(my_component_name 
  PROPERTIES
  BUILD_RPATH_USE_ORIGIN ON
  INSTALL_RPATH "$ORIGIN:$ORIGIN/../lib:/opt/ros/eloquent/lib"
  BUILD_RPATH "$ORIGIN:$ORIGIN/../lib:/opt/ros/eloquent/lib"
)

Now everything works fine until a ROS2 dependency is loaded. Since ROS2 does not set the RUNPATH to /opt/ros/eloquent/lib or $ORIGIN (which it easily could), recursive dependencies are not found and therefore there is no way around export LD_LIBRARY_PATH=/opt/ros/eloquent/lib before running any ROS2 executable, which is a showstopper, because now the wrong components are loaded again:

@dirk-thomas does anything speak against compiling ROS2 libraries installed to /opt/ros/eloquent/lib with a RUNPATH set to $ORIGIN?

@dirk-thomas
Copy link
Member

I am especially surprised that I have to set a python path for a cmake file. Yet another dependency.

You certainly won't need PYTHONPATH to execute a compiler artifact.

Note that ls /usr/bin | wc -l yields 2149 and none of these binaries makes use of LD_LIBRARY_PATH which could give you a hint why I expect the same from ROS executables

Because they are installed in a standard location which is implicitly being searched. Since the ROS executable are installed in a non-standard location in your case they need an environment variable to tell where to search for the libraries. This is simply how it works on Linux and is not specific to ROS.

I want a single ROS executable to run as a daemon. The idea that I have to run it via a shell script is possible, but I would like to avoid any of these in favor of simplicity, and simplicity is a key factor here. In an ideal world I could copy a static executable onto a minimal linux and run it at startup. I fear ROS2 might stand in my way here.

As I mentioned before: nothing in ROS is doing anything magic here. Until you can point to something in ROS which is responsible for it please don't call it "ROS2 might stand in my way". What you are seeing is standard CMake behavior.

Since ROS2 does not set the RUNPATH to /opt/ros/eloquent/lib or $ORIGIN (which it easily could), recursive dependencies are not found and therefore there is no way around export LD_LIBRARY_PATH=/opt/ros/eloquent/lib before running any ROS2 executable, which is a showstopper, because now the wrong components are loaded again:

In ROS we chose to set the LD_LIBRARY_PATH. And that is simply the only reason why RUNPATH is not set - there is no need for it.

Yes, we could modify the logic to set RUNPATH and use $ORIGIN (to still be relocatable - not like when using rpath). It would just be a modification to what CMake does by default. So adding a function which does the work for you and developer can opt-in when writing their packages CMake code sounds viable. Please feel free to contribute a pull request offering that alternative.

@daixtrose
Copy link
Author

I think I have a problem that stems from LD_LIBRARY_PATH. interfering with CPack. I will elaborate on it later, but you can preview my experiments at https://github.com/daixtrose/composition_sep/blob/master/README.md to get an idea. Run ldd on the installed executable to see the problem. Maybe there is a workaround that I am not aware of.

@daixtrose
Copy link
Author

I have not found a solution for this issue yet and I am unsure if a solution exists.

In ROS we chose to set the LD_LIBRARY_PATH. And that is simply the only reason why RUNPATH is not set - there is no need for it.

I understand that you prefer to have multiple ROS versions side by side and not interfere with each other by running setup scripts which set LD_LIBRARY_PATH.

Yet the price to pay is that a library must not have the same name as a library in /opt/ros/eloquent/lib, because in that case it will not be found, but rather its name sibling in LD_LIBRARY_PATH=/opt/ros/eloquent/lib - a directory containing libraries with some quite generic names like e.g. liblaser_geometry.so and might contain others with generic names in the future.

So this does not scale and since LD_LIBRARY_PATH is searched before RUNPATH it appears to me that there is no easy solution how to use create a ROS executable that finds its private libraries first (unless I manually add the private library path in front of LD_LIBRARY_PATH of course).

Please find attached the current version of my CPack experiment for future reference: composition_sep-master.zip

@dirk-thomas
Copy link
Member

Yet the price to pay is that a library must not have the same name as a library in /opt/ros/eloquent/lib, because in that case it will not be found, but rather its name sibling in LD_LIBRARY_PATH=/opt/ros/eloquent/lib - a directory containing libraries with some quite generic names like e.g. liblaser_geometry.so and might contain others with generic names in the future.

So this does not scale and since LD_LIBRARY_PATH is searched before RUNPATH it appears to me that there is no easy solution how to use create a ROS executable that finds its private libraries first (unless I manually add the private library path in front of LD_LIBRARY_PATH of course).

There is a very straight forward solution to these naming collisions: the name of a package already acts as a namespace (e.g. consider C/C++ header files or Python modules). A package foo basically claim the namespace foo so no other package should install header files / Python modules under foo. You can apply the same to shared libraries - in fact a lot of ROS packages do install a shared library which exactly matches the name of the package.

In an environment where multiple packages share the same namespace (which the case for headers, Python modules as well as shared libraries) that kind of consensus is simply necessary to avoid collisions.

@daixtrose
Copy link
Author

My findings summarized

The issue can only be mitigated by changing the general approach chosen by ROS, i.e. drop the idea that environment variables (especially LD_LIBRARY_PATH) are used to control the library search procedure.

The need to have side-by-side installations of different ROS versions can be met without requiring LD_LIBRARY_PATH to be set if - and only if - the RUNPATH is set via

set_target_properties(my_exec 
  BUILD_RPATH_USE_ORIGIN ON
  PROPERTIES INSTALL_RPATH 
  "$ORIGIN:$ORIGIN/../lib:/opt/ros/<whatever version>/lib:<probably other subdirs>")

in each ROS executable and each ROS library (use @loader_path instead of ORIGIN on MAC).

I still haven't found the best place to add this to the build process of all ROS2 libraries, so I cannot provide the pull request that @dirk-thomas asked for within a short time period.

Intermediate workaround

Without changing ROS2, the user is required to start the executable via a startup script that sets the LD_LIBRARY_PATH such that the libraries shipped with the executable are guaranteed to be found before ROS2 drops in. It all boils down to export LD_LIBRARY_PATH=${PATH_TO_EXECUTABLE}/../lib:${PATH_TO_ROS_LIBS}:${LD_LIBRARY_PATH}

The current workaround demonstrated here and attached to this issue: composition_sep-master.zip

My wishes summarized

Your mileage may vary, but I consider being forced to run an executable via a startup shell script in order to have it run and load libs correctly as unnecessary and I would appreciate an approach that does not depend on environment variables in general.

Also I kind of like to use CPack or other package managers to create Debian and RPM packages and I think this should not me made more difficult than it could be.

The same argument applies for being forced to set PYTHONPATH in order to have cmake run without errors. You are only 4 environment commands away from a pure cmake approach that allows building packages with CPack. That is a solvable problem.

@daixtrose
Copy link
Author

BTW: feel free to add composition_sep or derivations from it to the ROS2 examples. Once this issue is decided upon I can also provide a PR.

@dirk-thomas dirk-thomas added the help wanted Extra attention is needed label Feb 13, 2020
@dirk-thomas
Copy link
Member

My wishes summarized

The same argument applies for being forced to set PYTHONPATH in order to have cmake run without errors.

Please consider to make concrete proposals and also contribute necessary changes to achieve your goal of not requiring custom environment variables.

@daixtrose
Copy link
Author

Is there a central point where you add compiler flags for all ROS2 parts?

@dirk-thomas
Copy link
Member

Is there a central point where you add compiler flags for all ROS2 parts?

Since some packages are just plain CMake projects there is on central point. The best way to pass CMake arguments to all CMake packages is the command line / environment variables.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed question Further information is requested
Projects
None yet
Development

No branches or pull requests

2 participants