Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jazzy CI and build status #2851

Closed
wants to merge 32 commits into from
Closed

Conversation

henningkayser
Copy link
Member

No description provided.

@henningkayser henningkayser mentioned this pull request May 24, 2024
10 tasks
@henningkayser henningkayser requested review from sjahr and rhaschke May 24, 2024 17:15
Copy link
Contributor

@rhaschke rhaschke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that the tutorial images fail to build:
For humble, due to a broken generate_parameter_library:

/root/ws_moveit/build/moveit_kinematics/kdl_kinematics_plugin/kdl_kinematics_parameters/include/kdl_kinematics_parameters.hpp:146:36: error: expected unqualified-id before ‘.’ token
2522
  146 |       auto& entry = updated_params..joints_map[value_1];

For rolling due to missing package gazebo_ros2_control:

kortex_bringup: Cannot locate rosdep definition for [gazebo_ros2_control]

Comment on lines 33 to 34
- IMAGE: jazzy-ci-testing
ROS_DISTRO: jazzy
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that ci and ci-testing are identical since #2793: Because osrf/ros2:testing (the base image of ci-testing) was broken upstream at the time, I disabled building it and aliased it to ci.

Suggested change
- IMAGE: jazzy-ci-testing
ROS_DISTRO: jazzy

@rhaschke
Copy link
Contributor

rhaschke commented May 26, 2024

I had a look into the open issues in MTC's CI - based on rolling docker images. I observed strange segfaults during shutdown of binaries. It turned out, that two ABI-incompatible versions of liboctomap are linked into moveit_core's libs (version 1.10.0 from ROS and 1.9.7 (via libfcl) from system). My suggestion to cleanly fix that issue, is to drop the ROS release of octomap and rely on the system version only (see ros/rosdistro#40273 (comment)).

If that proposal is not accepted, at least geometric_shapes and moveit_core should build against the system's octomap. To this end, we need to introduce a new rosdep key for liboctomap-dev.

To test my hypothesis that octomap was the culprit, I manually created a rolling-source image where I removed ros-rolling-octomap after running rosdep install. I also pushed that image to dockerhub, but this will be overwritten by the next scheduled build of the docker images. So, this needs short-term action!

Tests confirm that the segfaults are fixed: the number of failing tests reduced from 18 to 4 (actually only 2).
These two MTC tests that remain failing are both related to another shutdown issue:

cannot publish data, at ./src/rmw_publish.cpp:66 during '__function__'
Fail in delete datareader, at ./src/rmw_service.cpp:89 during '__function__'

Interestingly, Humble does not exhibit that issue. I'm afraid that's a regression in Rolling's rclcpp. No idea, where this originates from. And I need to turn to different tasks next... So, this stays unresolved for now. Any help is welcome ;-)

@henningkayser henningkayser self-assigned this Jun 6, 2024
@henningkayser
Copy link
Member Author

Once this succeeds, please merge with REBASE so that we keep the commit history. There are commits like dac7638 that should be reverted later on.

@henningkayser
Copy link
Member Author

Build all succeed now, we still get a bunch of segfaults in the unit test. In particular there is a double free in PILZ when testing under noble

    [unittest_trajectory_generator_ptp-1] [       OK ] TrajectoryGeneratorPTPTest.testJointGoalNoStartVel (9 ms)
    [unittest_trajectory_generator_ptp-1] [----------] 13 tests from TrajectoryGeneratorPTPTest (140 ms total)
    [unittest_trajectory_generator_ptp-1] 
    [unittest_trajectory_generator_ptp-1] [----------] Global test environment tear-down
    [unittest_trajectory_generator_ptp-1] [==========] 13 tests from 1 test suite ran. (140 ms total)
    [unittest_trajectory_generator_ptp-1] [  PASSED  ] 13 tests.
    [unittest_trajectory_generator_ptp-1] double free or corruption (fasttop)
Error: ROR] [unittest_trajectory_generator_ptp-1]: process has died [pid 35681, exit code -11, cmd '/home/runner/work/moveit2/moveit2/.work/target_ws/install/pilz_industrial_motion_planner/lib/pilz_industrial_motion_planner/unittest_trajectory_generator_ptp --ros-args -r __node:=unittest_trajectory_generator_ptp --params-file /tmp/launch_params_pvabl51l --params-file /tmp/launch_params_sw5qgu_0'].

https://github.com/moveit/moveit2/actions/runs/9402842379/job/25897891774?pr=2851#step:11:11618

@rhaschke
Copy link
Contributor

rhaschke commented Jun 6, 2024

there is a double free in PILZ when testing under noble

I have fixed that in MoveIt1 and just pushed the corresponding port.

Avoid double-free of KDL::RotationalInterpolation in KDL >= 1.4.1
The underlying memory leak was fixed in KDL with orocos/orocos_kinematics_dynamics#180
@rhaschke
Copy link
Contributor

rhaschke commented Jun 6, 2024

Unfortunately, you force-pushed and thus overwrote my commit for the Pilz double-free...

@henningkayser
Copy link
Member Author

Unfortunately, you force-pushed and thus overwrote my commit for the Pilz double-free...

Yeah, I just noticed this too late... Thank you, this helps a lot!

@henningkayser
Copy link
Member Author

I don't think 48f60c4 had anything to do with the double free, unfortunately. The same issue has been fixed a while ago in #1229 already. Oddly, the tests produce segfaults (in CI and also locally) with the compile guard, and since there is no active distro that still provides kdl <1.4.0 I'll remove this switch. All pilz tests pass locally for me without any issues on clang+jazzy with both of the two latest commits. However, the main issue left is that CI will still install the ros-octomap package, regardless if it's provided in the docker image or not. I don't think there is a way around a proper version guard in the cmake setup for geometric_shapes and moveit

This reverts commit 55df0bc.

The deprecated constructor was being used in the same file
for the exact use case of enabling namespaces that are not
specified by the parameter. There is no replacement for
supporting a dynamic server lookup, however the parameter
logic could still use simplification.
@henningkayser
Copy link
Member Author

well, at least jazzy is happy now.
tutorials will require fixing of ros2_kortex #2864 and the rolling jobs fail for spurious clang warnings and codecov mismatches.
So far moveit_msgs and geometric_shapes are already released, which also includes the version fix for octomap #2862.

@rhaschke
Copy link
Contributor

rhaschke commented Jun 9, 2024

rolling with clang-tidy should be happy now as well (there were several new fixes).
The only broken job remaining is the coverage one.
Note that I already reverted several temporary commits. Please cleanup the commit history before merging.

@henningkayser
Copy link
Member Author

rolling with clang-tidy should be happy now as well (there were several new fixes). The only broken job remaining is the coverage one. Note that I already reverted several temporary commits. Please cleanup the commit history before merging.

Thank you! I was planning to do a fresh cherry-pick of all the things that we keep, and squash it into concise blocks in a fresh branch.

@henningkayser
Copy link
Member Author

I'm switching off ccov for now (#2866) and will reopen this PR with a new branch to unblock CI and releases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants