🧑🌾 ros2cli pytest timeout in RHEL nightlies #932
Comments
I took a look into this, and this is almost certainly failing because of ros2/rclpy#1349. Since that is such a large PR, it will be somewhat hard to track it down exactly, but with that reverted the failure stops happening for me.
@clalancette @Crola1702 ros2/rclpy#1359 is the revert PR.
Current summary:
There are 2 possible PRs that could be causing this RHEL issue.
@Crola1702 if possible, can you check with current Rolling whether this issue is still happening on RHEL? I just want to be sure whether either or both of them are causing this issue.
Looking at the nightly, it still seems to be failing. So Revert "Executors types (#1345)" (rclpy#1360) was not enough.
@InvincibleRMC thanks, then let's revert ros2/rclpy#1359
@InvincibleRMC both ros2/rclpy#1359 and ros2/rclpy#1360 are reverted; hopefully the RHEL nightly comes back with a green light 🤞
Since I was able to reproduce the issue, I'm pretty sure that ros2/rclpy#1359 fixes this. I'm going to close this issue out, but feel free to reopen if it reoccurs.
This is happening consistently in nightly_linux-rhel_release
RHEL release is failing consistently and RHEL debug is flaky.
I would try reverting ros2/rclpy#1338. It started failing the day after it got merged in.
@InvincibleRMC yes, ros2/rclpy#1338 is the only commit since then. Let's revert it. (I think we should run all CI, including RHEL debug, for any type-checking PR.)
Sounds good. My apologies again.
@InvincibleRMC @clalancette @Crola1702 according to the experimental tests with RHEL Debug/Release, ros2/rclpy#1373 (comment) indicates that https://github.com/ros2/rclpy/pull/1363/files is the change that introduced this regression. I just cannot explain why; if you have any idea, please let me know. Here is what I am going to do: the first priority is to get RHEL back to stable and green. Although I cannot explain why RHEL does not like https://github.com/ros2/rclpy/pull/1363/files, that is what happens. So I will go ahead and revert ros2/rclpy#1363 and close ros2/rclpy#1373. Any objections?
That's very odd. Is there any chance you could run each of
The segfault is strange. The "Current thread" number when the segfault message is printed matches the thread number given for the stack trace of the main thread. Does that mean the segfault happens in Python's threading library?
Console output
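For context on where that "Current thread" header likely comes from (this is an inference, not something confirmed from the CI logs): Python's faulthandler, which pytest enables by default, prints "Current thread 0x... (most recent call first)" for whichever thread received the fatal signal. A match with the main thread's stack trace only means the signal was delivered while the main thread was executing; it does not by itself imply the crash is inside Python's threading library. A minimal standalone demo of that output format:

```python
# Illustration only: reproduce the faulthandler-style segfault report that
# pytest shows when a test process crashes. Not taken from the CI job.
import ctypes
import faulthandler

faulthandler.enable()  # install handlers for SIGSEGV, SIGFPE, SIGABRT, SIGBUS, SIGILL


def crash():
    # Deliberately read memory at address 0 to trigger SIGSEGV for the demo.
    ctypes.string_at(0)


if __name__ == '__main__':
    crash()
    # Expected output (roughly):
    #   Fatal Python error: Segmentation fault
    #   Current thread 0x00007f... (most recent call first):
    #     File "...", line 11 in crash
    #     File "...", line 16 in <module>
```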
@InvincibleRMC We talked about this today. Although we are not fully sure what the root cause is, it does seem the bug is somewhere else, e.g. in flake8 or something RHEL-platform-specific. Given that, and since the problem could be related to flake8 multi-threading, we decided to try flake8 in a single-threaded configuration for CI: ament/ament_lint#505. If ament/ament_lint#505 comes back all green, we probably do not need to revert anything for now. I will keep my eyes on that and share an update later on.
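For reference, a minimal sketch of what "single-threaded flake8" means in practice (an illustration, not the actual ament/ament_lint#505 change): flake8's `--jobs` option controls its multiprocessing worker pool, and forcing it to 1 keeps all linting in a single process, which takes flake8's multiprocessing out of the picture when it is suspected of hanging or crashing on a platform.

```python
# Illustrative only: run flake8 without its multiprocessing pool by passing
# --jobs=1. The real CI change lives in ament/ament_lint#505; this sketch just
# shows the knob being turned.
import subprocess
import sys


def run_flake8_single_threaded(paths):
    """Run flake8 on the given paths with worker processes disabled."""
    cmd = [sys.executable, '-m', 'flake8', '--jobs=1', *paths]
    return subprocess.run(cmd, check=False).returncode


if __name__ == '__main__':
    sys.exit(run_flake8_single_threaded(sys.argv[1:] or ['.']))
```

The same effect is available directly on the command line with `flake8 --jobs=1 <paths>`.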
ament/ament_lint#505 worked like a silver bullet... 🦀
Closing this one as the test is not failing anymore |
This probably isn't related. But I noticed that the segfaults occur in
🧑🌾 This has started happening reliably now; see https://ci.ros2.org/job/nightly_linux-rhel_debug/2123/ and https://ci.ros2.org/job/nightly_linux-rhel_release/2098/. Re-opening. Probably ament/ament_lint#505 can fix this.
Bug report
Required Info:
Steps to reproduce issue
Additional information
Reference build: https://ci.ros2.org/job/nightly_linux-rhel_release/2020/
Test failure:
Log output:
It seems the ros2cli flake8 test is taking a lot of time now. Failing since #929, but that is probably not the root cause.
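To check that timing locally, something like the hypothetical helper below can be used; the test file name (`test/test_flake8.py`) and the package directory are assumptions about the ros2cli layout, not taken from the CI job.

```python
# Hypothetical helper: time the flake8 pytest of a single package to see
# whether it approaches the CI timeout. Paths are assumptions, adjust as needed.
import subprocess
import sys
import time


def time_flake8_test(package_dir):
    start = time.monotonic()
    result = subprocess.run(
        [sys.executable, '-m', 'pytest', 'test/test_flake8.py', '-v'],
        cwd=package_dir,
        check=False,
    )
    elapsed = time.monotonic() - start
    print(f'flake8 test exited with {result.returncode} after {elapsed:.1f}s')


if __name__ == '__main__':
    time_flake8_test(sys.argv[1] if len(sys.argv) > 1 else '.')
```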
Full log output: