Skip to content
This repository has been archived by the owner on Feb 4, 2021. It is now read-only.

Composition test flakiness increase #118

Closed
dhood opened this issue May 9, 2018 · 2 comments
Closed

Composition test flakiness increase #118

dhood opened this issue May 9, 2018 · 2 comments

Comments

@dhood
Copy link
Member

dhood commented May 9, 2018

The composition tests have always been a bit flaky with fastrtps, but since April 21 they started failing regularly again, including with connext.

To get an idea of what causes them to timeout I ran a build that doesn't capture the stdout on timeout.

Example:

../../../../build/composition/test_api_pubsub_composition__rmw_connext_cpp_.py (test_api_composition) pid 5712: ['/home/rosbuild/ci_scripts/ws/build/composition/api_composition'] (all > console, InMemoryHandler: test_api_composition)
(load_talker_component) pid 5713: ['/home/rosbuild/ci_scripts/ws/build/composition/api_composition_cli', 'composition', 'composition::Talker'] (stderr > stdout, all > console)
(load_listener_component) pid 5714: ['/home/rosbuild/ci_scripts/ws/build/composition/api_composition_cli', 'composition', 'composition::Listener'] (stderr > stdout, all > console)
[test_api_composition] RTI Data Distribution Service Evaluation License issued to OSRF dthomas@osrfoundation.org For non-production use only.
[test_api_composition] Expires on 05-may-2018 See www.rti.com for more information.
[load_listener_component] RTI Data Distribution Service Evaluation License issued to OSRF dthomas@osrfoundation.org For non-production use only.
[load_listener_component] Expires on 05-may-2018 See www.rti.com for more information.
[load_talker_component] RTI Data Distribution Service Evaluation License issued to OSRF dthomas@osrfoundation.org For non-production use only.
[load_talker_component] Expires on 05-may-2018 See www.rti.com for more information.
[load_listener_component] [INFO] [api_composition_cli]: Sending request...
[load_listener_component] [INFO] [api_composition_cli]: Waiting for response...
[test_api_composition] [INFO] [api_composition]: Load library /home/rosbuild/ci_scripts/ws/install/composition/lib/liblistener_component.so
[test_api_composition] [INFO] [api_composition]: Instantiate class composition::Listener
[test_api_composition] RTI Data Distribution Service Evaluation License issued to OSRF dthomas@osrfoundation.org For non-production use only.
[test_api_composition] Expires on 05-may-2018 See www.rti.com for more information.
[load_talker_component] [INFO] [api_composition_cli]: Sending request...
[load_talker_component] [INFO] [api_composition_cli]: Waiting for response...
[test_api_composition] [INFO] [api_composition]: Load library /home/rosbuild/ci_scripts/ws/install/composition/lib/libtalker_component.so
[load_listener_component] [INFO] [api_composition_cli]: Result of load_node: success = true
[test_api_composition] [INFO] [api_composition]: Instantiate class composition::Talker
[test_api_composition] RTI Data Distribution Service Evaluation License issued to OSRF dthomas@osrfoundation.org For non-production use only.
[test_api_composition] Expires on 05-may-2018 See www.rti.com for more information.
[load_talker_component] [INFO] [api_composition_cli]: Result of load_node: success = true
[test_api_composition] [INFO] [talker]: Publishing: 'Hello World: 1'
[test_api_composition] [INFO] [listener]: I heard: [Hello World: 1]
[test_api_composition] signal_handler(2)
(load_listener_component) rc 0
(load_talker_component) rc 0
Test  #1: test_api_pubsub_composition__rmw_connext_cpp ..............***Timeout  60.00 sec

Seems to be an issue with launch not tearing down the nodes. test_api_composition did not respond to the signal properly. Haven't investigated what happened around that time to cause this change in behaviour though.

Expected behaviour:

13:26:00 1: ../../../../build/composition/test_api_pubsub_composition__rmw_connext_cpp_.py (test_api_composition) pid 5628: ['/home/rosbuild/ci_scripts/ws/build/composition/api_composition'] (all > console, InMemoryHandler: test_api_composition)
13:26:00 1: (load_talker_component) pid 5629: ['/home/rosbuild/ci_scripts/ws/build/composition/api_composition_cli', 'composition', 'composition::Talker'] (stderr > stdout, all > console)
13:26:00 1: (load_listener_component) pid 5630: ['/home/rosbuild/ci_scripts/ws/build/composition/api_composition_cli', 'composition', 'composition::Listener'] (stderr > stdout, all > console)
13:26:00 1: [load_talker_component] RTI Data Distribution Service Evaluation License issued to OSRF dthomas@osrfoundation.org For non-production use only.
13:26:00 1: [load_talker_component] Expires on 05-may-2018 See www.rti.com for more information.
13:26:00 1: [test_api_composition] RTI Data Distribution Service Evaluation License issued to OSRF dthomas@osrfoundation.org For non-production use only.
13:26:00 1: [test_api_composition] Expires on 05-may-2018 See www.rti.com for more information.
13:26:00 1: [load_listener_component] RTI Data Distribution Service Evaluation License issued to OSRF dthomas@osrfoundation.org For non-production use only.
13:26:00 1: [load_listener_component] Expires on 05-may-2018 See www.rti.com for more information.
13:26:00 1: [load_talker_component] [INFO] [api_composition_cli]: Sending request...
13:26:00 1: [load_talker_component] [INFO] [api_composition_cli]: Waiting for response...
13:26:00 1: [load_listener_component] [INFO] [api_composition_cli]: Sending request...
13:26:00 1: [load_listener_component] [INFO] [api_composition_cli]: Waiting for response...
13:26:00 1: [test_api_composition] [INFO] [api_composition]: Load library /home/rosbuild/ci_scripts/ws/install/composition/lib/liblistener_component.so
13:26:00 1: [test_api_composition] [INFO] [api_composition]: Instantiate class composition::Listener
13:26:00 1: [test_api_composition] RTI Data Distribution Service Evaluation License issued to OSRF dthomas@osrfoundation.org For non-production use only.
13:26:00 1: [test_api_composition] Expires on 05-may-2018 See www.rti.com for more information.
13:26:00 1: [load_listener_component] [INFO] [api_composition_cli]: Result of load_node: success = true
13:26:01 1: [load_talker_component] [INFO] [api_composition_cli]: Response not available, waiting again...
13:26:01 1: [test_api_composition] [INFO] [api_composition]: Load library /home/rosbuild/ci_scripts/ws/install/composition/lib/libtalker_component.so
13:26:01 1: [test_api_composition] [INFO] [api_composition]: Instantiate class composition::Talker
13:26:01 1: [test_api_composition] RTI Data Distribution Service Evaluation License issued to OSRF dthomas@osrfoundation.org For non-production use only.
13:26:01 1: [test_api_composition] Expires on 05-may-2018 See www.rti.com for more information.
13:26:01 1: [load_talker_component] [INFO] [api_composition_cli]: Result of load_node: success = true
13:26:02 1: [test_api_composition] [INFO] [talker]: Publishing: 'Hello World: 1'
13:26:02 1: [test_api_composition] [INFO] [listener]: I heard: [Hello World: 1]
13:26:02 1: [test_api_composition] signal_handler(2)
13:26:03 1: (load_listener_component) rc 0
13:26:04 1: (load_talker_component) rc 0
13:26:12 1: (test_api_composition) rc 0
13:26:12 1: () tear down
13:26:12 1: .
13:26:12 1: 
13:26:12 1:  generated xml file: /home/rosbuild/ci_scripts/ws/build/composition/test_results/composition/test_api_pubsub_composition__rmw_connext_cpp.xunit.xml 
13:26:12 1: ========================== 1 passed in 11.80 seconds ===========================
13:26:12 1: -- run_test.py: return code 0
13:26:12 1: -- run_test.py: verify result file '/home/rosbuild/ci_scripts/ws/build/composition/test_results/composition/test_api_pubsub_composition__rmw_connext_cpp.xunit.xml'
13:26:12       Test  #1: test_api_pubsub_composition__rmw_connext_cpp ..............   Passed   12.24 sec
@dhood
Copy link
Member Author

dhood commented May 23, 2018

This might be the same underlying cause for the flakiness of the test of the parameter events demo.

I ran a job to check if ros2/rclcpp#476 reduced flakiness in that test (which now I see was unlikely since that demo uses a low wait_for_service timeout, but anyway, might as well post the results). It times out because it hangs after shutdown is requested:

https://ci.ros2.org/job/ci_linux/4460/console

18:21:41 3: ../../../../build/demo_nodes_cpp/test_parameter_events_async__rmw_connext_cpp_.py (test_executable_0) pid 24778: ['/home/rosbuild/ci_scripts/ws/build/demo_nodes_cpp/parameter_events_async'] (all > console, InMemoryHandler: test_executable_0)
18:21:41 3: [test_executable_0] RTI Data Distribution Service Evaluation License issued to OSRF (OSRF01) dthomas@osrfoundation.org For non-production use only.
18:21:41 3: [test_executable_0] Expires on 5-Nov-2018 See www.rti.com for more information.
18:21:41 3: [test_executable_0] 
18:21:41 3: [test_executable_0] Parameter event:
18:21:41 3: [test_executable_0]  new parameters:
18:21:41 3: [test_executable_0]   foo
18:21:41 3: [test_executable_0]  changed parameters:
18:21:41 3: [test_executable_0]  deleted parameters:
18:21:41 3: [test_executable_0] 
18:21:41 3: [test_executable_0] 
18:21:41 3: [test_executable_0] Parameter event:
18:21:41 3: [test_executable_0]  new parameters:
18:21:41 3: [test_executable_0]   bar
18:21:41 3: [test_executable_0]  changed parameters:
18:21:41 3: [test_executable_0]  deleted parameters:
18:21:41 3: [test_executable_0] 
18:21:41 3: [test_executable_0] 
18:21:41 3: [test_executable_0] Parameter event:
18:21:41 3: [test_executable_0]  new parameters:
18:21:41 3: [test_executable_0]   baz
18:21:41 3: [test_executable_0]  changed parameters:
18:21:41 3: [test_executable_0]  deleted parameters:
18:21:41 3: [test_executable_0] 
18:21:41 3: [test_executable_0] 
18:21:41 3: [test_executable_0] Parameter event:
18:21:41 3: [test_executable_0]  new parameters:
18:21:41 3: [test_executable_0]   foobar
18:21:41 3: [test_executable_0]  changed parameters:
18:21:41 3: [test_executable_0]  deleted parameters:
18:21:41 3: [test_executable_0] 
18:21:41 3: [test_executable_0] 
18:21:41 3: [test_executable_0] Parameter event:
18:21:41 3: [test_executable_0]  new parameters:
18:21:41 3: [test_executable_0]  changed parameters:
18:21:41 3: [test_executable_0]   foo
18:21:41 3: [test_executable_0]  deleted parameters:
18:21:41 3: [test_executable_0] 
18:21:41 3: [test_executable_0] 
18:21:41 3: [test_executable_0] Parameter event:
18:21:41 3: [test_executable_0]  new parameters:
18:21:41 3: [test_executable_0]  changed parameters:
18:21:41 3: [test_executable_0]   bar
18:21:41 3: [test_executable_0]  deleted parameters:
18:21:41 3: [test_executable_0] 
18:21:41 3: [test_executable_0] signal_handler(2)
18:22:10     Test #3: test_tutorial_parameter_events_async__rmw_connext_cpp ...***Timeout  30.00 sec

@sloretz
Copy link

sloretz commented Mar 1, 2019

I haven't seen this failure on the 4 previous repeated nightlies on all platforms, so I'll optimistically close this as fixed.

@sloretz sloretz closed this as completed Mar 1, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants