
rostest: fix flaky hztests #1661

Merged: 58 commits merged into ros:melodic-devel on Mar 27, 2019

Conversation

beetleskin (Contributor)

Let's hope this relieves some of the evergreen CI failures.

@@ -6,19 +6,6 @@
<param name="hztest1/hzerror" value="0.5" />
<param name="hztest1/test_duration" value="5.0" />
<param name="hztest1/wait_time" value="21.0" />
beetleskin (Contributor, Author):

this is quite a long wait time, and also the default one, iirc - reduce to 5?


<!-- Below also works:
beetleskin (Contributor, Author):

pointless comment, yes this looks like it also works, but why should this be in here?

@beetleskin (Contributor, Author)

oh the irony ..

@beetleskin (Contributor, Author)

it would be nice if we could re-trigger the rostests a couple of times to see whether the changes result in more stable hztests

@beetleskin (Contributor, Author) commented Mar 17, 2019

Alright, out of 19 checks with the same code, I got a couple of fails, but no hztest-related ones.

Are those also some evergreens in here?

@cwecht (Contributor) commented Mar 18, 2019

The jenkins internal failures occur from time to time. I'm not aware of one of the other failures occurring regularly. I really appreciate this PR. It would be nice if the other flaky tests could be fixed as well in the future, but I would like this to be merged as soon as possible.

@dirk-thomas (Member) commented Mar 18, 2019

@beetleskin I appreciate your testing effort and contribution. But please limit the amount of triggered PR jobs at the same time to a smaller number (<<10) otherwise this lets other jobs wait a long time before being processed.

It would be nice if the other flaky tests could be fixed as well in the future

Just for the record: this patch does not fix the flakiness of the test but retries them multiple times in the hope that they pass once. This patch adds the retry to 19 tests atm. The downside of this approach is that the time to run the tests increases. So this approach is not scalable. Actually fixing the flakiness of the test would be much better.

is there some effort going on to use the pipeline plugin for jenkins?

No.

@beetleskin (Contributor, Author) commented Mar 18, 2019

please limit the amount of triggered PR jobs at the same time to a smaller number (<<10)

I had a look at the queue, there wasn't much going on; and I did it in 3 batches with hours in between. But you're right, I should reduce the number next time.

Just for the record: this patch does not fix the flakyness of the test but retries them multiple times in the hope that they pass once.

From the doc, this is the way to go:

Number of times to retry the test before it is considered a failure. Default is 0. This option is useful for stochastic processes that are sometimes expected to fail.

.. and due to the nature of ROS, and the CI system it runs on, we must expect them to fail.

This patch adds the retry to 19 tests atm. The downside of this approach is that the time to run the tests increases.

For the case that a PR changes a published topic such that it can no longer meet its expected rate, yes. We're trading a little bit of time for a much better ROC statistic. For the case that unrelated PRs fail due to those unstable tests, this will improve the whole CI process and save developer nerves, reviewer time and CI hardware resources. And therefore: time.
Besides that, I set some wait_time params to lower values (the default is the ridiculous value of 21, which, I assume, was chosen that high because of CI in the first place?). Especially when expecting a zero rate, the node was waiting 21 seconds just to confirm that...
I could reduce this param for other tests as well. Or reduce the default to something reasonable. Or both. What do you think?
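
For illustration, a minimal hztest launch-file sketch that combines both knobs discussed here, the retry attribute and a reduced wait_time; the publisher node, topic and values are hypothetical and not taken from this PR:

<launch>
  <!-- hypothetical publisher; rospy_tutorials' talker publishes chatter at roughly 10 Hz -->
  <node pkg="rospy_tutorials" type="talker" name="talker" />

  <!-- hztest reads its parameters from its private namespace, here /hztest1 -->
  <param name="hztest1/topic" value="chatter" />
  <param name="hztest1/hz" value="10.0" />
  <param name="hztest1/hzerror" value="0.5" />
  <param name="hztest1/test_duration" value="5.0" />
  <!-- reduced from the 21 s default discussed above -->
  <param name="hztest1/wait_time" value="5.0" />

  <!-- retry: rostest re-runs the test up to two more times before marking it as failed -->
  <test test-name="hztest_talker" pkg="rostest" type="hztest" name="hztest1" retry="2" />
</launch>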

So this approach is not scalable. Actually fixing the flakiness of the test would be much better.

Right, let's rewrite ros_core and ros_comm then ;)
Honestly, I don't see any way of guaranteeing those tests, even with dedicated hardware, a realtime OS, and whatever else is needed to make asynchronous inter-process communication deterministic. ROS itself makes it pointless to spend much thought on this. Or am I missing something? Do you think there is a better way, i.e. improving hztest itself?

And I think this is as scalable as it gets. Given your CI system and ROS, the overhead of work generated by those flaky tests is much bigger than the potential (!) increase in test time.

is there some effort going on to use the pipeline plugin for jenkins?
No.

That's unfortunate.

@beetleskin (Contributor, Author) commented Mar 18, 2019

Updated stats:
Total tests: 54

@dirk-thomas (Member)

We're trading a little bit of time for a much better ROC statistic. For the case that unrelated PRs fail due to those unstable tests, this will improve the whole CI process and save developer nerves, reviewer time and CI hardware resources. And therefore: time.

I am not saying this is not a feasible temporary workaround. I just mentioned that it would be a better solution to make the test non-flaky instead. E.g. 10min of additional CI time per job costs quite a bit when being done all the time. Obviously developer nerves and reviewer time are more valuable though.

I could reduce this param for other tests as well. Or reduce the default to something reasonable. Or both. What do you think?

The answer is probably different for each case. So I can't suggest something generic.

Do you think there is a better way, i.e. improve hztest itself?

The test itself could accept more relaxed timings, but obviously that is also what it aims to catch. It is hard to write non-flaky performance tests which pass in a variety of environments.
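
For a rough sense of what "more relaxed" would look like, widening the tolerance instead of retrying would amount to something like the following; the values are purely hypothetical and assume a nominal rate of 10 Hz:

  <!-- accept anything between 8 and 12 Hz instead of 9.5 to 10.5 Hz -->
  <param name="hztest1/hzerror" value="2.0" />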

is there some effort going on to use the pipeline plugin for jenkins?
No.
That's unfortunate.

It is simply a huge endeavor with very little benefit - so as long as nobody requests this feature (and is also willing to cover for the necessary development time) I doubt that is going to happen. Contributions towards that direction are always welcome though 😉

@beetleskin (Contributor, Author)

So .. anything missing for merging this?

@cwecht (Contributor) commented Mar 26, 2019

I don't think that this PR can be merged as it is. The 50 'trigger ci hook' commits should not go into master.

@beetleskin (Contributor, Author)

I don't think that this PR can be merged as it is. The 50 'trigger ci hook' commits should not go into master.

According to @dirk-thomas, they're going to be squashed into one commit.

@beetleskin (Contributor, Author)

I don't think that this PR can be merged as it is. The 50 'trigger ci hook' commits should not go into master.

However, @dirk-thomas might want to have a look at the generated merge-commit-messages: those commits might be part of the commit-msg body, similar to 40d3ca4

@dirk-thomas (Member)

Thank you for these improvements!

@dirk-thomas dirk-thomas merged commit a2876a1 into ros:melodic-devel Mar 27, 2019
@beetleskin beetleskin deleted the fix_flaky_hztests branch March 27, 2019 17:03
tahsinkose pushed a commit to tahsinkose/ros_comm that referenced this pull request Apr 15, 2019
* rostest: fix flaky hztests

* add retry to all hztests

* fix concerns

* fix more wrong retry-attributes