
try "ctest --rerun-failed" for handling flaky tests #2204

Open
derekbruening opened this issue Feb 20, 2017 · 29 comments · Fixed by #6075

Comments

@derekbruening (Contributor)

For Travis #1962 and AppVeyor #2145 we may want to try adding --rerun-failed to ctest.

Will our output parsing handle it?

Is this new? Which version of ctest added it? Can we rely on it being there? I don't remember seeing it when I first started using ctest.

@derekbruening (Contributor Author)

OK, this is not what I thought it was. Xref https://gitlab.kitware.com/cmake/cmake/issues/16314. What we really want is the proposed "--retry-fail-until-pass N" option, which is not yet in the main branch.

@derekbruening (Contributor Author)

Xref #2984.

@derekbruening (Contributor Author)

We're using ctest_test(). Looks like in 3.17 it has support for retrying failed tests:

https://cmake.org/cmake/help/latest/command/ctest_test.html

REPEAT UNTIL_PASS:3 would retry up to 3 times.

This may greatly reduce our problems with flaky tests on some platforms.
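In a `-S` suite script, the call would look roughly like this minimal sketch (the variable name is the standard dashboard-script one, not our actual runsuite.cmake code):

```cmake
# CMake 3.17+: re-run each failed test up to 3 times; a pass on any
# attempt counts as an overall pass for that test.
ctest_test(BUILD "${CTEST_BINARY_DIRECTORY}"
           REPEAT UNTIL_PASS:3
           RETURN_VALUE test_rv)
```

The standalone ctest command line gained the equivalent `--repeat until-pass:3` flag in the same 3.17 release.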

@derekbruening derekbruening self-assigned this May 20, 2023
@derekbruening (Contributor Author)

I will try this out. We're currently requiring cmake 3.7. We should investigate the cmake versions on the GA CI VMs. Looks like the Ubuntu 20.04 image has cmake 3.26: https://github.com/actions/runner-images/blob/main/images/linux/Ubuntu2004-Readme.md. We don't use Ubuntu 18.04 anymore, but it looks like it ships cmake 3.10, so requiring 3.17 could affect some users.

@derekbruening (Contributor Author)

The GA CI VS2019 image also has 3.26 https://github.com/actions/runner-images/blob/main/images/win/Windows2019-Readme.md

We could conditionally enable this feature if 3.17+ is detected, instead of mandating a new min version.
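A conditional check in the suite script could look like this sketch (argument names are illustrative):

```cmake
# Only pass REPEAT when the ctest running this script is 3.17+,
# so older installations keep working without a new minimum version.
set(repeat_args "")
if(NOT CMAKE_VERSION VERSION_LESS "3.17")
  set(repeat_args REPEAT UNTIL_PASS:3)
endif()
ctest_test(BUILD "${CTEST_BINARY_DIRECTORY}" ${repeat_args})
```

In a `ctest -S` script, CMAKE_VERSION reflects the version of the ctest binary executing it, which is what matters for REPEAT support.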

derekbruening added a commit that referenced this issue May 20, 2023
If cmake 3.17+ is in use, enables retrying of failed tests up to 3x, with a pass on any attempt counting as an overall pass.  This avoids flaky tests marking the whole suite red, which is even more problematic with merge queues.

Tested:
I made a test which fails 3/4 of the time:
  --------------------------------------------------
  add_test(bogus bash -c "exit \$((RANDOM % 4))")
  --------------------------------------------------
I made it easy to run just this test (gave it a label; disabled other
builds, etc.) for convenience and then ran:
  --------------------------------------------------
  $ ctest -VV -S ../src/suite/runsuite.cmake,64_only
  --------------------------------------------------
Which resulted in:
  --------------------------------------------------
  test 1
      Start 1: bogus

  1: Test command: /usr/bin/bash "-c" "exit $((RANDOM % 4))"
  1: Working Directory: /usr/local/google/home/bruening/dr/git/build_suite/build_debug-internal-64
  1: Test timeout computed to be: 600
  1/1 Test #1: bogus ............................***Failed    0.00 sec
      Start 1: bogus

  1: Test command: /usr/bin/bash "-c" "exit $((RANDOM % 4))"
  1: Working Directory: /usr/local/google/home/bruening/dr/git/build_suite/build_debug-internal-64
  1: Test timeout computed to be: 600
      Test #1: bogus ............................   Passed    0.00 sec

  100% tests passed, 0 tests failed out of 1
  --------------------------------------------------

Issue: #2204, #5873
Fixes #2204
@derekbruening (Contributor Author)

Re-opening for two purposes:

  1. We still have some tests that fail 3x in a row, e.g. the scattergather assert (tool.drcachesim.scattergather test: `tracer.cpp:394: towrite <= ipc_pipe.get_atomic_write_size() && towrite > 0`, #5329) noted in #5873 (comment). We should track these and take some action on them (fix them, or add them to the auto-ignore list).
  2. While it is nice to have the commits and PR pages look greener, we would like to notice when new tests start failing intermittently. One idea is to add scripting to the post-merge Action to identify failures that later passed and send an email (even if the final result is a green check).

@derekbruening (Contributor Author)

scattergather once again failed 3x in a row for x86-32 with the pipe assert #5329: https://github.com/DynamoRIO/dynamorio/actions/runs/5063129492/jobs/9089390892?pr=6079

@derekbruening (Contributor Author)

scattergather once again failed 3x in a row for x86-32 with the pipe assert #5329 for the merge to master of #6076. So it is failing 3x in a row on a regular basis.

@derekbruening (Contributor Author)

PR #6069's merge to master failed for windows 32-bit https://github.com/DynamoRIO/dynamorio/actions/runs/5070388840/jobs/9105354355, but it's not clear why: the run seems truncated. Did it hit some 45-min time limit? I put a 40-min limit on the merge queue, but this is not the merge queue. The Github limit is 6 hours.

@derekbruening (Contributor Author)

Our hosted aarch64 runner was on cmake 3.16.3. I've upgraded it to 3.26.4.

@derekbruening (Contributor Author)

#6081 failed 3x in a row on win32

@derekbruening (Contributor Author)

tool.drcachesim.threads-with-config-file failed 3x in a row:
https://github.com/DynamoRIO/dynamorio/actions/runs/5075310358/jobs/9116397324?pr=6080
First it hit the type_is_instr assert (#3320), followed by two 90s timeouts: xref #4954, but that was release-build-only and supposedly fixed?

@derekbruening (Contributor Author)

For tool.drcachesim.threads-with-config-file: probably it was the pipe file left behind that caused the 2 subsequent hangs! So 3x in a row doesn't help for online drcachesim, which has no mechanism to clean up the pipe file. Should we try to add some cleanup to the suite, at least, if we can't add it to the drcachesim app, launcher, or client?
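One possible shape for suite-side cleanup, as a sketch only (the pipe location and naming pattern here are assumptions for illustration, not the actual drcachesim pipe path):

```cmake
# Before (re-)running online tests, delete any stale named pipes left
# behind by a crashed run so a retry doesn't hang reading a dead pipe.
file(GLOB stale_pipes "${CTEST_BINARY_DIRECTORY}/*.pipe")
if(stale_pipes)
  file(REMOVE ${stale_pipes})
endif()
```

Note that this wouldn't help between the automatic REPEAT retries of a single test, since ctest re-runs the test immediately without returning control to the script.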

@derekbruening (Contributor Author)

win32 tool.drcacheoff.gencode failed 3x in a row with a timeout each time: https://github.com/DynamoRIO/dynamorio/actions/runs/5217398902/jobs/9417192797

@ksco (Contributor) commented Jun 9, 2023

win32 tool.drcacheoff.gencode and tool.drcacheoff.burst_replace failed due to timeout again: https://github.com/DynamoRIO/dynamorio/actions/runs/5217720624/jobs/9417832232#step:6:19819

@derekbruening (Contributor Author)

These new timeouts are filed as #6131

@ksco (Contributor) commented Jun 9, 2023

I've tried multiple times; these timeouts seem pretty "stable".

@derekbruening (Contributor Author)

> I've tried multiple times; these timeouts seem pretty "stable".

Please move discussion to the dedicated issue #6131

@derekbruening (Contributor Author)

drcachesim.coherence failed 3x in a row for x86-32 on a master merge at https://github.com/DynamoRIO/dynamorio/actions/runs/5258228536/jobs/9502167966: the first failure was the type_is_instr assert (#3320), but the next two were timeouts (150s -- the test has a custom 150s timeout, though the prior passing run took just 4s).

@bete0 (Contributor) commented Jun 15, 2023

release-external-64 failed to build once in vs2019-builds due to Error copying file "/.../dynamorio/build_release-external-64/lib64/release/../drconfiglib.dll" to "/.../dynamorio/build_release-external-64/suite/tests/bin/drconfiglib.dll".: https://github.com/DynamoRIO/dynamorio/actions/runs/5279972848/jobs/9551416845#step:6:1914.

@derekbruening (Contributor Author)

> release-external-64 failed to build once in vs2019-builds due to Error copying file "/.../dynamorio/build_release-external-64/lib64/release/../drconfiglib.dll" to "/.../dynamorio/build_release-external-64/suite/tests/bin/drconfiglib.dll".: https://github.com/DynamoRIO/dynamorio/actions/runs/5279972848/jobs/9551416845#step:6:1914.

That's the build race #5888 which does not get a 3x retry.

@derekbruening (Contributor Author)

max-global hit its 240s timeout 3x in a row:
https://github.com/DynamoRIO/dynamorio/actions/runs/5341125248/jobs/9681644494

272/317 Test #265: code_api|tool.drcacheoff.max-global ..........................***Timeout 241.44 sec

@derekbruening (Contributor Author)

At https://github.com/DynamoRIO/dynamorio/actions/runs/5513552608/jobs/10051833149 we have a first failure with the #3320 assert in a drcachesim online tracing test:

295: Invalid trace entry type thread_exit (23) before a bundle
295: drcachesim: /home/runner/work/dynamorio/dynamorio/clients/drcachesim/reader/reader.cpp:215: virtual bool dynamorio::drmemtrace::reader_t::process_input_entry(): Assertion `type_is_instr(cur_ref_.instr.type) || cur_ref_.instr.type == TRACE_TYPE_INSTR_NO_FETCH' failed.
295/406 Test #295: code_api|tool.drcachesim.threads-with-config-file ................***Failed  Require

The next 2 retries both hit 90s timeouts -- presumably because of a stale pipe file! If we can't find any control point in DR itself to clean up a pipe file, maybe we should add one to the online tests through some wrapper, like a runmulti precmd or postcmd.

@ksco (Contributor) commented Jul 17, 2023

x86-64 code_api|tool.drcacheoff.invariant_checker failed 3 times in a row: https://github.com/DynamoRIO/dynamorio/actions/runs/5573200427/jobs/10180157101?pr=6209#step:7:26203

#6212

@derekbruening (Contributor Author)

We're now seeing the win32 invariant checker failing only on the merge queue with the message "Syscall marker not preceded by timestamp", while the pre-merge runs have all been green. We may disable the merge queue, at least temporarily.
