try "ctest --rerun-failed" for handling flaky tests #2204
Comments
OK this is not what I thought it was. Xref https://gitlab.kitware.com/cmake/cmake/issues/16314 |
Xref #2984. |
We're using ctest_test(). Looks like in 3.17 it has support for retrying failed tests: https://cmake.org/cmake/help/latest/command/ctest_test.html
This may greatly reduce our problems with flaky tests on some platforms. |
I will try this out. We're currently requiring cmake 3.7. We should investigate the cmake versions on the GA CI VMs. Looks like the Ubuntu20 image has cmake 3.26: https://github.com/actions/runner-images/blob/main/images/linux/Ubuntu2004-Readme.md. We don't use Ubuntu 18 anymore, but it looks like it has 3.10, so requiring 3.17 could affect some users. |
The GA CI VS2019 image also has 3.26 https://github.com/actions/runner-images/blob/main/images/win/Windows2019-Readme.md We could conditionally enable this feature if 3.17+ is detected, instead of mandating a new min version. |
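A minimal sketch of what conditional enabling could look like, assuming it lives in the script that ctest -S runs and that the script collects its ctest_test() arguments in a list. The variable name ctest_test_args is hypothetical; REPEAT UNTIL_PASS:<n> is the ctest_test() option added in CMake 3.17.

```cmake
# Sketch only: ctest_test_args is a hypothetical variable name.
set(ctest_test_args "")
if (NOT CMAKE_VERSION VERSION_LESS "3.17")
  # REPEAT was added to ctest_test() in 3.17. UNTIL_PASS:3 runs a failing
  # test up to 3 times in total and reports a pass if any attempt passes.
  list(APPEND ctest_test_args REPEAT UNTIL_PASS:3)
endif ()
# On older cmake the list is empty and each test runs just once.
ctest_test(${ctest_test_args})
```

Guarding on CMAKE_VERSION keeps the minimum required version at 3.7, so users on older cmake simply do not get retries.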
If cmake 3.17+ is in use, enables retrying of failed tests up to 3x with any one of them passing resulting in an overall pass. This avoids flaky tests marking the whole suite red, which is even more problematic with merge queues.

Tested: I made a test which fails 3/4 of the time:
--------------------------------------------------
add_test(bogus bash -c "exit \$((RANDOM % 4))")
--------------------------------------------------
I made it easy to run just this test (gave it a label; disabled other builds, etc.) for convenience and then ran:
--------------------------------------------------
$ ctest -VV -S ../src/suite/runsuite.cmake,64_only
--------------------------------------------------
Which resulted in:
--------------------------------------------------
test 1
    Start 1: bogus
1: Test command: /usr/bin/bash "-c" "exit $((RANDOM % 4))"
1: Working Directory: /usr/local/google/home/bruening/dr/git/build_suite/build_debug-internal-64
1: Test timeout computed to be: 600
1/1 Test #1: bogus ............................***Failed 0.00 sec
    Start 1: bogus
1: Test command: /usr/bin/bash "-c" "exit $((RANDOM % 4))"
1: Working Directory: /usr/local/google/home/bruening/dr/git/build_suite/build_debug-internal-64
1: Test timeout computed to be: 600
    Test #1: bogus ............................ Passed 0.00 sec

100% tests passed, 0 tests failed out of 1
--------------------------------------------------

Issue: #2204, #5873
Fixes #2204
Re-opening for two purposes:
|
scattergather once again failed 3x in a row for x86-32 with the pipe assert #5329: https://github.com/DynamoRIO/dynamorio/actions/runs/5063129492/jobs/9089390892?pr=6079 |
PR #6069's merge to master failed for windows 32-bit https://github.com/DynamoRIO/dynamorio/actions/runs/5070388840/jobs/9105354355 but it's not clear why: the run seems truncated. Did it hit some 45-min time limit? I put a 40-min limit on the merge queue but this is not the merge queue. The Github limit is 6 hours. |
Our hosted aarch64 runner was on cmake 3.16.3. I've upgraded it to 3.26.4. |
#6081 failed 3x in a row on win32 |
|
For the |
win32 |
win32 |
These new timeouts are filed as #6131 |
I've tried multiple times; these timeouts seem pretty "stable". |
Please move discussion to the dedicated issue #6131 |
drcachesim.coherence failed 3x in a row for x86-32 on a master merge at https://github.com/DynamoRIO/dynamorio/actions/runs/5258228536/jobs/9502167966: the first failure was the type_is_instr assert #3320, but the next two were timeouts (the test has a custom 150s timeout, though the prior pass took just 4s). |
|
That's the build race #5888 which does not get a 3x retry. |
max-global hit its 240s timeout 3x in a row:
|
max-global 240s timeout 3x in a row again: https://github.com/DynamoRIO/dynamorio/actions/runs/5403726034/jobs/9816933485 |
At https://github.com/DynamoRIO/dynamorio/actions/runs/5513552608/jobs/10051833149 we have a first failure #3320 assert in a drcachesim online tracing test:
The next 2 retries both hit 90s timeouts -- presumably because of a stale pipe file! If we can't find any control point to clean up a pipe file from DR itself maybe we should add one to the online tests through some wrapper like a runmulti precmd or postcmd. |
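One possible shape for such a cleanup step, as a sketch only: a small CMake script that a test wrapper could run as a pre- or post-command. The script name rm_stale_pipe.cmake and the variable name pipe are hypothetical, and how runmulti would invoke it is not shown; the caller would have to supply the actual pipe path.

```cmake
# Hypothetical helper: rm_stale_pipe.cmake
# Assumed invocation: cmake -D pipe=<path to the named pipe> -P rm_stale_pipe.cmake
if (NOT DEFINED pipe)
  message(FATAL_ERROR "pass -D pipe=<path to the named pipe>")
endif ()
if (EXISTS "${pipe}")
  # Remove a pipe file left behind by an earlier failed attempt so that a
  # retry does not block on it until the test timeout.
  message(STATUS "removing stale pipe file ${pipe}")
  file(REMOVE "${pipe}")
endif ()
```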
x86-64-ubuntu22 code_api|api.rseq failed 3 times in a row: https://github.com/DynamoRIO/dynamorio/actions/runs/5573405524/jobs/10180633906?pr=6210#step:7:5779 |
x86-64 code_api|tool.drcacheoff.invariant_checker failed 3 times in a row: https://github.com/DynamoRIO/dynamorio/actions/runs/5573200427/jobs/10180157101?pr=6209#step:7:26203 |
We're now seeing the win32 invariant checker failing only on the merge queue, with the message "Syscall marker not preceded by timestamp", while the pre-merge runs have all been green. We may disable the merge queue at least temporarily. |
For Travis #1962 and AppVeyor #2145 we may want to try adding --rerun-failed to ctest.
Will our output parsing handle it?
Is this new? Which version of ctest added it? Can we rely on it being there? I don't remember seeing it when I first started using ctest.