Multiple test failures when running tests -j12 #432

mgorny · 2022-12-28T13:49:24Z

Describe the bug
When I'm running the test suite with ctest -j12 (i.e. 12 parallel jobs), I'm getting 2-3 different test failures in a run. Over a few runs, the following tests failed:

	289 - test_fill_special (Failed)
	291 - test_frame_get_offsets (SEGFAULT)
	706 - test_schunk_frame (Failed)
	707 - test_schunk_header (Failed)
	709 - test_sframe (Failed)
	710 - test_sframe_lazychunk (Failed)

Segfaults are especially concerning.

To Reproduce

mkdir build
cd build
cmake .. -G Ninja -DCMAKE_INSTALL_PREFIX=/usr -DBUILD_STATIC=OFF -DBUILD_TESTS=yes -DBUILD_BENCHMARKS=OFF -DBUILD_EXAMPLES=OFF -DBUILD_FUZZERS=OFF -DDEACTIVATE_ZLIB=no -DDEACTIVATE_ZSTD=no -DPREFER_EXTERNAL_LZ4=ON -DPREFER_EXTERNAL_ZLIB=ON -DPREFER_EXTERNAL_ZSTD=ON -DCMAKE_BUILD_TYPE=RelWithDebInfo
ninja
ctest -j12

Expected behavior
Tests should pass when run in parallel.

Logs
LastTest.log from the last run: LastTest.log

System information:

OS: Gentoo Linux amd64
Compiler: gcc 12.2.1
Version: 2.6.1


DimitriPapadopoulos · 2023-02-15T21:14:21Z

I am able to reproduce segfaults even with plain ctest, without -j12:

$ ctest
Test project /my/path/c-blosc2/build
[...]
          Start 1736: b2nd_example_serialize
1736/1736 Test #1736: b2nd_example_serialize ....................................   Passed    0.00 sec

99% tests passed, 1 tests failed out of 1736

Label Time Summary:
b2nd    =   0.50 sec*proc (8 tests)

Total Test time (real) =  53.04 sec

The following tests FAILED:
	1703 - test_lz4_bitshuffle_n (SEGFAULT)
Errors while running CTest
Output from these tests are in: /my/path/c-blosc2/build/Testing/Temporary/LastTest.log
Use "--rerun-failed --output-on-failure" to re-run the failed cases verbosely.
$ 
$ ctest --rerun-failed --output-on-failure
Test project /my/path/c-blosc2/build
    Start 1703: test_lz4_bitshuffle_n
1/1 Test #1703: test_lz4_bitshuffle_n ............   Passed    0.41 sec

100% tests passed, 0 tests failed out of 1

Total Test time (real) =   0.45 sec
$

As you can see, in my case, errors seem to differ between ctest runs. Do tests fail consistently for you, or “randomly” as in my case?

OS : Ubuntu 22.04
Compiler : GCC 11.3.0
Version : main branch
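
Since the failure above vanished on the rerun, one way to chase a sporadic failure is to rerun a single test until it fails. The helper below is a minimal sketch, not part of c-blosc2; the function name run_until_fail is made up for illustration. Newer ctest releases (CMake 3.17 and later, as far as I know) offer a similar built-in option, --repeat until-fail:<n>.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to N times and stop at the
# first failing run.
# Usage: run_until_fail N command [args...]
run_until_fail() {
    max=$1
    shift
    i=1
    while [ "$i" -le "$max" ]; do
        if ! "$@"; then
            echo "run $i of $max failed" >&2
            return 1
        fi
        i=$((i + 1))
    done
    echo "all $max runs passed" >&2
    return 0
}

# Example, run from the build directory:
# run_until_fail 20 ctest -R test_lz4_bitshuffle_n --output-on-failure
```

The helper returns non-zero as soon as one run fails, so the ctest output of the failing run is the last thing printed.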

FrancescAlted · 2023-02-16T16:28:18Z

Today we have fixed something that may have created this: ca9d7c6

Could you give it another go?

mgorny · 2023-02-16T17:09:36Z

I can still reproduce.

FrancescAlted · 2023-02-16T17:40:39Z

Sorry, I was not explicit enough; I meant without parallelism (just ctest). Fixing ctest -j12 will require more work (although it is not a high priority).

DimitriPapadopoulos · 2023-02-17T07:37:54Z

I do not see segfaults without -j12 any more – but in that case segfaults were sporadic.

bnavigator · 2023-02-24T04:41:19Z

Still an issue with 2.7.1 and -j$N with N>1

keszybz · 2023-05-13T16:37:39Z

I'm seeing this too, with c51d050 and v2.9.1. Most of the time there are test failures, but occasionally segfaults. I haven't captured a coredump yet.

The following tests FAILED:
302 - test_copy (Failed)
311 - test_frame_offset (Failed)
726 - test_schunk_header (Failed)
1722 - test_example_frame_offset (Failed)

The following tests FAILED:
302 - test_copy (Failed)
308 - test_fill_special (Failed)
310 - test_frame_get_offsets (Failed)
311 - test_frame_offset (Failed)
1315 - test_example_frame_simple (Failed)

The following tests FAILED:
11 - test_b2nd_copy (Failed)
302 - test_copy (Failed)

…

The failure rate is 100% (i.e. at least one test fails in every run) on multiple machines.

DimitriPapadopoulos · 2023-05-13T18:59:29Z

Tests could be modified to be run in a debugger. To get GDB to automatically print a backtrace in case of a crash:

gdb --batch --ex run --ex bt --args ./myprogram "$@" > gdb-backtrace.txt 2>&1

The above runs GDB in batch mode (--batch) and tells it to run the program (--ex run) and print a backtrace (--ex bt) if it crashes. The output is redirected to a file called gdb-backtrace.txt.

That said:

- How to get ctest to run tests in the debugger as suggested above?
- Tests running in the debugger might not crash.
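
On the first question, one possible workaround is to bypass ctest and run a single test executable under GDB by hand. This is a minimal sketch: the function name gdb_bt and the GDB override variable are made up for illustration, and the test binary path in the example is hypothetical. Running ctest -V -R <name> should print the actual command line of a test.

```shell
#!/bin/sh
# Hypothetical wrapper: run a program under GDB in batch mode and save
# the output, including a backtrace if it crashes, to a log file.
# Usage: gdb_bt LOGFILE program [args...]
# Set the GDB variable to use a different gdb binary (assumption: not
# a convention of c-blosc2 or ctest).
gdb_bt() {
    log=$1
    shift
    "${GDB:-gdb}" --batch --ex run --ex bt --args "$@" > "$log" 2>&1
}

# Example with a hypothetical test binary path (adjust to the build tree):
# gdb_bt gdb-backtrace.txt ./tests/test_frame_get_offsets
```

When several tests run in parallel, each invocation needs its own log file name, otherwise the backtraces overwrite each other.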
