qa_volk_32f_s32f_convertpuppet_8u is flaky #646

argilo · 2023-10-13T16:08:38Z

CI often fails, and the culprit seems to be qa_volk_32f_s32f_convertpuppet_8u. This was added in #617.

The failure can be demonstrated locally:

$ while ctest -R convertpuppet --output-on-failure; do :; done
Test project /home/argilo/git/volk/build
    Start 49: qa_volk_32f_s32f_convertpuppet_8u
1/1 Test #49: qa_volk_32f_s32f_convertpuppet_8u ...   Passed    0.04 sec

100% tests passed, 0 tests failed out of 1

Total Test time (real) =   0.05 sec
Test project /home/argilo/git/volk/build
    Start 49: qa_volk_32f_s32f_convertpuppet_8u
1/1 Test #49: qa_volk_32f_s32f_convertpuppet_8u ...   Passed    0.03 sec

100% tests passed, 0 tests failed out of 1

Total Test time (real) =   0.03 sec
Test project /home/argilo/git/volk/build
    Start 49: qa_volk_32f_s32f_convertpuppet_8u
1/1 Test #49: qa_volk_32f_s32f_convertpuppet_8u ...   Passed    0.03 sec

100% tests passed, 0 tests failed out of 1

Total Test time (real) =   0.04 sec
Test project /home/argilo/git/volk/build
    Start 49: qa_volk_32f_s32f_convertpuppet_8u
1/1 Test #49: qa_volk_32f_s32f_convertpuppet_8u ...   Passed    0.03 sec

100% tests passed, 0 tests failed out of 1

Total Test time (real) =   0.03 sec
Test project /home/argilo/git/volk/build
    Start 49: qa_volk_32f_s32f_convertpuppet_8u
1/1 Test #49: qa_volk_32f_s32f_convertpuppet_8u ...   Passed    0.03 sec

100% tests passed, 0 tests failed out of 1

Total Test time (real) =   0.03 sec
Test project /home/argilo/git/volk/build
    Start 49: qa_volk_32f_s32f_convertpuppet_8u
1/1 Test #49: qa_volk_32f_s32f_convertpuppet_8u ...   Passed    0.03 sec

100% tests passed, 0 tests failed out of 1

Total Test time (real) =   0.03 sec
Test project /home/argilo/git/volk/build
    Start 49: qa_volk_32f_s32f_convertpuppet_8u
1/1 Test #49: qa_volk_32f_s32f_convertpuppet_8u ...   Passed    0.03 sec

100% tests passed, 0 tests failed out of 1

Total Test time (real) =   0.03 sec
Test project /home/argilo/git/volk/build
    Start 49: qa_volk_32f_s32f_convertpuppet_8u
1/1 Test #49: qa_volk_32f_s32f_convertpuppet_8u ...   Passed    0.03 sec

100% tests passed, 0 tests failed out of 1

Total Test time (real) =   0.03 sec
Test project /home/argilo/git/volk/build
    Start 49: qa_volk_32f_s32f_convertpuppet_8u
1/1 Test #49: qa_volk_32f_s32f_convertpuppet_8u ...   Passed    0.03 sec

100% tests passed, 0 tests failed out of 1

Total Test time (real) =   0.03 sec
Test project /home/argilo/git/volk/build
    Start 49: qa_volk_32f_s32f_convertpuppet_8u
1/1 Test #49: qa_volk_32f_s32f_convertpuppet_8u ...   Passed    0.03 sec

100% tests passed, 0 tests failed out of 1

Total Test time (real) =   0.03 sec
Test project /home/argilo/git/volk/build
    Start 49: qa_volk_32f_s32f_convertpuppet_8u
1/1 Test #49: qa_volk_32f_s32f_convertpuppet_8u ...   Passed    0.03 sec

100% tests passed, 0 tests failed out of 1

Total Test time (real) =   0.03 sec
Test project /home/argilo/git/volk/build
    Start 49: qa_volk_32f_s32f_convertpuppet_8u
1/1 Test #49: qa_volk_32f_s32f_convertpuppet_8u ...   Passed    0.03 sec

100% tests passed, 0 tests failed out of 1

Total Test time (real) =   0.03 sec
Test project /home/argilo/git/volk/build
    Start 49: qa_volk_32f_s32f_convertpuppet_8u
1/1 Test #49: qa_volk_32f_s32f_convertpuppet_8u ...   Passed    0.03 sec

100% tests passed, 0 tests failed out of 1

Total Test time (real) =   0.03 sec
Test project /home/argilo/git/volk/build
    Start 49: qa_volk_32f_s32f_convertpuppet_8u
1/1 Test #49: qa_volk_32f_s32f_convertpuppet_8u ...***Failed    0.03 sec
RUN_VOLK_TESTS: volk_32f_s32f_convertpuppet_8u(131071,1)
generic completed in 2.02759 ms
u_avx2_fma completed in 0.161178 ms
a_avx2_fma completed in 0.160147 ms
u_avx2 completed in 0.16983 ms
a_avx2 completed in 0.169176 ms
u_sse2 completed in 0.273428 ms
a_sse2 completed in 0.273223 ms
u_sse completed in 0.615186 ms
a_sse completed in 0.613146 ms
offset 564 in1: 110 in2: 111 tolerance was: 0
volk_32f_s32f_convertpuppet_8u: fail on arch u_avx2_fma
offset 564 in1: 110 in2: 111 tolerance was: 0
volk_32f_s32f_convertpuppet_8u: fail on arch a_avx2_fma
Best aligned arch: a_avx2_fma
Best unaligned arch: u_avx2_fma


0% tests passed, 1 tests failed out of 1

Total Test time (real) =   0.03 sec

The following tests FAILED:
	 49 - qa_volk_32f_s32f_convertpuppet_8u (Failed)
Errors while running CTest


daniestevez · 2023-10-13T16:11:29Z

Hmm. I'll try to look into this in more detail, but the first place I would start is probably #617 (comment)

Edit: It would be nice to be able to see the input vector when the test fails. I suspect that this is caused by values that come out very close to a half integer (and so they might differ in rounding).

argilo · 2023-10-13T17:10:37Z

I added a bit of extra debug logging, and it does appear to be rounding differences that are causing the failures.

offset 25788 in: 0.230887 in1: 204 in2: 203 tolerance was: 0
volk_32f_s32f_convertpuppet_8u: fail on arch u_avx2_fma
offset 25788 in: 0.230887 in1: 204 in2: 203 tolerance was: 0
volk_32f_s32f_convertpuppet_8u: fail on arch a_avx2_fma

$ python3
>>> 0.230887 * 327.0 + 128
203.500049

327.0 is the default scalar value used in tests:

volk/lib/testqa.cc, line 30 in a26a1b8:

lv_32fc_t def_scalar = 327.0;

argilo · 2023-10-13T17:25:10Z

The test framework is very rigid, so the only workable fix I can see is to increase the tolerance to 1. I see you already suggested that here: #617 (comment)
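As a rough illustration of what raising the tolerance from 0 to 1 means for the comparison, here is a minimal Python sketch. The function name `first_mismatch` and its signature are hypothetical and are not part of VOLK's test framework; the sample values are the output pairs reported in the logs above.

```python
def first_mismatch(reference, candidate, tolerance):
    """Return the first offset where the outputs differ by more than
    `tolerance`, or None if every element is within tolerance.

    Hypothetical helper for illustration only, not VOLK's QA code.
    """
    for offset, (ref, cand) in enumerate(zip(reference, candidate)):
        if abs(ref - cand) > tolerance:
            return offset
    return None


# Output pairs taken from the failure logs: (110, 111) and (204, 203).
generic_out = [110, 204]
fma_out = [111, 203]

print(first_mismatch(generic_out, fma_out, tolerance=0))  # 0
print(first_mismatch(generic_out, fma_out, tolerance=1))  # None
```

With a tolerance of 1, an off-by-one result caused by rounding near a half integer no longer counts as a failure, while larger discrepancies are still caught.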

argilo · 2023-10-13T17:38:05Z

PR: #647

daniestevez · 2023-10-13T17:51:14Z

The debugging you did shows that I was on the right track in the original PR regarding rounding of values close to a half integer. It kind of makes sense that FMA behaves differently from non-FMA in hardware: maybe FMA carries more precision internally before doing the add, or it needs a different kind of rounding to meet performance. I also have the impression that this does not behave the same on all machines, because on my AMD machine I could not make it fail. I'll try once again with your loop just to make sure.
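One way to see how a fused and a non-fused multiply-add can disagree near a half integer is to emulate both in software. The Python sketch below is only an emulation under stated assumptions: float32 arithmetic with round-to-nearest-even, a final round-to-nearest-integer step, the 327.0 scalar, and the 128 offset from the calculation earlier in the thread. The helper names and the input value are hypothetical; the input was picked so that it prints as 0.230887, and it is not claimed to be the exact value from the failing run, nor is this VOLK's kernel code.

```python
import struct


def to_f32(value: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", value))[0]


SCALE = 327.0  # default test scalar mentioned above
BIAS = 128.0   # assumed offset, from the "+ 128" in the calculation above

# Hypothetical float32 input chosen for illustration (exactly representable).
x = 15494552 / 2**26


def unfused(sample: float) -> int:
    """Multiply, round to float32, add, round to float32 again."""
    product = to_f32(sample * SCALE)
    return round(to_f32(product + BIAS))


def fused(sample: float) -> int:
    """Emulated FMA: exact multiply-add, then a single float32 rounding.

    For this input the binary64 arithmetic is exact, so the only
    rounding happens inside to_f32.
    """
    return round(to_f32(sample * SCALE + BIAS))


print(f"{x:.6f}")   # 0.230887
print(unfused(x))   # 204
print(fused(x))     # 203
```

In the non-fused path the intermediate product is rounded up slightly, the sum then lands on exactly 203.5, and the tie rounds to 204. In the emulated fused path the unrounded sum sits just below 203.5 and rounds to 203, matching the 204 versus 203 pattern in the log.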

daniestevez · 2023-10-13T18:15:20Z

I've run the test loop for 20 minutes on my AMD machine and it never failed. It's a Ryzen 7 5800X. So apparently there's not even a spec about how the FMA rounding should behave.

argilo mentioned this issue Oct 13, 2023

Fix flaky qa_volk_32f_s32f_convertpuppet_8u #647

Merged

jdemel closed this as completed in #647 Oct 20, 2023

argilo mentioned this issue Oct 22, 2023

qa_volk_32f_x2_dot_prod_16i is flaky #669

Closed
