Flaky test in `rem/tests/test_inverse_confusion_matrix.py` #2431

purva-thakre · 2024-07-01T17:20:18Z

Noticed this failure twice in a CI run for Windows. If I rerun the test, it passes.

https://github.com/unitaryfund/mitiq/actions/runs/9748119578/job/26902405324?pr=2347#step:6:4326

purva-thakre · 2024-07-02T19:25:11Z

Noticed the same failure for windows-python 3.11

https://github.com/unitaryfund/mitiq/actions/runs/9766045422/job/26958119314?pr=2432#step:6:4325

purva-thakre · 2024-07-02T19:55:35Z

The same failure on windows-python3.12

https://github.com/unitaryfund/mitiq/actions/runs/9765921429/job/26957722256#step:6:4326

purva-thakre · 2024-07-10T01:49:47Z

I am unassigning myself from this because I cannot replicate it locally. My laptop does not have enough memory for local docker containers.

natestemen · 2024-08-06T16:11:21Z

Looks like this is also happening on ubuntu: https://github.com/unitaryfund/mitiq/actions/runs/10267476608/job/28408109210?pr=2452#step:6:4486

natestemen · 2024-08-06T22:06:06Z

Took a look at this in the mitiq coding call today and made some progress.

Problem

mitiq/mitiq/rem/tests/test_inverse_confusion_matrix.py

Lines 32 to 33 in 56ee173

    
    bitstrings = sample_probability_vector(np.array([0.5, 0.5]), 1000)
    assert isclose(sum([b[0] for b in bitstrings]), 500, rel_tol=0.1)

This test (in essence):

1. generates a random sequence of 1000 0s and 1s,
2. sums that sequence (as ints), and
3. asserts that the resulting sum is between 450 and 550.

Some stats tell us the sum should fall in this range 99.94% of the time, meaning that on average we should see a value outside this range about once every 1,667 runs.
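For anyone who wants to check the figure directly, here is a minimal sketch that computes the exact pass probability from the binomial distribution. It assumes `isclose` in the test is `math.isclose` (the `rel_tol` keyword suggests so); in that case the tolerance scales with the larger of the two values, so the accepted window is slightly wider above 500 than below. The names `N_SHOTS`, `TARGET`, `passing`, and `p_pass` are illustrative only.

```python
from math import comb, isclose

N_SHOTS = 1000  # number of samples drawn in the test
TARGET = 500    # expected number of 1s for a fair coin

# Sums that would pass the assertion, assuming `isclose` is `math.isclose`.
passing = [k for k in range(N_SHOTS + 1) if isclose(k, TARGET, rel_tol=0.1)]

# Exact Binomial(N_SHOTS, 1/2) probability of landing on a passing sum.
p_pass = sum(comb(N_SHOTS, k) for k in passing) / 2**N_SHOTS

print(f"passing sums: {passing[0]}..{passing[-1]}")
print(f"P(pass) = {p_pass:.6f}")
print(f"expected runs per failure = {1 / (1 - p_pass):.0f}")
```

This uses only the standard library, so it can be run without installing anything.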

Solutions

We considered the following solutions:

1. Removing the test. Since this is non-deterministic behavior and the test is mostly testing `np.random.choice`, it is not strictly needed.
2. Increasing the valid range of values (say, requiring the sum to fall between 400 and 600).
3. Setting a seed to make the sum deterministic.

Solution 1 was rejected because we still want this code's functionality to be tested with cases beyond the rudimentary ones elsewhere in this same test. Solution 2 was determined to not be a complete fix: the test could still fail, albeit much less often. Solution 3 was deemed the best solution.
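To illustrate what Solution 3 buys us, here is a minimal sketch of seeded sampling. `sample_bits` is a hypothetical stand-in written for this example, not mitiq's `sample_probability_vector`, and whether the real function accepts a seed is not something this sketch assumes. It only shows that a fixed seed makes the sum reproducible from run to run.

```python
import numpy as np


def sample_bits(n_samples, seed=None):
    """Hypothetical stand-in for the sampling step; not mitiq's API."""
    rng = np.random.default_rng(seed)
    # Draw fair bits, each wrapped in a list like a 1-qubit bitstring.
    draws = rng.choice([0, 1], size=n_samples, p=[0.5, 0.5])
    return [[int(b)] for b in draws]


# Same seed, same samples, so the two sums always agree.
first = sum(b[0] for b in sample_bits(1000, seed=0))
second = sum(b[0] for b in sample_bits(1000, seed=0))
print(first, second, first == second)
```

With the seed fixed, the assertion either always passes or always fails for a given NumPy version, which is what removes the flakiness.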

Diagnosis

We ran the failing test on my machine (macOS) on repeat 2000 times using the approach outlined here. It failed at roughly that cadence (1/2000 runs), which aligns with the math in the section above.
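The repeat-run experiment can also be approximated without the test suite. The sketch below is not the approach linked above; it is a standalone simulation, assuming the check is `math.isclose` with `rel_tol=0.1` and that each sample is an independent fair bit. `count_failures` is a hypothetical helper, and the seed is only there to make the simulation itself repeatable.

```python
import random
from math import isclose


def count_failures(n_runs, n_shots=1000, seed=None):
    """Simulate the assertion n_runs times and count how often it fails."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(n_runs):
        # Number of 1s among n_shots fair bits.
        ones = bin(rng.getrandbits(n_shots)).count("1")
        if not isclose(ones, n_shots / 2, rel_tol=0.1):
            failures += 1
    return failures


failures = count_failures(2000, seed=1234)
print(f"{failures} failures in 2000 runs")
```

Raising `n_runs` gives a tighter estimate of the failure rate, since a handful of failures in 2000 runs is a noisy measurement.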

natestemen · 2024-08-07T04:23:30Z

And just so no one else clashes here, I have a PR incoming.

nathanshammah · 2024-08-07T13:51:31Z

Great investigation!

purva-thakre added the bug and rem labels Jul 1, 2024

purva-thakre mentioned this issue Jul 2, 2024

make upload versions unique for TestPyPI #2436

Merged

purva-thakre mentioned this issue Jul 3, 2024

Flaky test on Windows for REM #2437

Closed

6 tasks

purva-thakre self-assigned this Jul 3, 2024

purva-thakre added this to the 0.39.0 milestone Jul 3, 2024

purva-thakre removed their assignment Jul 10, 2024

purva-thakre removed this from the 0.39.0 milestone Jul 10, 2024

natestemen mentioned this issue Jul 10, 2024

git ignore all coverage reports #2443

Merged

purva-thakre added the os: windows label Jul 21, 2024

purva-thakre mentioned this issue Jul 22, 2024

Separate docs build workflow #2441

Merged

6 tasks

natestemen mentioned this issue Aug 6, 2024

fix docstring formatting #2459

Merged

natestemen removed the os: windows label Aug 7, 2024

natestemen self-assigned this Aug 7, 2024

cosenal mentioned this issue Aug 8, 2024

Bump sphinx from 7.2.6 to 8.0.2 #2455

Merged

natestemen mentioned this issue Aug 9, 2024

Fix flaky REM test / refactor #2464

Merged

purva-thakre mentioned this issue Aug 13, 2024

Add tags to tutorials #2467

Merged

9 tasks

natestemen closed this as completed in #2464 Aug 15, 2024
