
Mock hardware when testing notebooks #1184

Merged (23 commits), May 2, 2024

Conversation

@frankharkins (Member) commented Apr 16, 2024

This uses the approach discussed in #1174 to test notebooks that submit jobs to IBM Quantum.

The script behaviour has changed a bit:

  • We now have a list of notebooks_no_mock; these could include notebooks with large circuits, or notebooks that demonstrate features not available on the fake backend. Future work could partially test these using the approach in #1173.
  • By default, it runs all notebooks except notebooks_no_mock, with the patched package.
  • --submit-jobs will run all notebooks, without the patched package.
  • --only-unmockable will run only notebooks_no_mock, without the patched package.
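The mocking approach can be sketched roughly as follows. The class and function names here are hypothetical stand-ins; the real script patches qiskit-ibm-runtime so that hardware lookups return a local fake backend:

```python
from unittest import mock

# Hypothetical stand-in for the runtime service; the actual script
# patches qiskit-ibm-runtime's QiskitRuntimeService.
class RuntimeService:
    def least_busy(self, **filters):
        raise RuntimeError("would contact IBM Quantum")

class FakeBackend:
    """Local simulator standing in for real hardware."""
    name = "fake_backend"

def mocked_least_busy(self, **filters):
    # Ignore the filters and hand back a local fake backend.
    return FakeBackend()

# While the patch is active, notebook code asking for the least busy
# device transparently gets the simulator instead.
with mock.patch.object(RuntimeService, "least_busy", mocked_least_busy):
    backend = RuntimeService().least_busy(operational=True, min_num_qubits=5)
    print(backend.name)  # fake_backend
```

Because the patch is applied at the service level, notebook code itself needs no changes to run in mocked mode.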

The following notebooks do not work with this patch, but some of them might not have worked anyway.

  • tutorials/variational-quantum-eigensolver/notebook.ipynb
  • tutorials/submitting-transpiled-circuits/notebook.ipynb
  • tutorials/build-repitition-codes/notebook.ipynb

@frankharkins frankharkins linked an issue Apr 16, 2024 that may be closed by this pull request
frankharkins and others added 3 commits April 16, 2024 17:49
Co-authored-by: Eric Arellano <14852634+Eric-Arellano@users.noreply.github.com>
Co-authored-by: Eric Arellano <14852634+Eric-Arellano@users.noreply.github.com>
@frankharkins frankharkins marked this pull request as ready for review April 16, 2024 17:16
@Eric-Arellano (Collaborator) left a comment

Great work, Frank!

It would be good to start running this change in CI. It's fine if you prefer that to be a follow-up. For the broken notebooks, we can exclude them with an ignore list and then open an issue to fix them.

frankharkins and others added 2 commits April 17, 2024 16:39
Co-authored-by: Eric Arellano <14852634+Eric-Arellano@users.noreply.github.com>
@frankharkins (Member, Author)

It would be good to start running this change in CI.

This will take effect as soon as the PR is merged; is that OK? Re: the ignore list: we only run changed notebooks on PRs, so the broken notebooks shouldn't block anyone who isn't working on them.

@Eric-Arellano (Collaborator)

The code looks good, but let's wait to merge this until we decide how to deal with Becky's concern that some of our output will leak that we're running on simulators.

[screenshot]

Maybe we should not apply this patch to certain allow-listed notebooks, like the ones about how to interact with hardware? Or we could allow-list which notebooks should be mocked? Unconditionally using simulators seems like it could cause issues.

@frankharkins (Member, Author)

This feature should only be used to test code and catch bugs, not to generate notebook outputs. It does give you the option to write the simulator results to the notebook, but I don't think that's especially dangerous. We could write a test for it if needed.

@Eric-Arellano (Collaborator)

This feature should only be used to test code and catch bugs, not to generate notebook outputs.

But we instruct people to get output from the PR CI build:

documentation/README.md, lines 127 to 131 (at 124e143):

When you make a pull request with a changed notebook, you can get a version of
that notebook that was executed in a uniform environment from CI. To do this,
click "Show all checks" in the info box at the bottom of the pull request page
on GitHub, then choose "Details" for the "Test notebooks" job. From the job
page, click "Summary", then download "Executed notebooks".

And when they run locally with tox, it will be using the patch. We have two conflicting goals:

  1. Test in every PR that the notebook has not broken
  2. Provide a way to get the output, ideally using hardware, either locally or from a CI artifact

Perhaps we should consider something like this?

  • Make --patch an argument to the script, so by default we still run on hardware.
  • In CI for PRs, test that notebooks work with --patch: at least changed notebooks, and maybe all notebooks if it isn't too slow.
  • In CI for PRs, also run changed notebooks on hardware if they aren't too slow, so that you can get the output of the notebook from CI.
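The proposed --patch flag could look something like this in the script's command-line interface. This is only a sketch of the proposal; the merged PR ultimately exposed --submit-jobs as the un-patched mode rather than a --patch flag:

```python
import argparse

# Sketch of the proposed interface (hypothetical flag names).
parser = argparse.ArgumentParser(prog="test-notebook")
parser.add_argument("notebooks", nargs="*", help="notebooks to execute")
parser.add_argument(
    "--patch", action="store_true",
    help="replace hardware calls with a local fake backend",
)
parser.add_argument(
    "--write", action="store_true",
    help="write executed outputs back to the notebook files",
)

args = parser.parse_args(["tutorials/example/notebook.ipynb", "--patch"])
print(args.patch)  # True
```

With store_true flags, running on real hardware stays the default and mocking is opt-in, which is the behavior this bullet proposes.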

@frankharkins (Member, Author)

We have two conflicting goals:

  1. Test in every PR that the notebook has not broken
  2. Provide a way to get the output, ideally using hardware, either locally or from a CI artifact

This PR (mostly) solves (1). It doesn't regress on (2), as we don't run job-submitting notebooks in CI anyway. Writers can still get hardware outputs locally using --submit-jobs --write, same as before.

I'll update the README with the caveat about getting a notebook from CI. Maybe I can emit a warning whenever the patched least_busy is called so we don't accidentally commit the results to CI.
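A warning like the one described here could look roughly like this (a hypothetical sketch, not the actual implementation in the script):

```python
import warnings

class FakeBackend:
    """Local simulator standing in for real hardware (illustrative)."""
    name = "fake_backend"

def patched_least_busy(self=None, **filters):
    # Warn loudly so simulator results aren't mistaken for hardware
    # output and accidentally committed from a CI artifact.
    warnings.warn(
        "least_busy() is patched: returning a local simulator, not "
        "real IBM Quantum hardware. Do not commit these outputs.",
        stacklevel=2,
    )
    return FakeBackend()

# Demonstrate that the warning fires when the patched call runs.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    backend = patched_least_busy()
print(backend.name, len(caught))
```

The warning shows up in the notebook's stderr output, making a mocked run easy to spot during review.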

  • Make --patch an argument to the script, so by default we still run on hardware.
  • In CI for PRs, test that notebooks work with --patch: at least changed notebooks, and maybe all notebooks if it isn't too slow.
  • In CI for PRs, also run changed notebooks on hardware if they aren't too slow, so that you can get the output of the notebook from CI.

The first two bullets are basically what this PR does; the argument to not patch is just --submit-jobs. The last bullet is sadly not feasible IMO: it's really unlikely a job will run in a period of time acceptable for PR CI, and jobs would also use up a lot of device time if someone pushes commits regularly.

@frankharkins (Member, Author)

I'm hesitant to make submitting jobs the default as it'd be easy for a writer to unknowingly send lots of jobs to IBM Quantum, and if they kill the script while waiting, those jobs would continue to run.

@Eric-Arellano (Collaborator)

It doesn't regress on 2 as we don't run job-submitting notebooks in CI anyway.

Non-job-submitting notebooks will now run with the patched simulator. For example:

`service.least_busy(operational=True, min_num_qubits=5)`

I wasn't very clear in differentiating between submitting jobs and using Runtime without submitting a job. It seems we have a few categories of notebooks:

  • Don't submit a job but use Runtime, like get-backend-information.ipynb. These should probably not be patched.
  • Submit a job that can be simulated. These should be patched for the sake of testing, but it sounds like we want the output to come from actual hardware.
  • Submit a job that cannot be simulated. We cannot patch these, so we can only verify them by submitting a job; output needs to come from hardware too.

This PR will regress for the first category of notebooks, like get-backend-information.ipynb. Before, they ran with real hardware in CI and you could download the executed notebook to get the output. Now, they will use the patch.

--

What if we had our CI reflect these 3 categories?

  • Notebooks that don't submit a job: don't patch them, and upload their output to CI. You get validation and output.
  • Notebooks that submit a simulatable job: patch them, but don't upload their output to CI. You get validation; for output, you must run them locally using --submit-jobs.
  • Notebooks that submit a non-simulatable job: ignore them in PR builds; run them in a cron job and upload those artifacts.

In the script itself, notebook authors would need to categorize each notebook into one of the three buckets.
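A hypothetical shape for that categorization in the tester's TOML config. The key names and the first two paths here are illustrative only; the real file's schema may differ:

```toml
# Illustrative only: key names and most paths are hypothetical.
[notebooks]
# No jobs submitted: run unpatched, upload output from CI.
normal = ["docs/get-backend-information.ipynb"]

# Jobs a fake backend can simulate: patched in PR CI, no output uploaded.
submit-jobs = ["tutorials/example-simulatable/notebook.ipynb"]

# Jobs too big to simulate: skipped in PRs, run in the cron job.
no_mock = ["tutorials/variational-quantum-eigensolver/notebook.ipynb"]
```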

Wdyt?

@frankharkins frankharkins marked this pull request as draft April 18, 2024 16:06
@frankharkins (Member, Author) commented Apr 26, 2024

This is now working (including cells with backend.name; see c89e972 🤦).

[Screenshot: 2024-04-26 14:45]

@frankharkins frankharkins marked this pull request as ready for review April 26, 2024 13:30
@Eric-Arellano (Collaborator) left a comment

Nice! Is my understanding correct?

  • If the notebook is not listed in the TOML file, we don't patch it, and we do write the output.
  • If the notebook is in submit-jobs:
    • if you use --submit-jobs, we use Runtime and --write will work;
    • if you don't use --submit-jobs, we use simulators and --write will be ignored.
  • If the notebook is in no_mock:
    • it is always skipped unless you use --submit-jobs or --only-unmockable.
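That decision table could be sketched as a small pure function. This is a hypothetical reconstruction of the behavior described above, not the script's actual code:

```python
def plan_run(notebook, groups, submit_jobs=False, only_unmockable=False):
    """Decide how to execute a notebook, per the rules above (sketch)."""
    if notebook in groups.get("no_mock", ()):
        # Unmockable notebooks run for real or not at all.
        if submit_jobs or only_unmockable:
            return "run-unpatched"
        return "skip"
    if only_unmockable:
        return "skip"
    if notebook in groups.get("submit-jobs", ()):
        # Patched by default; --submit-jobs switches to real hardware.
        return "run-unpatched" if submit_jobs else "run-patched"
    # Not listed in the TOML file: never patched, output always written.
    return "run-unpatched"

groups = {"submit-jobs": ["a.ipynb"], "no_mock": ["b.ipynb"]}
print(plan_run("a.ipynb", groups))  # run-patched
print(plan_run("b.ipynb", groups))  # skip
print(plan_run("c.ipynb", groups))  # run-unpatched
```

Keeping the dispatch as a pure function of (notebook, config, flags) makes each bucket's behavior easy to unit-test.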

Sgtm. The README should probably be updated too. Maybe mention how we use simulation in CI to test notebooks that submit jobs, but don't upload their output?

When you make a pull request with a changed notebook, you can get a version of that notebook that was executed in a uniform environment from CI. To do this, click "Show all checks" in the info box at the bottom of the pull request page on GitHub, then choose "Details" for the "Test notebooks" job. From the job page, click "Summary", then download "Executed notebooks".

--

Also, the PR description could use an update.

@Eric-Arellano (Collaborator) left a comment

Great work!

@@ -30,7 +30,7 @@ jobs:
       - name: Execute notebooks
         timeout-minutes: 350
-        run: tox -- --write --only-submit-jobs
+        run: tox -- --write --only-unmockable
Collaborator:

In our meeting, we were planning on this running all submit-jobs notebooks so that we can ensure they work on hardware + download the output.

Member Author:

I've changed the option back to --only-submit-jobs, which will run all job-submitting notebooks without the mock backend.

frankharkins and others added 5 commits May 1, 2024 16:37
Co-authored-by: Eric Arellano <14852634+Eric-Arellano@users.noreply.github.com>
We only really use this in the cron job, in which we want to run all
job-submitting notebooks.
Co-authored-by: Eric Arellano <14852634+Eric-Arellano@users.noreply.github.com>
@frankharkins frankharkins enabled auto-merge May 2, 2024 15:50
@frankharkins frankharkins added this pull request to the merge queue May 2, 2024
Merged via the queue into main with commit a44f5aa May 2, 2024
2 checks passed
@frankharkins frankharkins deleted the FH/mock-runtime branch May 2, 2024 15:52
frankharkins added a commit to frankharkins/documentation that referenced this pull request Jul 22, 2024
Successfully merging this pull request may close these issues.

Wildcard: allow running our notebooks with simulators?