Sessions migration guide (#783)

Closes #595
Closes #1041
Closes #582
Closes #780
Closes #1156
Closes #577
Closes #576

- [x] Add text about job splitting, such as "If you split your workload into multiple jobs and run them in Batch mode, you can get results from individual jobs. You can, for example, decide to cancel the rest of the jobs if the earlier job results don't meet your expectations. If one of the jobs fail, you can also re-submit just that one instead of re-running the entire workload."

There are several new and existing topics that are impacted:

- This documentation section has several topics about execution modes: https://qiskit-docs-preview-pr-783.1799mxdls7qz.us-south.codeengine.appdomain.cloud/run/execution-modes
- This migration guide is new: https://qiskit-docs-preview-pr-783.1799mxdls7qz.us-south.codeengine.appdomain.cloud/api/migration-guides/sessions

---------

Co-authored-by: Ashley Silva <asarver1@gmail.com>
Co-authored-by: Jessie Yu <jessieyu@us.ibm.com>
Co-authored-by: abbycross <across@us.ibm.com>
Qiskit · Apr 29, 2024 · b0f3e82
1 parent 5ef433a
commit b0f3e82
Showing 22 changed files with 808 additions and 120 deletions.
diff --git a/docs/api/migration-guides/_toc.json b/docs/api/migration-guides/_toc.json
@@ -31,6 +31,10 @@
         "title": "Migrate to local simulators",
         "url": "/api/migration-guides/local-simulators"
       },
+      {
+        "title": "Execution modes changes",
+        "url": "/api/migration-guides/sessions"
+      },
       {
         "title": "Migrate to Qiskit Runtime",
         "children": [
@@ -60,6 +64,19 @@
           }
         ]
       },
+      {
+        "title": "Qiskit Runtime 0.20 changes",
+        "children": [
+          {
+            "title": "V2 primitives",
+            "url": "/api/migration-guides/v2-primitives"
+          },
+          {
+            "title": "qiskit_ibm_provider to qiskit_ibm_runtime",
+            "url": "/api/migration-guides/qiskit-runtime-from-provider"
+          }
+        ]
+      },
       {
         "title": "Qiskit 0.44 changes",
         "children": [

diff --git a/docs/api/migration-guides/sessions.mdx b/docs/api/migration-guides/sessions.mdx
@@ -0,0 +1,97 @@
+---
+title: Execution modes changes
+description: Learn about the changes to execution modes (sessions, batch, and single jobs)
+
+---
+
+<span id="execution-modes"></span>
+
+# Execution modes changes
+
+Utility-scale workloads can take many hours to complete, so it is important that both the classical and quantum resources are scheduled efficiently to streamline the execution. The improved execution modes provide more flexibility than ever in balancing the cost and time tradeoff to use resources optimally for your workloads.
+
+Workloads can be run as single jobs, sessions, or in a batch:
+
+- Use **session** mode for iterative workloads, or if you need dedicated access to the system
+- Use **batch** mode to submit multiple primitive jobs simultaneously to shorten processing time
+- Use **job** mode to submit a single primitive request for quick testing
+
+The following table summarizes the differences:
+
+|     Mode     |                  Usage                  |                                                                                                        Benefit                                                                                                        |
+|:------------:|:---------------------------------------:|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|
+| Job mode     | Quantum computation only.               | Easiest to use when running a small experiment. Might run sooner than batch mode.                                                                                                                                     |
+| Batch mode   | Quantum computation only.               | The entire batch of jobs is scheduled together and there is no additional queuing time for each. Jobs in a batch are usually run close together.                                                                     |
+| Session mode | Both classical and quantum computation. | Dedicated and exclusive access to the system during the session active window, and no other users’ or system jobs can run. This is particularly useful for workloads that don’t have all inputs ready at the outset.  |
+
+## Best practices
+
+To ensure the most efficient use of the execution modes, the following practices are recommended:
+
+- Always close your session, either by using a context manager or by specifying `session.close()`.
+- There is a fixed overhead associated with running a job. In general, if each job uses less than one minute of QPU time, consider combining several into one larger job (this applies to all execution modes). "QPU time" refers to time spent by the QPU complex to process your job.
+
+    A job's QPU time is listed in the **Usage** column on the IBM Quantum Platform [Jobs](https://quantum.ibm.com/jobs) page, or you can query it by using `qiskit-ibm-runtime` with this command: `job.metrics()["usage"]["quantum_seconds"]`.
+
+- If each of your jobs consumes more than one minute of QPU time, or if combining jobs is not practical, you can still run multiple jobs in parallel. Every job goes through both classical and quantum processing. While a QPU can process only one job at a time, classical processing can be done in parallel. You can take advantage of this by submitting multiple jobs in [batch](#divide) or [session](#two-vqe) execution mode.
+
+The above are general guidelines, and you should tune your workload to find the optimal ratio of job size to job count, especially when using sessions. For example, if you are using a session to get exclusive access to a backend, consider breaking up large jobs into smaller ones and running them in parallel. This might be more cost-effective because it can reduce wall clock time.
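
As a rough illustration of the one-minute guideline above, the following sketch groups small workloads until each group reaches about 60 seconds of estimated QPU time. The `plan_job_groups` helper and the sample numbers are hypothetical and are not part of `qiskit-ibm-runtime`; the dictionaries only mimic the `["usage"]["quantum_seconds"]` shape of the `job.metrics()` command mentioned earlier.

```python
# Hypothetical helper: group small workloads so each submitted job uses
# roughly a minute or more of QPU time. Not part of qiskit-ibm-runtime.

QPU_SECONDS_THRESHOLD = 60  # the "one minute" guideline from the text


def plan_job_groups(metrics_list, threshold=QPU_SECONDS_THRESHOLD):
    """Greedily merge consecutive workloads until each group reaches the threshold.

    Each item mimics job.metrics(): {"usage": {"quantum_seconds": <number>}}.
    Returns a list of groups, where each group is a list of workload indices.
    """
    groups, current, current_seconds = [], [], 0
    for index, metrics in enumerate(metrics_list):
        current.append(index)
        current_seconds += metrics["usage"]["quantum_seconds"]
        if current_seconds >= threshold:
            groups.append(current)
            current, current_seconds = [], 0
    if current:
        # Leftover workloads that never reached the threshold form a final group.
        groups.append(current)
    return groups


# Made-up QPU times (in seconds) from earlier runs of six workloads.
past_metrics = [{"usage": {"quantum_seconds": s}} for s in (20, 25, 30, 10, 15, 90)]
groups = plan_job_groups(past_metrics)
print(groups)
```

Each resulting group could then be submitted as one larger job. Real QPU times vary between runs, so treat the grouping as a starting point for tuning rather than a fixed rule.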
+
+
+## Sessions
+
+Sessions are designed for iterative workloads to avoid queuing delays between iterations. All sessions now run in *dedicated* mode, so that when running a session, you have exclusive access to the backend. Because of this, you are now charged for the total wall clock time that the system is reserved for your use. Additionally, sessions are now thread-safe. That is, you can run multiple workloads concurrently within a session.
+
+<Admonition type="note">
+Session execution mode is not supported in the Open Plan. Jobs will run in job mode instead.
+</Admonition>
+
+<span id="two-vqe"></span>
+### Example: Run two VQE algorithms in a session by using threading
+
+```python
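+from concurrent.futures import ThreadPoolExecutor
+from scipy.optimize import minimize  # assumed: the example uses SciPy's minimize
+from qiskit_ibm_runtime import Session, EstimatorV2 as Estimator
+
+# backend, cost_func, x0, ansatz, and hamiltonian are assumed to be defined already.
+def minimize_thread(estimator, method):
+    return minimize(cost_func, x0, args=(ansatz, hamiltonian, estimator), method=method)
+
+with Session(backend=backend), ThreadPoolExecutor() as executor:
+    estimator1 = Estimator()
+    estimator2 = Estimator()
+
+    # Add tags to differentiate jobs from different workloads.
+    estimator1.options.environment.job_tags = ["cobyla"]
+    estimator2.options.environment.job_tags = ["nelder-mead"]
+
+    # Submit both workloads before waiting so that they run concurrently.
+    cobyla_future = executor.submit(minimize_thread, estimator1, "cobyla")
+    nelder_mead_future = executor.submit(minimize_thread, estimator2, "nelder-mead")
+
+    cobyla_result = cobyla_future.result()
+    nelder_mead_result = nelder_mead_future.result()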
+```
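
One detail of the threading pattern is easy to miss: calling `.result()` directly on each `executor.submit(...)` blocks until that task finishes, so a second workload would not start until the first one is done. The standard-library sketch below submits both tasks before waiting on either. The `fake_minimize` function is a hypothetical stand-in for an optimizer loop and does not use Qiskit.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Both tasks must reach the barrier at the same time, which only happens
# if they really run concurrently.
barrier = threading.Barrier(2, timeout=5)


def fake_minimize(method):
    """Hypothetical stand-in for one optimizer loop that submits session jobs."""
    barrier.wait()  # raises BrokenBarrierError if the other task never starts
    return f"{method} finished"


with ThreadPoolExecutor(max_workers=2) as executor:
    # Submit everything first ...
    cobyla_future = executor.submit(fake_minimize, "cobyla")
    nelder_mead_future = executor.submit(fake_minimize, "nelder-mead")

    # ... and only then block on the results.
    cobyla_result = cobyla_future.result()
    nelder_mead_result = nelder_mead_future.result()

print(cobyla_result, nelder_mead_result)
```

If each `submit` call were followed immediately by `.result()`, the first task would wait at the barrier alone until the timeout expired, which shows that the two workloads would have run one after the other.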
+
+## Batch
+
+Batch mode lets you submit multiple primitive jobs simultaneously. When batching, classical processing is done in parallel. No session jobs, or jobs from another batch, can start while batch jobs are being processed; however, individual jobs might run between batch jobs.
+
+<span id="divide"></span>
+### Example: Partition a 500-circuit job into five 100-circuit jobs and run them in batch
+```python
+from qiskit_ibm_runtime import Batch, SamplerV2 as Sampler
+
+# backend and circuits (a list of 500 circuits) are assumed to be defined already.
+max_circuits = 100
+jobs = []
+start_idx = 0
+
+with Batch(backend=backend):
+    sampler = Sampler()
+    while start_idx < len(circuits):
+        end_idx = start_idx + max_circuits
+        # Each circuit in the slice becomes its own PUB in this job.
+        jobs.append(sampler.run(circuits[start_idx:end_idx]))
+        start_idx = end_idx
+```
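
The partitioning step on its own can be checked without a backend. The following minimal sketch uses a hypothetical `partition` helper and placeholder strings in place of real circuits to confirm that 500 items split into five chunks of 100.

```python
def partition(items, max_size):
    """Split items into consecutive chunks of at most max_size elements."""
    return [items[i : i + max_size] for i in range(0, len(items), max_size)]


# Placeholder strings stand in for the 500 circuits.
circuits = [f"circuit_{i}" for i in range(500)]
chunks = partition(circuits, max_size=100)

print(len(chunks), [len(chunk) for chunk in chunks])
```

When the total is not a multiple of the chunk size, the last chunk is simply shorter, so no circuit is dropped.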
+
+## Sessions versus batch usage
+
+Usage is a measurement of the amount of time the system is locked for your workload.
+
+* Session usage is the time from when the first job starts until the session goes inactive, is closed, or when its last job completes, whichever happens **last**.
+* Batch usage is the sum of quantum time of all jobs in the batch.
+* Single job usage is the quantum time the job uses in processing.
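
To make the difference between these definitions concrete, here is a small sketch that applies them to made-up timings. The `JobTiming` record, the two functions, and all numbers are hypothetical; this only restates the definitions above and is not how usage is actually metered.

```python
from dataclasses import dataclass


@dataclass
class JobTiming:
    """Hypothetical record of one job; times are seconds on a shared clock."""

    start: float
    end: float
    quantum_seconds: float


def session_usage(jobs, session_end):
    """Wall clock time from the first job start to whichever happens last:
    the session ending (closed or inactive) or the last job completing."""
    first_start = min(job.start for job in jobs)
    last_finish = max(job.end for job in jobs)
    return max(session_end, last_finish) - first_start


def batch_usage(jobs):
    """Sum of the quantum time of every job in the batch."""
    return sum(job.quantum_seconds for job in jobs)


jobs = [
    JobTiming(start=0, end=40, quantum_seconds=30),
    JobTiming(start=55, end=100, quantum_seconds=35),
]

# The session is closed 10 seconds after the last job completes.
print(session_usage(jobs, session_end=110))
print(batch_usage(jobs))
```

With these numbers, the session is charged for the idle gap between jobs and the time before it is closed, while the batch is charged only for the quantum time of its jobs.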
+
+![This image shows multiple sets of jobs.  One set is being run in session mode and the other is being run in batch mode.  For session mode, between each job is the interactive TTL (time to live).  The active window starts when the first job starts and ends after the last job is completed. After the final job of the first set of jobs completes, the active window ends and the session is paused (but not closed).  Another set of jobs then starts and jobs continue in a similar manner. The system is reserved for your use during the entire session.  For batch mode, the classical computation part of each job happens simultaneously, then all jobs are sent to the system.  The system is locked for your use from the time the first job reaches the system until the last job is done processing on the system.  There is no gap between jobs where the system is idle.](/images/run/execution-modes/SessionVsBatch.svg 'Sessions compared to batch')
diff --git a/docs/run/_toc.json b/docs/run/_toc.json
@@ -44,6 +44,9 @@
       "title": "Execution modes",
       "children": [
         {
+          "title": "Introduction to execution modes",
+          "url": "/run/execution-modes"
+        },{
           "title": "About sessions",
           "url": "/run/sessions"
         },
@@ -54,6 +57,10 @@
         {
           "title": "Run jobs in a batch",
           "url": "/run/run-jobs-batch"
+        },
+        {
+          "title": "FAQs",
+          "url": "/run/execution-modes-faq"
         }
       ]
     },

diff --git a/docs/run/execution-modes-faq.mdx b/docs/run/execution-modes-faq.mdx
@@ -0,0 +1,177 @@
+---
+title: Execution modes FAQs
+description: Answers to commonly asked questions about Qiskit Runtime execution modes
+
+---
+# Qiskit Runtime execution modes FAQs
+
+<details>
+  <summary>
+    Does Qiskit Runtime local testing mode support different execution modes?
+  </summary>
+
+Local testing mode supports the syntax for the different execution modes, but because there is no scheduling involved when testing locally, the modes are ignored.
+</details>
+
+<details>
+  <summary>
+    How many jobs can run in parallel for a specific backend?
+  </summary>
+
+The number of jobs running in parallel is based on the degree of parallelism configured for the backend, which is five for most backends today.
+</details>
+
+
+## Sessions
+
+<details>
+  <summary>
+    What happens to my jobs if a session is closed?
+  </summary>
+
+If you are using the `Session` class in `qiskit-ibm-runtime`:
+
+  - `Session.close()` means the session no longer accepts new jobs, but existing jobs run to completion.
+  - `Session.cancel()` cancels all pending session jobs.
+
+If you are using the REST API directly:
+
+  - `PATCH /sessions/{id}` with `accepting_jobs=False` means the session no longer accepts new jobs, but existing jobs run to completion.
+  - `DELETE /sessions/{id}/close` cancels all pending session jobs.
+</details>
+
+<details>
+  <summary>
+    If I am using session mode and expect my experiment to take many hours, is there a way to ask for calibrations to happen?
+  </summary>
+
+No. On-demand calibration is not available.
+</details>
+
+<details>
+  <summary>
+    Is there an interactive timeout (ITTL) with session mode?
+  </summary>
+
+Yes. This reduces unwanted cost if a user forgets to close their session.
+</details>
+
+<details>
+  <summary>
+    How does session usage impact IBM Quantum Network members who are not billed by usage?
+  </summary>
+
+IBM Quantum Network members gain reserved capacity on IBM Quantum&trade; systems. Usage is deducted from this capacity, and hubs with lower capacity have longer queueing times.
+</details>
+
+<details>
+  <summary>
+    Do I get the same parallelism in session mode that I get with batch mode?
+  </summary>
+
+Yes. If you submit multiple jobs simultaneously in a session, these jobs will run in parallel.
+</details>
+
+<details>
+  <summary>
+    Can sessions be interrupted by system jobs?
+  </summary>
+
+No.  Sessions run in dedicated mode, which means that the user has total access to the backend.  Sessions are never interrupted by system jobs, such as calibrations or software upgrades.
+</details>
+
+<details>
+  <summary>
+    Is compilation time counted as usage in session mode?
+  </summary>
+
+Yes.  In session mode, usage is the wall clock time the system is **committed to the session**. It starts when the first session job starts and ends when the session goes inactive, is closed, or when the last job completes, whichever happens **last**. Thus, usage continues to accumulate after a session ends if the system is still running a job. Additionally, time after a job completes while the system waits for another session job (the interactive time to live (ITTL)) counts as usage. This is why you should ensure the session is closed as soon as you are done submitting jobs to it.
+</details>
+
+## Batch
+
+<details>
+  <summary>
+    How many jobs run in parallel in batch mode?
+  </summary>
+
+The number of jobs running in parallel is based on the degree of parallelism configured for the backend, which is five for most backends. However, the number of concurrent jobs in an active batch could be lower because there could be other jobs already running when the batch becomes active.
+</details>
+
+<details>
+  <summary>
+    How is running _N_ PUBs in job mode different from running _N_ single-PUB jobs in batch mode?
+  </summary>
+
+The main difference is the time and cost tradeoff:
+
+Batch mode:
+
+- The total run time is usually shorter because the classical processing can run in parallel.
+- There is a slight overhead for running each job, so you end up paying a little more for batched jobs. This overhead correlates with the size of the job. For example, the total usage of two jobs, each containing 40 100x100 circuits, is six seconds more than a single job containing 80 circuits.
+- Because batch mode doesn't give you exclusive access to a backend, jobs inside a batch might run with other users' jobs or system calibration jobs.
+- If some jobs fail, you still get results from the completed jobs.
+- You can take action in the middle of a batch workload based on the results of completed jobs. For example, you can cancel the rest of the jobs if the initial results look incorrect.
+
+Job mode:
+
+- The total run time is likely to be higher because there is no parallelism.
+- You don't pay for the extra per-job overhead associated with batch workloads.
+- All of your circuits will run together.
+- If this single job fails, you don't get partial results.
+- Your job might hit the system limit if it contains too many circuits or if the circuits are too large.
+
+In general, if each of your jobs consumes less than a minute of QPU time, consider combining them into a larger job (this applies to all execution modes).
+</details>
+
+<details>
+  <summary>
+    How many jobs can I submit in a batch?
+  </summary>
+
+There is no limit on how many jobs you can submit in a batch. There are, however, limits on how much usage your jobs can consume based on your plan.
+</details>
+
+<details>
+  <summary>
+    When would my batch mode jobs run in parallel with other users' jobs?
+  </summary>
+
+The degree of parallelism configured for a backend is also called "execution lanes". If there are one or more execution lanes available, and your batch jobs are next in line to be run, the scheduler starts enough jobs to fill the lanes. Similarly, if your batch doesn't have enough jobs to fill the lanes, the scheduler starts other users' jobs.
+
+Example: The backend you choose has five execution lanes, and two of them are currently occupied by other users' jobs. Your batch of six jobs is next in line to be run.
+
+Because there are three available lanes, the scheduler starts three of your six batch jobs. It continues to start jobs in your batch as jobs finish and execution lanes become available. If a lane becomes available and there are no more jobs in your batch, the scheduler starts the next job in line.
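
The arithmetic in this example can be written as a short sketch. The `start_batch_jobs` function is hypothetical and models only the lane counting described above, not the actual scheduler.

```python
def start_batch_jobs(total_lanes, occupied_lanes, queued_batch_jobs):
    """Return (jobs started now, jobs still waiting) for a batch at the head of the queue."""
    free_lanes = max(total_lanes - occupied_lanes, 0)
    started = min(free_lanes, queued_batch_jobs)
    return started, queued_batch_jobs - started


# Five execution lanes, two already in use, and a batch of six jobs next in line.
started, waiting = start_batch_jobs(total_lanes=5, occupied_lanes=2, queued_batch_jobs=6)
print(started, waiting)
```

The remaining jobs start one at a time as lanes free up, which a caller could model by invoking the function again with the updated counts.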
+</details>
+
+<details>
+  <summary>
+    Do all of my batch jobs need to wait in the queue?
+  </summary>
+
+Because QPUs are limited and shared resources, all jobs need to wait in the queue. However, when the first job in your batch starts running, all the other jobs in that batch essentially jump to the front of the queue and are prioritized by the scheduler.
+</details>
+
+<details>
+  <summary>
+    Does a batch end automatically when the last associated job ends?
+  </summary>
+
+Yes. However, there is a slight overhead associated with this auto-detection, so you should always close your batch and session.
+</details>
+
+<details>
+  <summary>
+    Can batches be interrupted by system jobs?
+  </summary>
+
+Yes.  Batch workloads might be interrupted by system jobs, such as calibrations or software upgrades.
+</details>
+
+<details>
+  <summary>
+    Is compilation time counted as usage in batch mode?
+  </summary>
+
+No.  In batch mode, only time spent on the quantum hardware counts as usage.
+</details>