Explore possible regression in local simulation performance #1631
Comments
This issue is related to my comment at Qiskit/qiskit-addon-cutting#552 (comment) and the discussion that followed. To reproduce, run any of our CKT tutorials with a fake backend that has more than 100 qubits.
I identified the root cause of the performance regression. We updated Primitives V2 so that SamplerV2 and EstimatorV2 can handle a different number of shots or a different precision for each pub. As a potential solution, I am considering revising BackendSamplerV2 and BackendEstimatorV2 to combine pubs with identical shot or precision settings.

Here is a script to show the performance regression:

    from timeit import timeit

    from qiskit_aer import AerSimulator
    from qiskit_ibm_runtime.fake_provider import FakeSherbrooke

    from qiskit import QuantumCircuit
    from qiskit.primitives import BackendSampler, BackendSamplerV2
    from qiskit.transpiler.preset_passmanagers import generate_preset_pass_manager

    # Simulator that mimics the 127-qubit FakeSherbrooke device.
    backend = AerSimulator.from_backend(FakeSherbrooke())
    shots = 10000
    num_copies = 10


    def gen_circuit(num_qubits: int, reps: int):
        """Build a small layered circuit of H and CX gates with measurements."""
        qc = QuantumCircuit(num_qubits)
        for _ in range(reps):
            qc.h(range(num_qubits))
            for i in range(0, num_qubits - 1, 2):
                qc.cx(i, i + 1)
        qc.measure_all()
        return qc


    def bench_sampler_v1(qc: QuantumCircuit):
        print("\nBackendSamplerV1")
        sampler = BackendSampler(backend)
        print(f"{timeit(lambda: sampler.run([qc] * num_copies, shots=shots).result(), number=1)} sec")


    def bench_sampler_v2(qc: QuantumCircuit):
        print("\nBackendSamplerV2")
        sampler = BackendSamplerV2(backend=backend)
        print(f"{timeit(lambda: sampler.run([qc] * num_copies, shots=shots).result(), number=1)} sec")


    # Transpile a 5-qubit circuit for the large backend, then time both samplers.
    qc = gen_circuit(5, 5)
    pm = generate_preset_pass_manager(optimization_level=2, backend=backend)
    qc2 = pm.run(qc)
    bench_sampler_v1(qc2)
    bench_sampler_v2(qc2)

Output (main branch of Qiskit):
I'm working on a PR to address this issue: Qiskit/qiskit#12291
Since Qiskit/qiskit#12291 was merged, we need to port it here too.
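For readers following the thread, the idea of combining pubs with identical shot settings can be sketched roughly as below. This is only an illustration of the grouping approach described in the comments, not the code from Qiskit/qiskit#12291. The `Pub` class and the `run_batch` callable are hypothetical stand-ins, not Qiskit APIs; a real implementation would presumably submit each group to the backend as a single job.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Any, Callable, List, Sequence


@dataclass(frozen=True)
class Pub:
    """Hypothetical stand-in for a sampler pub: a circuit plus its shot count."""

    circuit: Any
    shots: int


def run_grouped(
    pubs: Sequence[Pub],
    run_batch: Callable[[List[Any], int], List[Any]],
) -> List[Any]:
    """Run pubs that share a shot count in one batch, preserving input order.

    `run_batch(circuits, shots)` is an assumed callable that executes all the
    given circuits in a single job and returns one result per circuit.
    """
    # Map each distinct shot count to the positions of the pubs that use it.
    groups = defaultdict(list)
    for index, pub in enumerate(pubs):
        groups[pub.shots].append(index)

    # One batch call per distinct shot count instead of one call per pub.
    results: List[Any] = [None] * len(pubs)
    for shots, indices in groups.items():
        batch_results = run_batch([pubs[i].circuit for i in indices], shots)
        # Scatter the batch results back to their original positions.
        for i, result in zip(indices, batch_results):
            results[i] = result
    return results
```

With this shape, ten copies of the same circuit at the same shot count would trigger a single `run_batch` call rather than ten, which is the kind of saving the benchmark script above is meant to expose.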
Describe the bug
A user has noticed a dramatic slowdown in SamplerV2 with FakeSherbrooke:
A demo of ~150 4-qubit circuits has gone from taking a few seconds to over 3 minutes on their machine.
Switching to 5-qubit FakeManila speeds this back up.
A quick timing profile shows multi-threading code taking a long time. Could this be transpiler or Aer related?
Steps to reproduce
@caleb-johnson may be able to produce a reproducer. Otherwise, try something like the script above.
Expected behavior
Small circuits on a large device are just as fast as those same circuits on a small device.