Implementing runtime parameter binding #1901

Merged
merged 23 commits into from
Oct 6, 2023

Conversation

doichanj
Collaborator

Summary

This is a new feature to optimize GPU simulation of a single circuit with multiple parameter sets.
Before this PR, binding parameters to a single circuit produced multiple circuits, one per parameter set. This PR instead binds parameters to each gate at runtime on a single circuit that is simulated with multiple shots, so the simulation can be sped up by batched execution on the GPU.

Details and comments

Runtime parameter binding attaches a list of parameter values to each operation, and the value used for each shot is selected at runtime.
This option changes the parallelization from multi-experiment to multi-shot.
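
The difference between the two strategies can be sketched in plain Python. This is only a conceptual illustration, not Aer's implementation: the single-qubit RY chain and the helper names (`apply_ry`, `multi_experiment`, `runtime_binding`) are assumptions made up for the example. It shows that building one bound circuit per parameter set and running one circuit whose operations each hold a list of values, indexed per shot, give the same results.

```python
import math


def apply_ry(state, theta):
    """Apply RY(theta) to a single-qubit state with real amplitudes (a0, a1)."""
    c, s = math.cos(theta / 2.0), math.sin(theta / 2.0)
    a0, a1 = state
    return (c * a0 - s * a1, s * a0 + c * a1)


def run_bound_circuit(angles):
    """Simulate one fully bound circuit: a chain of RY gates applied to |0>."""
    state = (1.0, 0.0)
    for theta in angles:
        state = apply_ry(state, theta)
    return state[1] ** 2  # probability of measuring |1>


def multi_experiment(param_sets):
    """Before: one bound circuit per parameter set, each simulated separately."""
    circuits = [list(p) for p in param_sets]
    return [run_bound_circuit(c) for c in circuits]


def runtime_binding(param_sets):
    """After: one circuit; each op carries a list of values, selected per shot."""
    num_ops = len(param_sets[0])
    # ops[g] holds the values of gate g for every parameter set
    ops = [[p[g] for p in param_sets] for g in range(num_ops)]
    # one state per shot; on a GPU these would form a batch
    states = [(1.0, 0.0)] * len(param_sets)
    for values in ops:
        # each shot picks its own parameter value when the op is executed
        states = [apply_ry(s, values[shot]) for shot, s in enumerate(states)]
    return [s[1] ** 2 for s in states]


param_sets = [(0.0, 0.0), (math.pi / 2, math.pi / 2), (0.3, 0.4)]
before = multi_experiment(param_sets)
after = runtime_binding(param_sets)
print(before)
print(after)
```

In the second form the loop over operations runs once and every shot advances together, which is the shape that batched GPU execution can exploit.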

@doichanj doichanj requested a review from hhorii August 22, 2023 07:13
@doichanj doichanj added the enhancement New feature or request label Aug 22, 2023
@doichanj doichanj added this to the Aer 0.13.0 milestone Aug 22, 2023
@mtreinish mtreinish added the Changelog: New Feature Include in the Added section of the changelog label Aug 22, 2023
Guogggg commented Sep 6, 2023

I downloaded your code, compiled qiskit-aer from source, and set up the estimator as follows:

noiseless_estimator = AerEstimator(
    backend_options={"method": "statevector","device":"GPU","runtime_parameter_bind_enable":True,"batched_shots_gpu":True},
    run_options={"shots": None},
    approximation=True,
)

However, I see no significant speedup in GPU run time. Here is my code:

import numpy as np
import sys

from qiskit_nature.units import DistanceUnit
from qiskit_nature.second_q.drivers import PySCFDriver
from qiskit_nature.second_q.mappers import JordanWignerMapper, ParityMapper
from qiskit.algorithms.minimum_eigensolvers import VQE
from qiskit.algorithms.minimum_eigensolvers import MinimumEigensolver
from qiskit.algorithms.optimizers import SLSQP
from qiskit.algorithms.optimizers import SPSA
from qiskit.algorithms.optimizers import L_BFGS_B

from qiskit.primitives import BackendEstimator
from qiskit_aer import AerSimulator
from qiskit_nature.second_q.circuit.library import HartreeFock, UCCSD
from qiskit_nature.second_q.algorithms import GroundStateEigensolver
import time
from qiskit_aer.primitives import Estimator as AerEstimator
from qiskit_nature.settings import  QiskitNatureSettings
QiskitNatureSettings.use_pauli_sum_op = False
print(sys.path)
dis = 1.5
driver = PySCFDriver(
    atom="Li 0 0 0; H 0 0 " + str(dis),
    basis="sto3g",
    charge=0,
    spin=0,
    unit=DistanceUnit.ANGSTROM,
)

es_problem = driver.run()

mapper_jw = ParityMapper()

ansatz = UCCSD(
    es_problem.num_spatial_orbitals,
    es_problem.num_particles,
    mapper_jw,
    initial_state=HartreeFock(
        es_problem.num_spatial_orbitals,
        es_problem.num_particles,
        mapper_jw,
    ),
)

noiseless_estimator = AerEstimator(
    backend_options={"method": "statevector","device":"GPU","runtime_parameter_bind_enable":True,"batched_shots_gpu":True},
    run_options={"shots": None},
    approximation=True,
)
vqe_solver = VQE(
    noiseless_estimator, ansatz, optimizer=SLSQP()
)

vqe_solver.initial_point = [0] * ansatz.num_parameters
st = time.time()
fermionic_op = es_problem.hamiltonian.second_q_op()
qubit_op = mapper_jw.map(fermionic_op)    # 12-qubit
result = vqe_solver.compute_minimum_eigenvalue(qubit_op)

calc = GroundStateEigensolver(mapper_jw, vqe_solver)
res = calc.solve(es_problem)
energy = result.eigenvalue + res.nuclear_repulsion_energy
print(energy)
print("time used: ",time.time()-st)


@doichanj
Collaborator Author

I tested the script above. The timings (in seconds, as printed by the script) are as follows:

  • CPU w/o runtime parameter binding : 454.241402387619
  • CPU w/runtime parameter binding : 476.4489688873291
  • GPU w/o runtime parameter binding : 380.32021927833557
  • GPU w/runtime parameter binding : 359.3335130214691

I observed a small speedup (about 5.5% less wall time) from runtime parameter binding on GPU, but I think most of the time in this case is spent outside the simulator.
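
One way to check how much of the wall time is spent inside the simulator is to wrap the simulation call in a timer and compare its accumulated time with the total. The sketch below uses only the standard library; `CallTimer`, `fake_simulate`, and `fake_optimizer_loop` are hypothetical stand-ins for illustration, not part of Qiskit or Aer.

```python
import time


class CallTimer:
    """Wrap a callable and accumulate the wall-clock time spent inside it."""

    def __init__(self, func):
        self.func = func
        self.elapsed = 0.0
        self.calls = 0

    def __call__(self, *args, **kwargs):
        start = time.perf_counter()
        try:
            return self.func(*args, **kwargs)
        finally:
            self.elapsed += time.perf_counter() - start
            self.calls += 1


def fake_simulate(x):
    """Hypothetical stand-in for one simulator call."""
    time.sleep(0.01)
    return x * x


def fake_optimizer_loop(simulate, steps):
    """Hypothetical stand-in for an optimizer that calls the simulator."""
    total = 0
    for i in range(steps):
        total += simulate(i)
        time.sleep(0.02)  # stand-in for classical overhead outside the simulator
    return total


timed = CallTimer(fake_simulate)
t0 = time.perf_counter()
result = fake_optimizer_loop(timed, steps=5)
wall = time.perf_counter() - t0
fraction = timed.elapsed / wall
print(f"calls={timed.calls} simulator={timed.elapsed:.3f}s "
      f"total={wall:.3f}s fraction={fraction:.2f}")
```

If the measured fraction is small, options that only speed up the simulator cannot reduce the total time by much. Which call to wrap in a real VQE run is an assumption to verify; an asynchronous job would need the timer around the point where its result is collected.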

Guogggg commented Sep 12, 2023

Is there some other way to make compute_minimum_eigenvalue faster? I don't think it runs in parallel. I set "max_parallel_threads":14 and "statevector_parallel_threshold":6 in the source code so that it would use more cores, but the computation time got longer.

Collaborator

@hhorii hhorii left a comment

I think the major change to the existing code is the introduction of ResultItr, and it looks reasonable. Although I was not able to check the code completely, the added and modified code looks good. If it passes all the existing tests, there is no problem merging this PR (assuming bugs are fixed as they are found).

I left some very minor comments. Once they are addressed, this PR is good to merge.

dump_vqe Outdated
Collaborator

Is this necessary?

Collaborator Author

Removed.

@@ -318,6 +318,12 @@ class AerSimulator(AerBackend):
* ``accept_distributed_results`` (bool): This option enables storing
results independently in each process (Default: None).

* ``runtime_parameter_bind_enable`` (bool): If this option is True
parameters are binded at runtime by using multi-shots without constructing
Collaborator

binded -> bound, because other descriptions in Aer use the latter.

Collaborator Author

Fixed.

@@ -940,6 +943,30 @@ inline Op make_qerror_loc(const reg_t &qubits, const std::string &label,
return op;
}

// make new op by parameter binding
inline Op make_parameter_bind(const Op &src, const uint_t iparam,
Collaborator

I think this function should be named bind_parameters().

Collaborator Author

Renamed it to bind_parameter, because this function binds a single parameter.

@hhorii hhorii merged commit e1332f8 into Qiskit:main Oct 6, 2023
31 checks passed