From b77566c4bcbc6f72a12cd44023b584db6936f1b3 Mon Sep 17 00:00:00 2001
From: Zoufalc <40824883+Zoufalc@users.noreply.github.com>
Date: Sat, 28 Nov 2020 05:38:40 +0100
Subject: [PATCH] Fix probability Hessians with the parameter shift method.
 (#1367)

* clean PR with the current gradient state

* Update derivatives_base.py

* Update hessian.py

* Update test_grad.py

* add notebooks

* Typing, Linting

* Update gradient_framework.ipynb

* typing qfi

* enable gradients for VQAlgorithms

* remove old files and notes

* enfoce num chars <= 100

* fix an import

* re-adjust import

* fix other imports

* additional base classes

* test cases pass and the core logic all works. Doc strings could use another look. Same goes for some of the class method names inside of LinCombQFI and OverlapQFI

* Update derivatives_base.py (#75)

* Update derivatives_base.py

Added some doc-strings to several methods inside of derivatives_base.py. Also simplified the logic of `erase_operator_coeffs` by removing some leftover logic from previous implementations.

* Update qiskit/aqua/operators/gradients/derivatives_base.py

Co-authored-by: Julien Gacon <jul@zurich.ibm.com>

* update ordering/naming

* reshuffle, rename

* add VQE gradient unittest

* add notebook B2Meeting

This notebook will be removed after today - sorry but this is the best way to share this notebook with @bryce

* Update gradient_framework_reduced.ipynb

* removed a leftover notebook file

* removed the notebooks (and am pushing them to a cleaned up QGradients repo)

* Update gradient.py

This small change prevents trivial/zero operators from appearing in SummedOps during product rule computations.

* will be deleted need this to pull sry

* fix qfi docstring

* Add retworkx version check

* delete notebooks, fixed seed fin_diff circSampler, fix kwargs import error

* Delete test_natural_gradient.py

* fix symbol map in parameter expression grad

* typing

* Update qfi.py

* typing

* Update test_grad.py

* added a docstring to unroll_operator

* Update hessian.py

* Update gradient.py

* Fix statevector check (not always a QuantumInstance)

* Update hessian.py

* fix lint of qiskit/aqua/operators/gradients/*.py

* fix lint of qiskit/aqua/operators/gradients/circuit_gradients/*.py

* fix lint of qiskit/aqua/operators/gradients/circuit_qfis/*.py

* fix test_grad

* Copy folder

* remove unknown aqua changes

* fix ListOp, VQAlgorithms for gradients

* VQE Gradients import

* update operators init

* rm some changes not related to gradients

* mypy fixes

* skip jax test if not installed

* try to fix cyclic import

* fix lint in isinstance

* uncomment vqe tests + consistent HAS_JAX

* mypy fixes

* ignore mypy warning from opbase.coeff

* fix isinstance used w/ typehints

* try fixing sphinx

* ignore cases mypy doesn't understand correctly

* fix unnecessary else after raise

* Add jax to github actions

* fix spell

* add jax dependency to actions lint

* fix spell

* Reduce VQE Iterations in unittest

* Use ImportError

* fix lint

* Add gradient framework release note

* update gradient framework release note

* fix reno formatting errors

* Update to documentation

* Fix doctring to avoid warnings

* update gradient init

* update qaoa docstring

* remover retworkx check

* update docstrings

* lint fixes

* update docstrings to hint max_evals_grouped deprecation if gradient is given

* Update test_grad.py

Several high-level function calls still used a `method=...` argument in the initialization; however, we changed this to `grad_method`, `hess_method`, `qfi_method`.
So I updated these arguments.

This was causing some tests to not be carried out as the default argument was inferred when the correct argument name was not assigned a method type.

* Update test_grad.py

Added a test case for Hessian.py that forces it to differentiate an operator with a custom combo_fn

* Update lin_comb circuit gradients to fix hessian evaluation

* Move renamed gradient test into operator file

* update lin_comb

* update lin_comb gradients

* fix lint

* fix lin_comb

* fixed the bugs that were causing Hessian logic to fail with custom combo_fns

* increase coverage of natural gradient

* merge zoufalc gradients

* try all combinations of grad_method and qfi_method

* style fix

* updates for hessians

* remove redundant line

* make fixes

* remove hessian from grad

* relax VQE test

* remove redundant lambda

* fix identation

* fix spelling

* fix whitespace

* revert qfi_method in test_natural_gradient

* revert param name

* fix lint

* remove redundant comment

* update identation

* fix np.random.seed

* increase coverage for lin_comb

* fix lint

* lint

* whitespace fix

* docstring

* lint

* disable misspelling

* add yy, zz to pylintdict

* Update __init__.py

updated the gradients  __init__ file with some of the content from the release note

* fixed linting

* update init file

* fix spell

* fix spelling

* remove redundant import

* attempt to fix sphinx #1

* Update qiskit/aqua/operators/gradients/qfi.py

Co-authored-by: Julien Gacon <gaconju@gmail.com>

* try to fix sphinx no.2

* add words to pylintdict

* fix cryoris comments

* Update qiskit/aqua/operators/gradients/circuit_qfis/lin_comb_full.py

Co-authored-by: Julien Gacon <gaconju@gmail.com>

* Update qiskit/aqua/operators/gradients/circuit_qfis/lin_comb_full.py

Co-authored-by: Julien Gacon <gaconju@gmail.com>

* Update qiskit/aqua/operators/gradients/circuit_qfis/lin_comb_full.py

Co-authored-by: Julien Gacon <gaconju@gmail.com>

* cryoris comments

* delete unneccessary

* Update qiskit/aqua/operators/gradients/hessian.py

Co-authored-by: Julien Gacon <gaconju@gmail.com>

* raise warning hessians

* include docstring TypeError

* Update qiskit/aqua/operators/gradients/circuit_qfis/overlap_diag.py

Co-authored-by: Julien Gacon <gaconju@gmail.com>

* Update qiskit/aqua/operators/gradients/gradient.py

Co-authored-by: Julien Gacon <gaconju@gmail.com>

* remove unnecessary os comment

* Update qiskit/aqua/operators/gradients/circuit_gradients/lin_comb.py

Co-authored-by: Julien Gacon <gaconju@gmail.com>

* Insert proper Error

* apply changes from code review

* Fix probability hessians parameter shift

* change type hint

* update docstrings

* fix typos

Co-authored-by: Cryoris <jules.gacon@googlemail.com>
Co-authored-by: Takashi Imamichi <imamichi@jp.ibm.com>
Co-authored-by: Bryce-Fuller <Bryce.Fuller@ibm.com>
Co-authored-by: Julien Gacon <jul@zurich.ibm.com>
Co-authored-by: Manoel Marques <manoel.marques@ibm.com>
Co-authored-by: Manoel Marques <manoel@us.ibm.com>
Co-authored-by: woodsp <woodsp@us.ibm.com>
Co-authored-by: Steve Wood <40241007+woodsp-ibm@users.noreply.github.com>
Co-authored-by: Julien Gacon <gaconju@gmail.com>
---
 .../circuit_gradients/param_shift.py          | 33 ++++++++++---
 .../gradients/circuit_qfis/lin_comb_full.py   | 49 +++++++++++++------
 qiskit/aqua/operators/gradients/hessian.py    |  7 +--
 .../operators/gradients/natural_gradient.py   |  9 +++-
 test/aqua/operators/test_gradients.py         |  5 +-
 5 files changed, 74 insertions(+), 29 deletions(-)

diff --git a/qiskit/aqua/operators/gradients/circuit_gradients/param_shift.py b/qiskit/aqua/operators/gradients/circuit_gradients/param_shift.py
index f9e09873a4..31ad0fdbe2 100644
--- a/qiskit/aqua/operators/gradients/circuit_gradients/param_shift.py
+++ b/qiskit/aqua/operators/gradients/circuit_gradients/param_shift.py
@@ -197,19 +197,24 @@ def _parameter_shift(self,
 
                 p_param = pshift_gate.params[param_index]
                 m_param = mshift_gate.params[param_index]
-
+                # For analytic gradients the circuit parameters are shifted once by +pi/2 and
+                # once by -pi/2.
                 if self.analytic:
                     shift_constant = 0.5
                     pshift_gate.params[param_index] = (p_param + (np.pi / (4 * shift_constant)))
                     mshift_gate.params[param_index] = (m_param - (np.pi / (4 * shift_constant)))
+                # For finite difference gradients the circuit parameters are shifted once by
+                # +epsilon and once by -epsilon.
                 else:
                     shift_constant = 1. / (2 * self._epsilon)
                     pshift_gate.params[param_index] = (p_param + self._epsilon)
                     mshift_gate.params[param_index] = (m_param - self._epsilon)
-
+                # The results of the shifted operators are now evaluated according the parameter
+                # shift / finite difference formula.
                 if isinstance(operator, ComposedOp):
                     shifted_op = shift_constant * (pshift_op - mshift_op)
-
+                # If the operator represents a quantum state then we apply a special combo
+                # function to evaluate probability gradients.
                 elif isinstance(operator, StateFn):
                     shifted_op = ListOp(
                         [pshift_op, mshift_op],
@@ -261,9 +266,16 @@ def get_primitives(item):
                 item = item.primitive.data
             return item
 
-        if isinstance(x, Iterable):
+        is_statefn = False
+        if isinstance(x, list):
+            # Check if all items in x are a StateFn items
+            if all(isinstance(item, StateFn) for item in x):
+                is_statefn = True
             items = [get_primitives(item) for item in x]
         else:
+            # Check if x is a StateFn item
+            if isinstance(x, StateFn):
+                is_statefn = True
             items = [get_primitives(x)]
         if isinstance(items[0], dict):
             prob_dict: Dict[str, float] = {}
@@ -273,8 +285,17 @@ def get_primitives(item):
                                      shift_constant * ((-1) ** i) * prob_counts
             return prob_dict
         elif isinstance(items[0], Iterable):
-            return shift_constant * np.subtract(np.multiply(items[0], np.conj(items[0])),
-                                                np.multiply(items[1], np.conj(items[1])))
+            # If x was given as StateFn the state amplitudes need to be multiplied in order to
+            # evaluate the sampling probabilities which are then subtracted according to the
+            # parameter shift rule.
+            if is_statefn:
+                return shift_constant * np.subtract(np.multiply(items[0], np.conj(items[0])),
+                                                    np.multiply(items[1], np.conj(items[1])))
+            # If x was not given as a StateFn the state amplitudes were already converted into
+            # sampling probabilities which are then only subtracted according to the
+            # parameter shift rule.
+            else:
+                return shift_constant * np.subtract(items[0], items[1])
         raise TypeError(
             'Probability gradients can only be evaluated from VectorStateFs or DictStateFns.')
 
diff --git a/qiskit/aqua/operators/gradients/circuit_qfis/lin_comb_full.py b/qiskit/aqua/operators/gradients/circuit_qfis/lin_comb_full.py
index 4dfccac7f5..2510c3354a 100644
--- a/qiskit/aqua/operators/gradients/circuit_qfis/lin_comb_full.py
+++ b/qiskit/aqua/operators/gradients/circuit_qfis/lin_comb_full.py
@@ -63,10 +63,12 @@ def convert(self,
         phase_fix_observable = ~StateFn((X + 1j * Y) ^ (I ^ operator.num_qubits))
         # see https://arxiv.org/pdf/quant-ph/0108146.pdf
 
+        # Check if the given operator corresponds to a quantum state given as a circuit.
         if not isinstance(operator, CircuitStateFn):
             raise TypeError(
                 'LinCombFull is only compatible with states that are given as CircuitStateFn')
 
+        # If a single parameter is given wrap it into a list.
         if not isinstance(params, (list, np.ndarray)):
             params = [params]
         state_qc = operator.primitive
@@ -74,15 +76,21 @@ def convert(self,
         # First, the operators are computed which can compensate for a potential phase-mismatch
         # between target and trained state, i.e.〈ψ|∂lψ〉
         phase_fix_states = None
+        # Add working qubit
         qr_work = QuantumRegister(1, 'work_qubit')
         work_q = qr_work[0]
         additional_qubits: Tuple[List[Qubit], List[Qubit]] = ([work_q], [])
-        # create a copy of the original state with an additional work_q register
         for param in params:
+            # Get the gates of the given quantum state which are parameterized by param
             param_gates = state_qc._parameter_table[param]
+            # Loop through the occurrences of param in the quantum state
             for m, param_occurence in enumerate(param_gates):
+                # Get the coefficients and gates for the linear combination gradient for each
+                # occurrence, see e.g. https://arxiv.org/abs/2006.06004
                 coeffs_i, gates_i = LinComb._gate_gradient_dict(param_occurence[0])[
                     param_occurence[1]]
+                # Construct the quantum states which are then evaluated for the respective QFI
+                # element.
                 for k, gate_to_insert_i in enumerate(gates_i):
                     grad_state = state_qc.copy()
                     grad_state.add_register(qr_work)
@@ -90,7 +98,7 @@ def convert(self,
                     # apply Hadamard on work_q
                     LinComb.insert_gate(grad_state, param_occurence[0], HGate(),
                                         qubits=[work_q])
-                    # Fix work_q phase
+                    # Fix work_q phase such that the gradient is correct.
                     coeff_i = coeffs_i[k]
                     sign = np.sign(coeff_i)
                     is_complex = np.iscomplex(coeff_i)
@@ -141,17 +149,21 @@ def convert(self,
                                             gate_to_insert_i,
                                             additional_qubits=additional_qubits)
 
+                    # Remove unnecessary gates.
                     grad_state = self.trim_circuit(grad_state, param_occurence[0])
-
+                    # Apply the final Hadamard on the working qubit.
                     grad_state.h(work_q)
-
+                    # Add the coefficient needed for the gradient as well as the original
+                    # coefficient of the given quantum state.
                     state = np.sqrt(np.abs(coeff_i)) * operator.coeff * CircuitStateFn(grad_state)
 
-                    # Chain Rule parameter expressions
+                    # Check if the gate parameter corresponding to param is a parameter expression
                     gate_param = param_occurence[0].params[param_occurence[1]]
                     if gate_param == param:
                         state = phase_fix_observable @ state
                     else:
+                        # If the gate parameter is a parameter expressions the chain rule needs
+                        # to be taken into account.
                         if isinstance(gate_param, ParameterExpression):
                             expr_grad = DerivativeBase.parameter_expression_grad(gate_param, param)
                             state = (expr_grad * phase_fix_observable) @ state
@@ -161,7 +173,9 @@ def convert(self,
                     if m == 0 and k == 0:
                         phase_fix_state = state
                     else:
+                        # Take the product rule into account
                         phase_fix_state += state
+            # Create a list for the phase fix states
             if not phase_fix_states:
                 phase_fix_states = [phase_fix_state]
             else:
@@ -169,12 +183,14 @@ def convert(self,
 
         # Get  4 * Re[〈∂kψ|∂lψ]
         qfi_operators = []
+        # Add a working qubit
         qr_work_qubit = QuantumRegister(1, 'work_qubit')
         work_qubit = qr_work_qubit[0]
         additional_qubits = ([work_qubit], [])
         # create a copy of the original circuit with an additional work_qubit register
         circuit = state_qc.copy()
         circuit.add_register(qr_work_qubit)
+        # Apply a Hadamard on the working qubit
         LinComb.insert_gate(circuit, state_qc._parameter_table[params[0]][0][0], HGate(),
                             qubits=[work_qubit])
 
@@ -182,17 +198,16 @@ def convert(self,
         for i, param_i in enumerate(params):  # loop over parameters
             qfi_ops = None
             for j, param_j in enumerate(params):
-                # Construct the circuits
+                # Get the gates of the quantum state which are parameterized by param_i
                 param_gates_i = state_qc._parameter_table[param_i]
                 for m_i, param_occurence_i in enumerate(param_gates_i):
                     coeffs_i, gates_i = LinComb._gate_gradient_dict(param_occurence_i[0])[
                         param_occurence_i[1]]
-                    # apply Hadamard on work_qubit
 
                     for k_i, gate_to_insert_i in enumerate(gates_i):
                         coeff_i = coeffs_i[k_i]
+                        # Get the gates of the quantum state which are parameterized by param_j
                         param_gates_j = state_qc._parameter_table[param_j]
-
                         for m_j, param_occurence_j in enumerate(param_gates_j):
                             coeffs_j, gates_j = \
                                 LinComb._gate_gradient_dict(param_occurence_j[0])[
@@ -204,7 +219,8 @@ def convert(self,
                                 qfi_circuit = QuantumCircuit(*circuit.qregs)
                                 qfi_circuit.data = circuit.data
 
-                                # Fix work_qubit phase
+                                # Correct the phase of the working qubit according to coefficient i
+                                # and coefficient j
                                 sign = np.sign(np.conj(coeff_i) * coeff_j)
                                 is_complex = np.iscomplex(np.conj(coeff_i) * coeff_j)
                                 if sign == -1:
@@ -230,7 +246,7 @@ def convert(self,
                                                     XGate(),
                                                     qubits=[work_qubit])
 
-                                # Insert controlled, intercepting gate - controlled by |1>
+                                # Insert controlled, intercepting gate i - controlled by |1>
                                 if isinstance(param_occurence_i[0], UGate):
                                     if param_occurence_i[1] == 0:
                                         LinComb.insert_gate(
@@ -267,7 +283,7 @@ def convert(self,
                                     qfi_circuit, gate_to_insert_i,
                                     XGate(), qubits=[work_qubit], after=True)
 
-                                # Insert controlled, intercepting gate - controlled by |0>
+                                # Insert controlled, intercepting gate j - controlled by |0>
                                 if isinstance(param_occurence_j[0], UGate):
                                     if param_occurence_j[1] == 0:
                                         LinComb.insert_gate(
@@ -310,21 +326,26 @@ def convert(self,
                                     qfi_circuit = self.trim_circuit(
                                         qfi_circuit, param_occurence_j[0]
                                     )
-
+                                # Apply final Hadamard gate
                                 qfi_circuit.h(work_qubit)
-                                # Convert the quantum circuit into a CircuitStateFn
+                                # Convert the quantum circuit into a CircuitStateFn and add the
+                                # coefficients i, j and the original operator coefficient
                                 term = np.sqrt(np.abs(coeff_i) * np.abs(coeff_j)) * operator.coeff
                                 term = term * CircuitStateFn(qfi_circuit)
 
-                                # Chain Rule Parameter Expression
+                                # Check if the gate parameters i and j are parameter expressions
                                 gate_param_i = param_occurence_i[0].params[param_occurence_i[1]]
                                 gate_param_j = param_occurence_j[0].params[param_occurence_j[1]]
 
                                 meas = deepcopy(qfi_observable)
+                                # If the gate parameter i is a parameter expression use the chain
+                                # rule.
                                 if isinstance(gate_param_i, ParameterExpression):
                                     expr_grad = DerivativeBase.parameter_expression_grad(
                                         gate_param_i, param_i)
                                     meas *= expr_grad
+                                # If the gate parameter j is a parameter expression use the chain
+                                # rule.
                                 if isinstance(gate_param_j, ParameterExpression):
                                     expr_grad = DerivativeBase.parameter_expression_grad(
                                         gate_param_j, param_j)
diff --git a/qiskit/aqua/operators/gradients/hessian.py b/qiskit/aqua/operators/gradients/hessian.py
index 0e9eb47080..c3d43310b6 100644
--- a/qiskit/aqua/operators/gradients/hessian.py
+++ b/qiskit/aqua/operators/gradients/hessian.py
@@ -241,12 +241,7 @@ def is_coeff_c(coeff, c):
 
         elif isinstance(operator, StateFn):
             if not operator.is_measurement:
-                from .circuit_gradients import LinComb
-                if isinstance(self.hess_method, LinComb):
-                    return self.hess_method.convert(operator, params)
-                else:
-                    raise TypeError('Please set the attribute hess_method to ´lin_comb´ to compute '
-                                    'probability Hessians.')
+                return self.hess_method.convert(operator, params)
 
             else:
                 raise TypeError('The computation of Hessians is only supported for Operators which '
diff --git a/qiskit/aqua/operators/gradients/natural_gradient.py b/qiskit/aqua/operators/gradients/natural_gradient.py
index 10bb16f1b2..88aed414f9 100644
--- a/qiskit/aqua/operators/gradients/natural_gradient.py
+++ b/qiskit/aqua/operators/gradients/natural_gradient.py
@@ -91,22 +91,29 @@ def convert(self,
                             'CircuitStateFn.')
         if not isinstance(params, Iterable):
             params = [params]
+        # Instantiate the gradient
         grad = Gradient(self._grad_method, epsilon=self._epsilon).convert(operator, params)
+        # Instantiate the QFI metric which is used to re-scale the gradient
         metric = self._qfi_method.convert(operator[-1], params) * 0.25
 
+        # Define the function which compute the natural gradient from the gradient and the QFI.
         def combo_fn(x):
             c = np.real(x[0])
             a = np.real(x[1])
             if self.regularization:
+                # If a regularization method is chosen then use a regularized solver to
+                # construct the natural gradient.
                 nat_grad = NaturalGradient._regularized_sle_solver(
                     a, c, regularization=self.regularization)
             else:
                 try:
+                    # Try to solve the system of linear equations Ax = C.
                     nat_grad = np.linalg.solve(a, c)
                 except np.linalg.LinAlgError:  # singular matrix
                     nat_grad = np.linalg.lstsq(a, c)[0]
             return np.real(nat_grad)
-
+        # Define the ListOp which combines the gradient and the QFI according to the combination
+        # function defined above.
         return ListOp([grad, metric], combo_fn=combo_fn)
 
     @property
diff --git a/test/aqua/operators/test_gradients.py b/test/aqua/operators/test_gradients.py
index 2775a24741..a4d54824db 100644
--- a/test/aqua/operators/test_gradients.py
+++ b/test/aqua/operators/test_gradients.py
@@ -467,7 +467,8 @@ def test_prob_grad(self, method):
                 np.testing.assert_array_almost_equal(prob_grad_result,
                                                      correct_values[i][j], decimal=1)
 
-    def test_prob_hess_lin_comb(self):
+    @data('lin_comb', 'param_shift', 'fin_diff')
+    def test_prob_hess(self, method):
         """Test the probability Hessian using linear combination of unitaries method
 
         d^2p0/da^2 = - sin(a)sin(b) / 2
@@ -488,7 +489,7 @@ def test_prob_hess_lin_comb(self):
 
         op = CircuitStateFn(primitive=qc, coeff=1.)
 
-        prob_hess = Hessian(hess_method='lin_comb').convert(operator=op, params=params)
+        prob_hess = Hessian(hess_method=method).convert(operator=op, params=params)
         values_dict = [{a: np.pi / 4, b: 0}, {a: np.pi / 4, b: np.pi / 4},
                        {a: np.pi / 2, b: np.pi}]
         correct_values = [[[0, 0], [1 / (2 * np.sqrt(2)), - 1 / (2 * np.sqrt(2))]],