Hardware inverse transformer faster #6173
I have a quick fix - #6174, that reduces the time to half, but I think it's still not good enough. Timing before my PR:
Timing after my PR:
Now, each of the transformer steps takes ~0.5 seconds, and speeding this up further would probably involve speeding up the
The example code for 150 qubits takes the following time: On master
On #6174
@AlMrvn What should be "a reasonable time" for the transformer up to ~150ish qubits? Is <1 second a good target to aim for?
@tanujkhattar that's a decent increase! A factor of 2 already! Thanks a lot! I am already happy =D As for what would be reasonable: these circuits take a few microseconds to run on the hardware for 1 shot. Let's say 4 us per shot with some overhead. It will need a lot of shots, so let's say 1 million shots. This gives us a run time on hardware of ~4 s. Ideally the circuit creation time would be small compared to this, so yeah, 1 s or 0.5 s seems good. For an order of magnitude, I usually precompute ~100 of these circuits.
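To make the estimate above concrete, here is a small sketch of the arithmetic. The constants are the rough figures quoted in the comment (4 us per shot, 1 million shots, ~100 circuits); the helper `creation_overhead` is a hypothetical name used only for illustration.

```python
# Back-of-the-envelope numbers taken from the comment above.
SHOT_TIME_S = 4e-6             # ~4 us per shot, including some overhead
SHOTS_PER_CIRCUIT = 1_000_000  # "a lot of shots"
NUM_CIRCUITS = 100             # circuits typically precomputed in one batch

# Hardware run time for a single circuit (about 4 seconds).
hardware_time_per_circuit = SHOT_TIME_S * SHOTS_PER_CIRCUIT


def creation_overhead(creation_time_s: float) -> float:
    """Ratio of circuit-creation time to hardware run time for one circuit."""
    return creation_time_s / hardware_time_per_circuit


for target in (1.0, 0.5):
    print(
        f"{target:.1f} s creation -> {creation_overhead(target):.0%} of hardware time, "
        f"{target * NUM_CIRCUITS:.0f} s for {NUM_CIRCUITS} circuits"
    )
```

With these assumed numbers, a 1 s transformer costs about a quarter of the hardware time per circuit, and 0.5 s about an eighth.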
xref #6097
@AlMrvn @dstrain115 After my last set of improvements as discussed in the comment above, the original transformer written by you using transformer primitives in the original issue was taking
We also discussed that we can significantly speed up the transformer by completely rewriting it and not using any of the built-in transformer primitives. The implementation we came up with (offline) was as follows (given in the details block).

    from typing import Mapping

    import cirq
    import numpy as np


    def merge_func(m1: 'cirq.Moment', m2: 'cirq.Moment') -> cirq.Moment:
        """Assumes both m1 and m2 are moments with single qubit gates."""
        ret_ops = []
        for q in m1.qubits | m2.qubits:
            op1, op2 = m1.operation_at(q), m2.operation_at(q)
            if op1 and op2:
                # m2 acts after m1, so its unitary multiplies from the left.
                mat = cirq.unitary(op2) @ cirq.unitary(op1)
                gate = cirq.single_qubit_matrix_to_phxz(mat, atol=1e-8)
                if gate:
                    ret_ops.append(gate(q))
            else:
                op = op1 or op2
                assert op is not None
                if isinstance(op.gate, cirq.PhasedXZGate):
                    ret_ops.append(op)
                else:
                    gate = cirq.single_qubit_matrix_to_phxz(cirq.unitary(op), atol=1e-8)
                    if gate:
                        ret_ops.append(gate(q))
        return cirq.Moment(ret_ops)


    def is_single_ops_moment(m: cirq.Moment):
        # True when every operation in the moment acts on exactly one qubit.
        return len(m) == len(m.qubits)


    @cirq.transformer
    class InverseSycamoreCircuitFast:
        """Transformer to invert a circuit with sycamore/fsim(theta=pi/2, phi) gates.

        This transformer will unroll the circuit.
        This transformer operates in the following steps:
        1) reverse the circuit order
        2) invert the single-qubit gates
        3) invert the Sycamore gate using the single qubit gate + sycamore inversion:

            (0, 0): ───Syc───^-1       ───X───Syc───Z(phi)───
                       │           =          │
            (0, 1): ───Syc───          ───────Syc───X────────

        4) merge the single-qubit moments that are adjacent.

        Example:
        >>> gate = cirq_google.SYC
        >>> pair = tuple(cirq.GridQubit.rect(1, 2))
        >>> a, b = pair
        >>> phi_dict = {pair: np.pi/6}
        >>> circuit = cirq.Circuit.from_moments(cirq.X(a), gate.on(*pair), cirq.Y(b))
        >>> transformer = InverseSycamoreCircuitFast(gate=gate, phi_mapping=phi_dict)
        >>> inverse_circuit = transformer(circuit)
        """

        def __init__(
            self,
            gate: cirq.Gate,
            phi_mapping: Mapping[tuple[cirq.Qid, ...], float],
            merge_sq_moments: bool = True,
        ):
            """
            Args:
                gate: the gate to invert. Usually an iSWAP-like gate/sycamore gate,
                phi_mapping: dictionary of the parameter phi per pair,
                merge_sq_moments: the transformed circuit will have merged sq gates
                    (not used by this fast version, which always merges).
            """
            self.gate = gate
            self.phi_mapping = phi_mapping

        def sycamore_inverse_map_func(self, op: cirq.Operation) -> cirq.OP_TREE:
            """Constructing the inverse of the sycamore from the sycamore gate."""
            assert op.gate == self.gate, f"circuit should follow a brickwall pattern with all 2q gates same as {self.gate}"
            z = self.phi_mapping[op.qubits] / np.pi
            yield cirq.Moment(cirq.X(op.qubits[0]))
            yield cirq.Moment(self.gate(*op.qubits))
            yield cirq.Moment(cirq.X(op.qubits[1]), (cirq.Z**z).on(op.qubits[0]))

        def __call__(self, circuit: cirq.AbstractCircuit, *, context=None) -> cirq.AbstractCircuit:
            new_moments = []
            for m in circuit[::-1]:
                if is_single_ops_moment(m):
                    # m contains only single qubit gates
                    if new_moments and is_single_ops_moment(new_moments[-1]):
                        new_moments[-1] = merge_func(new_moments[-1], cirq.inverse(m))
                    else:
                        new_moments.append(cirq.inverse(m))
                else:
                    inv_circuits = [cirq.Circuit(self.sycamore_inverse_map_func(op)) for op in m]
                    inv_moments = cirq.Circuit.zip(*inv_circuits).moments
                    if new_moments and is_single_ops_moment(new_moments[-1]):
                        new_moments[-1] = merge_func(new_moments[-1], inv_moments[0])
                        new_moments.extend(inv_moments[1:])
                    else:
                        # Nothing to merge into, e.g. the input circuit ends
                        # with a two-qubit moment.
                        new_moments.extend(inv_moments)
            return cirq.Circuit(new_moments)

The fast version has a runtime of ~1.7 seconds, i.e. running

    start = time.time()
    inversing_syc_fast(circuit)
    end = time.time()
    print(f"transformer for hardware inverse fast: {end - start}")

gives
The ~1.7 second time gets rid of most of the overhead introduced by the built-in circuit transformer routines, like
Now, as part of #6250, I've improved the speed of built-in
You can see that the
Note that all of these numbers are for 150-qubit circuits. For 75 qubits, the total runtime is < 1 sec. I think we can safely close this issue once the linked PR is merged?
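For readers who want the gist of the single-pass idea without cirq, here is a toy sketch in plain Python. It is not the implementation from this thread and uses no cirq API: a "layer" is either a dict mapping a qubit name to a 2x2 matrix (a single-qubit layer) or the placeholder string "2Q" (a two-qubit layer), and all function names are made up for illustration. The point is only that adjacent single-qubit layers are folded together as they are encountered, in one pass, instead of via repeated general-purpose passes over the circuit.

```python
def matmul(a, b):
    """Multiply two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def merge_layers(first, second):
    """Combine two single-qubit layers; `second` is applied after `first`."""
    merged = dict(first)
    for q, mat in second.items():
        # A later gate multiplies from the left, as in merge_func above.
        merged[q] = matmul(mat, merged[q]) if q in merged else mat
    return merged


def merge_adjacent(layers):
    """Single pass over the layers, folding runs of single-qubit layers."""
    out = []
    for layer in layers:
        if out and isinstance(layer, dict) and isinstance(out[-1], dict):
            out[-1] = merge_layers(out[-1], layer)
        else:
            out.append(layer)
    return out


X = ((0, 1), (1, 0))
Z = ((1, 0), (0, -1))
layers = [{"a": X}, {"a": Z, "b": X}, "2Q", {"b": Z}]
merged = merge_adjacent(layers)
print(len(layers), "layers ->", len(merged), "layers")
```

In this toy run the first two single-qubit layers collapse into one, while the layer after the "2Q" placeholder stays separate because it is not adjacent to another single-qubit layer.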
Summarize the task
Transformers implemented naively can be quite slow. The snippet I am giving inverts a circuit using the fact that the inverse of the Sycamore gate can be decomposed into single-qubit gates + a Sycamore gate; it takes 100x longer than calling the cirq.inverse function on the circuit.
If there were a way to make this better in cirq directly, or maybe some good practices on how to make these transformers fast, that would be nice.
Acceptance criteria - when is the task considered done?
reasonable time for the specific transformer up to ~150ish qubits.
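One way to check a criterion like this is a small timing harness. The sketch below is only an assumption about how one might measure it: `best_time`, `within_budget`, and the 1 second budget constant are hypothetical names and values (the 1 s figure is the target floated in the comments), and the sorted-list workload is a placeholder for calling the transformer on a ~150-qubit circuit.

```python
import time


def best_time(fn, *args, repeats=3):
    """Return the fastest wall-clock time (seconds) over `repeats` calls of fn(*args)."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        timings.append(time.perf_counter() - start)
    return min(timings)


BUDGET_S = 1.0  # assumed target, taken from the discussion in the comments


def within_budget(fn, *args, budget_s=BUDGET_S):
    """True if the best observed run of fn(*args) fits in the time budget."""
    return best_time(fn, *args) <= budget_s


# Placeholder workload standing in for `transformer(circuit)`.
elapsed = best_time(sorted, list(range(10_000, 0, -1)))
print(f"best of 3: {elapsed:.6f} s, within budget: {elapsed <= BUDGET_S}")
```

Taking the minimum over a few repeats reduces noise from one-off effects such as cold caches; `time.perf_counter` is used instead of `time.time` because it is intended for measuring short durations.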
** Python Script **