Add VF2PostLayout pass #7862
This commit adds a new transpiler pass, VF2PostLayout, and adds a new phase to the preset transpiler pipeline: post-layout qubit selection. The idea is based on the mapomatic project [1], which took the code from the existing VF2Layout pass to find an isomorphic subgraph in the coupling graph after transpilation with better noise characteristics than the qubits initially selected during the initial layout phase. Doing qubit selection after transpilation gives the pass more information, because we can assume that the circuit's operations are in the target basis and that at least one isomorphic subgraph already exists in the coupling graph, since we've gone through routing. This lets us look at the specific error rate of each instruction and weigh the sum of error rates for the mapped circuit on each potential qubit mapping to find the best performing set of qubits for a given circuit. Initial layout doesn't have access to this information because at the beginning of the transpile we aren't necessarily going to find a perfect mapping and we're not guaranteed to be in the target basis. So running post layout may yield quality improvements even if we found an initial perfect layout using VF2Layout.
While this new pass is very similar to the VF2Layout pass, in that it builds an interaction graph representing the 2q interactions in the circuit and uses retworkx's vf2_mapping() function to find all isomorphic subgraphs in the coupling graph, it is a separate pass because it performs the search a bit differently. First, the interaction graph is annotated with the gate counts on each qubit and edge, which are used to compute a heuristic score for the circuit. Second, in the case of a target, we verify that the nodes and edges are feasible in the subgraph isomorphism check, since a target can have operations defined on a subset of qubits. Additionally, the scoring heuristic checks the sum of the error rates for each gate on the mapped qubits.
The preset pass managers are updated to use this new pass at the end of the transpile and apply a layout if a solution is found. [1] https://github.com/Qiskit-Partners/mapomatic
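The scoring idea described above can be sketched in a few lines. This is only an illustration, not the pass's actual code: `score_layout`, the shape of `gate_counts` and `error_rates`, and all the numbers are hypothetical. It sums the per-instruction error rates, weighted by how often each instruction runs on the mapped qubits, and picks the candidate mapping with the lowest total.

```python
def score_layout(mapping, gate_counts, error_rates):
    """Sum the error rates of every gate after mapping virtual to physical qubits.

    mapping:     dict of virtual qubit -> physical qubit
    gate_counts: dict of (gate name, tuple of virtual qubits) -> gate count
    error_rates: dict of (gate name, tuple of physical qubits) -> error rate
    """
    total = 0.0
    for (name, virtual_qubits), count in gate_counts.items():
        physical_qubits = tuple(mapping[q] for q in virtual_qubits)
        total += count * error_rates[(name, physical_qubits)]
    return total


# Hypothetical circuit: 3 sx gates on virtual qubit 0, 2 cx gates on (0, 1).
gate_counts = {("sx", (0,)): 3, ("cx", (0, 1)): 2}

# Hypothetical error data for a 4-qubit device.
error_rates = {
    ("sx", (0,)): 0.002, ("sx", (1,)): 0.002,
    ("sx", (2,)): 0.001, ("sx", (3,)): 0.001,
    ("cx", (0, 1)): 0.03, ("cx", (2, 3)): 0.01,
}

# Two candidate mappings, e.g. as produced by a subgraph isomorphism search.
candidates = [{0: 0, 1: 1}, {0: 2, 1: 3}]
best = min(candidates, key=lambda m: score_layout(m, gate_counts, error_rates))
```

Because the score needs an error rate for every (gate, qubits) pair in the circuit, this exact form only works once the circuit is already in the target basis, which is the assumption the commit message makes.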
The matching callback function had a typo, so it always returned True on edge comparisons even if the target coupling graph edge was not a superset of the local gates in the interaction graph. This commit fixes the oversight so such cases are correctly rejected as viable isomorphic subgraphs.
If we find a better layout using post layout and there are ancillas in the original layout, those would previously be lost. This commit fixes this by detecting when we're missing qubits in the new layout and adding the ancillas on unused qubits in the coupling graph.
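A rough sketch of the ancilla fix, assuming layouts are plain dicts from virtual qubit labels to physical indices (the real pass works on Layout objects; `add_missing_ancillas` is a hypothetical helper):

```python
def add_missing_ancillas(new_layout, original_layout, num_physical_qubits):
    """Return a copy of new_layout extended with any qubits it dropped.

    Qubits present in original_layout but missing from new_layout
    (typically ancillas) are assigned to unused physical qubits.
    """
    completed = dict(new_layout)
    used = set(completed.values())
    free = [p for p in range(num_physical_qubits) if p not in used]
    for virtual in original_layout:
        if virtual not in completed:
            completed[virtual] = free.pop(0)
    return completed


original = {"q0": 0, "q1": 1, "ancilla0": 2, "ancilla1": 3}
post = {"q0": 2, "q1": 3}  # better qubits were found, but the ancillas are missing
full = add_missing_ancillas(post, original, num_physical_qubits=4)
```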
Applying a new layout after we schedule a circuit would invalidate that scheduling. This commit moves the post layout pass to run prior to scheduling in the preset passmanagers.
This commit modifies the ApplyLayout pass to enable slightly altered behavior when applying a post layout on top of a circuit that has already had a layout applied. Previously we overwrote the layout and ApplyLayout just blindly applied it. This caused us to lose the original bit and register context, as that only exists as metadata in the property set's layout field. To ensure we preserve the mapping from the initial virtual bits through our second round of layout, ApplyLayout is modified to do this mapping for us, with the new layout passed separately as a new field in the property set.
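The mapping that has to be preserved is a composition of the two layouts. A minimal sketch with plain dicts, where `compose_layouts` is a hypothetical helper and not the ApplyLayout implementation:

```python
def compose_layouts(initial_layout, post_layout):
    """Map the original virtual bits through both rounds of layout.

    initial_layout: original virtual bit -> physical qubit after the first layout
    post_layout:    physical qubit after the first layout -> newly selected qubit
    """
    return {virtual: post_layout[physical]
            for virtual, physical in initial_layout.items()}


initial = {"a": 0, "b": 1, "c": 2}
post = {0: 5, 1: 4, 2: 3}
combined = compose_layouts(initial, post)
```

Simply overwriting `initial` with `post` would keep only the physical-to-physical mapping and lose the names "a", "b" and "c", which is the problem the commit describes.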
If initial_layout or layout_method are set in transpile() do not run post layout as this will produce unexpected results for users. If you're manually specifying a layout method that should be what is used only and we shouldn't do any other reordering to try and optimize beyond what the user requested.
After introducing VF2PostLayout the output layout of the circuit is potentially different depending on the noise characteristics of the target backend. This was causing test failures on tests that were explicitly checking for an exact layout output from transpile. This commit updates the expected layouts in those tests to match the new behavior of the transpiler. Most of these tests were actually already using VF2Layout to find a perfect layout, but with vf2layout the transpiler is finding an alternative layout post optimization which has better noise characteristics for the circuit being run.
This commit fixes a typo in the node/edge match function used in matching a subgraph to a target over valid operation names. Previously the reverse condition check was incorrectly checking that the target operations on a 2q edge were a subset of the circuit operations when it should have been checking the reverse condition. This commit fixes this oversight.
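The direction of the subset check is easy to get backwards. A small sketch, with hypothetical function names and Python sets standing in for the operation names on an edge:

```python
def edge_is_feasible(target_ops, circuit_ops):
    """Correct check: every operation the circuit uses on this edge
    must be available on the target edge."""
    return set(circuit_ops) <= set(target_ops)


def edge_is_feasible_reversed(target_ops, circuit_ops):
    """The reversed (incorrect) condition: target ops a subset of circuit ops."""
    return set(target_ops) <= set(circuit_ops)


# The target edge offers more gates than the circuit needs: should match.
ok_correct = edge_is_feasible({"cx", "ecr"}, {"cx"})
ok_reversed = edge_is_feasible_reversed({"cx", "ecr"}, {"cx"})

# The circuit uses a gate the target edge lacks: should be rejected.
bad_correct = edge_is_feasible({"cx"}, {"cx", "cz"})
bad_reversed = edge_is_feasible_reversed({"cx"}, {"cx", "cz"})
```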
this is nice.
I think we should actually remove the NoiseAwareLayout pass after this. That pass doesn't work as intended anyway because the swap minimization part from the associated paper is not implemented, so the results are often bad. I like this new approach since the swap minimization part is solved on a subgraph and then the subgraph is embedded. The NoiseAwareLayout might be faster (since finding graph isomorphism is hard), but we can optimize when we hit a problem in scaling.
mappings = vf2_mapping(
    cm_graph,
    im_graph,
    node_matcher=_target_match,
    edge_matcher=_target_match,
    subgraph=True,
    id_order=False,
    induced=False,
    call_limit=self.call_limit,
)
Does this return ALL subgraphs, which we then score here? This should be pretty expensive, right?
Also, say we have a line of 5 qubits and we have chosen the subset 10 to 14. There are two possibilities based on the orientation of the layout: it could be either 10-11-12-13-14 or 14-13-12-11-10. Does it return them both and score them separately, or just one? For, say, a ring of 12 on heavy-hex, there would be 12 orientations.
It does. mappings here is an iterator, and each step computes a new isomorphic mapping. It can be slow, which is why we set call_limit, the number of internal state visits the function will try, so we limit how long we wait. In the preset passmanagers we set this to increasingly large numbers so that we spend at most ~100 ms for level 1, ~10 sec for level 2, and ~60 sec for level 3 (we also check on each iteration that we haven't gone over a timeout parameter and break if we have).
As for the orientation it will try both because it is a directed graph and we're using strict edges. This is actually necessary especially for the backendv2/target path because in that path we might not have all gates available in both directions (which is what the matcher functions here are checking). Also the scores can be different because we'd potentially end up with different gate counts on each qubit and 2q link which would change the score between each orientation.
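To make the lazy iteration and the two orientations concrete, here is a brute-force stand-in for the search. It is not retworkx's vf2_mapping() and not the VF2 algorithm; `directed_mappings` and its `call_limit` handling are hypothetical and only mimic the behavior described above on a tiny 3-qubit line.

```python
from itertools import permutations


def directed_mappings(device_edges, circuit_edges, num_circuit_qubits, call_limit=None):
    """Lazily yield mappings (circuit qubit -> device qubit) that preserve
    every directed circuit edge, giving up after call_limit candidates."""
    device_edges = set(device_edges)
    device_qubits = sorted({q for edge in device_edges for q in edge})
    candidates = permutations(device_qubits, num_circuit_qubits)
    for visited, candidate in enumerate(candidates):
        if call_limit is not None and visited >= call_limit:
            return  # search budget exhausted
        if all((candidate[a], candidate[b]) in device_edges for a, b in circuit_edges):
            yield dict(enumerate(candidate))


# Device: a line 10 - 11 - 12 with gates available in both directions.
device = [(10, 11), (11, 10), (11, 12), (12, 11)]
# Circuit: 2q gates on 0 -> 1 and 1 -> 2.
circuit = [(0, 1), (1, 2)]

all_found = list(directed_mappings(device, circuit, 3))
limited = list(directed_mappings(device, circuit, 3, call_limit=3))
```

With bidirectional device edges both orientations of the line are produced, and each one would then be scored separately. The limited search stops after three candidates and only sees the first orientation.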
Oh I see, it's looking for directed subgraphs. I think this can be problematic. Suppose I have a circuit that has been laid out on qubits 0 ---> 1 ---> 2 in this direction. So the circuit would be:
----*------
    |
----X--*---
       |
-------X---
Now if these qubits are very noisy but qubits 3 --> 4 <--- 5 are very good, they will not be chosen because they don't have the right direction. But actually fixing direction is trivial. I suggest that choosing layouts should be done on undirected graphs (only find good subsets), and then a post-post-layout direction-fixing pass should be applied.
For the case I brought up which is when there are multiple orientations (2 for a line, 12 for a ring of 12), I think this can be an interesting follow-up of choosing the best among those. But already choosing the best subset among many will go a long way.
Yeah, I considered using undirected edges and GateDirection when I wrote this. The problem is that to do an undirected subgraph search we'd basically need to rerun the transpiler again after the routing phase. We can easily run GateDirection after applying the layout to fix this, but then we've potentially injected a bunch of Hadamards into the circuit, so we need the basis translator to convert them to the native basis set. This in turn requires the optimization passes to run, because the basis translator output can likely be simplified. We basically end up rerunning most of the transpiler at that point. Typing all of this out has made me think of a potentially interesting follow-on we can play with: we could add a strict direction flag to this pass and then add VF2PostLayout and ApplyLayout to the end of the optimization loop with that flag set to False.
The way I was viewing this pass was: given the hard constraints on the backend, can we find a better qubit selection with lower noise, and if not we don't do anything. So we do miss the opportunity for 3 -> 4 <- 5 if there are no compatible 2q gates in that direction, but it is just a heuristic and that seemed ok. Especially with BackendV2, where the gates are defined per qubit (like in your example, if 0 -> 1 -> 2 was all in cx but 3 -> 4 <- 5 was only ecr). This seemed the better path to start with, since in all my tests it was able to find better layouts. At least for all the current backends with connectivity constraints this won't come up, since they all currently define bidirectional edges with the same error rates.
We basically end up rerunning most of the transpiler at that point.
If we order the passes right I don't think there will be a duplication. In my mind the order should be something like this:
- high-level synthesis (toffolis, cliffords, unitaries, etc.) to reduce the circuit to 1 & 2 qubit gates
- layout + routing + post-layout
- other parts including 1-& 2-qubit synthesis, optimization, scheduling. Since these just make local changes on 1 or 2 qubits at a time, they don't alter the mapping.
So I was thinking that this PostLayout pass can be done in stage 2 (basically to improve the layout). But if the scoring mechanism relies on the gates exactly being in the Target, then it wouldn't work.
For it to work we would need a more relaxed scoring that can approximate. e.g. if there's a 2-qubit unitary it can assign a score to it based on looking at the 2-qubit error rates on that link. It wouldn't be exact, but I think it would be good enough. Since the whole scoring is approximate anyway. (I think soon the devices will report the native cx direction only btw)
I'll give it a try, I'm curious to see the difference between running it at different spots in the pass manager. I'm also wondering if there is value in doing post layout more than once in a pipeline, like if we did it with looser constraints at the end of 2 and after 3 with the stricter constraints.
FWIW, I did this after 3 because I thought it would be better, since we have the complete circuit and can see how many gates get run on each of the qubits and get the full error rates with each layout. Especially since DenseLayout is already noise aware, so it should be picking similar qubits already. But it's definitely worth testing and checking to see what makes a bigger impact on result quality.
I've been testing this locally running on hardware and most of the time the layouts are the same. But there has been 1 time so far where the layouts were significantly different and the undirected case doing it right after routing was significantly better. So I'm going to adjust the preset pass managers to do it this way in the PR.
Co-authored-by: Ali Javadi-Abhari <ajavadia@users.noreply.github.com>
It would be nice if we can mirror the design decisions here in Mapomatic. What I think would be nice is to use the ability to select the best subgraph on multiple backends to target a system if a user does not specify the system name in the call to the primitive (which is possible on the cloud but not IQX at present).
This commit adds a new flag, strict_direction, which can be used to do an isomorphic match on undirected graphs and use an avg 1q and 2q error rate for each qubit for scoring.
This commit moves the VF2PostLayout run to right after the routing phase in the preset passmanagers. This lets us set the strict_direction flag to False which expands the search space to ignore 2q gate directionality which will potentially find better mappings.
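The effect of relaxing direction can be illustrated with the 3 -> 4 <- 5 example from the review discussion. This is a brute-force sketch, not the pass's implementation; `find_mappings` is a hypothetical helper whose `strict_direction` argument only mimics the flag described above.

```python
from itertools import permutations


def find_mappings(device_edges, circuit_edges, num_circuit_qubits, strict_direction=True):
    """Return all mappings (circuit qubit -> device qubit) that preserve the
    circuit's 2q edges. With strict_direction=False, edge direction is ignored."""
    edges = set(device_edges)
    if not strict_direction:
        # Treat every device edge as usable in both directions.
        edges = edges | {(b, a) for a, b in edges}
    device_qubits = sorted({q for edge in edges for q in edge})
    return [
        dict(enumerate(candidate))
        for candidate in permutations(device_qubits, num_circuit_qubits)
        if all((candidate[a], candidate[b]) in edges for a, b in circuit_edges)
    ]


device = [(3, 4), (5, 4)]   # 3 -> 4 <- 5
circuit = [(0, 1), (1, 2)]  # 0 -> 1 -> 2

strict = find_mappings(device, circuit, 3, strict_direction=True)
relaxed = find_mappings(device, circuit, 3, strict_direction=False)
```

The strict search finds nothing on these qubits, while the relaxed search finds both orientations. Any gates that end up in the wrong direction would still have to be fixed by later passes, as discussed in the thread.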
LGTM!
In Qiskit#7862 we recently added a new vf2 post layout pass which is designed to run after routing to improve the layout once we know there is at least one isomorphic subgraph in the coupling graph for the interactions in the circuit. In that PR we still ran vf2 post layout even if the vf2 layout pass found a match. That's because the heuristic scoring of layouts used for the vf2 layout and vf2 post layout passes was different. Originally this difference was due to the vf2 post layout pass being intended to run after the optimization loop, where we could guarantee the gates were in the target and exactly score the error for each potential layout. But since vf2 post layout was updated to score a layout based on the gate counts for each qubit and the average 1q and 2q instruction error rates, we can leverage this better heuristic scoring in the vf2 layout pass. This commit updates the vf2 layout pass to use the same heuristic and deduplicates some of the code between the passes at the same time. Additionally, since the scoring heuristics are the same, the preset pass managers are updated to only run vf2 post layout if vf2 layout didn't find a match. If vf2 layout finds a match it's going to be the same as what vf2 post layout finds, so there is no need to run the vf2 post layout pass anymore.
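A sketch of what scoring by gate counts and average error rates could look like. The function name, the data layout, and the numbers are hypothetical, and the actual heuristic in the passes may combine the terms differently.

```python
def score_with_average_errors(mapping, counts_1q, counts_2q, avg_error_1q, avg_error_2q):
    """Weight average error rates by gate counts.

    counts_1q:    virtual qubit -> number of 1q gates on it
    counts_2q:    (virtual a, virtual b) -> number of 2q gates on that pair
    avg_error_1q: physical qubit -> average 1q instruction error rate
    avg_error_2q: frozenset of two physical qubits -> average 2q error rate
    """
    total = 0.0
    for virtual, count in counts_1q.items():
        total += count * avg_error_1q[mapping[virtual]]
    for (va, vb), count in counts_2q.items():
        link = frozenset((mapping[va], mapping[vb]))  # direction is ignored
        total += count * avg_error_2q[link]
    return total


counts_1q = {0: 4, 1: 2}
counts_2q = {(0, 1): 3}
avg_error_1q = {0: 0.001, 1: 0.002, 2: 0.0005}
avg_error_2q = {frozenset((0, 1)): 0.02, frozenset((1, 2)): 0.01}

candidates = [{0: 0, 1: 1}, {0: 1, 1: 2}, {0: 2, 1: 1}]
scores = [
    score_with_average_errors(m, counts_1q, counts_2q, avg_error_1q, avg_error_2q)
    for m in candidates
]
best = candidates[scores.index(min(scores))]
```

The last two candidates use the same physical qubits in opposite orientations and still score differently, because the 1q gate counts land on different qubits. Since only averages per qubit and per link are needed, this kind of score does not require the circuit to be in the target basis.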
* Deduplicate and unify VF2 layout passes
* Update apply post layout condition comments
* Remove old layout score function
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Summary
This commit adds a new transpiler pass, VF2PostLayout, and adds a new phase to the preset transpiler pipeline: post-layout qubit selection. The idea is based on the mapomatic project [1], which took the code from the existing VF2Layout pass to find an isomorphic subgraph in the coupling graph after transpilation with better noise characteristics than the qubits initially selected during the initial layout phase. Doing qubit selection after transpilation gives the pass more information, because we can assume that the circuit's operations are in the target basis and that at least one isomorphic subgraph already exists in the coupling graph, since we've gone through routing. This lets us look at the specific error rate of each instruction and weigh the sum of error rates for the mapped circuit on each potential qubit mapping to find the best performing set of qubits for a given circuit. Initial layout doesn't have access to this information because at the beginning of the transpile we aren't necessarily going to find a perfect mapping and we're not guaranteed to be in the target basis. So running post layout may yield quality improvements even if we found an initial perfect layout using VF2Layout.
While this new pass is very similar to the VF2Layout pass, in that it builds an interaction graph representing the 2q interactions in the circuit and uses retworkx's vf2_mapping() function to find all isomorphic subgraphs in the coupling graph, it is a separate pass because it performs the search a bit differently. First, the interaction graph is annotated with the gate counts on each qubit and edge, which are used to compute a heuristic score for the circuit. Second, in the case of a target, we verify that the nodes and edges are feasible in the subgraph isomorphism check, since a target can have operations defined on a subset of qubits. Additionally, the scoring heuristic checks the sum of the error rates for each gate on the mapped qubits.
The preset pass managers are updated to use this new pass at the end of
the transpile and apply a layout if a solution is found.
Details and comments
[1] https://github.com/Qiskit-Partners/mapomatic
TODO:
Target layout score checking key error (looks like mapping or layout generation is wrong, or the node match check is broken)