-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Speedup constant factors in LookaheadSwap
#8068
Conversation
This picks some of the low-hanging fruit in `LookaheadSwap`, avoiding recalculating various properties and entities that are already known, and making some access patterns more efficient. It does not change the complexity properties of the algorithm, which will still cause its runtime to be excessive for large circuits.
Thank you for opening a new pull request. Before your PR can be merged it will first need to pass continuous integration tests and be reviewed. Sometimes the review process can be slow, so please be patient. While you're waiting, please feel free to review other open PRs. While only a subset of people are authorized to approve pull requests for merging, everyone is encouraged to review open pull requests. Doing reviews helps reduce the burden on the core team and helps make the project's code better for everyone. One or more of the the following people are requested to review this:
|
Pull Request Test Coverage Report for Build 2535942293
💛 - Coveralls |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, one question inline about the _first_op_node()
method before I tag as automerge.
I do think we should port this to rust after #8133 is fixed. This refactor makes it trivial to port as the structure is much closer to how it would be written in rust.
def _first_op_node(dag): | ||
"""Get the first op node from a DAG.""" | ||
# This doesn't use `DAGCircuit.op_nodes` because that function always consumes the entire | ||
# iterator to create a list, whereas we only need the first element. | ||
return next(node for node in dag.nodes() if isinstance(node, DAGOpNode)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reading through the code I don't think this matters but this is the first op node based on insertion order not necessarily topological order was that your intent or did you want the first node from a topological ordering of the dag?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just a direct port of the existing code - it previously did this exact thing but constructed the complete list just to take the first element. I was mostly trying to ensure the iteration order wasn't affect while just trying to speed it up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, that's what it looked like to me too but I wanted to confirm. The code just read a little weird to me because the "first" node could actually be something in the middle of the dag or at the end as it's dependent on insertion order.
continue | ||
qubits = gate["partition"][0] | ||
if len(qubits) == 2: | ||
out += state.coupling_map.distance(layout_map[qubits[0]], layout_map[qubits[1]]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really something we should change here, but I typically avoid the distance()
method and just access the inner distance matrix directly because besides the function call overhead you also have bounds checking which for something like this you know you're layout isn't going to return bounds outside the matrix. You could probably also use numpy's sum to speed this up (although when I've tried that in the past it wasn't really a noticeable difference).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking back, I think I was just doing a direct port of the code here - it previously used coupling_map.distance
, just with an iterable unpacking of a generator expression. I unrolled the generator, but left the function call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This pass is so slow, and (I think) pretty much completely superseded by SabreSwap, so I'm not 100% sure it's worth rewriting it in Rust. I think there's some extant bugs as well - it was originally randomised testing failures that caused me to look at the pass.
It's not entirely a coincidence the new form looks more Rust-like - I've always preferred pure functions that pass "state" objects around rather than making objects that are naturally pure API-implementation collections (transpiler passes) one-time-use by making them stateful of themselves.
continue | ||
qubits = gate["partition"][0] | ||
if len(qubits) == 2: | ||
out += state.coupling_map.distance(layout_map[qubits[0]], layout_map[qubits[1]]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking back, I think I was just doing a direct port of the code here - it previously used coupling_map.distance
, just with an iterable unpacking of a generator expression. I unrolled the generator, but left the function call.
def _first_op_node(dag): | ||
"""Get the first op node from a DAG.""" | ||
# This doesn't use `DAGCircuit.op_nodes` because that function always consumes the entire | ||
# iterator to create a list, whereas we only need the first element. | ||
return next(node for node in dag.nodes() if isinstance(node, DAGOpNode)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just a direct port of the existing code - it previously did this exact thing but constructed the complete list just to take the first element. I was mostly trying to ensure the iteration order wasn't affect while just trying to speed it up.
Summary
This picks some of the low-hanging fruit in
LookaheadSwap
, avoidingrecalculating various properties and entities that are already known,
and making some access patterns more efficient. It does not change the
complexity properties of the algorithm, which will still cause its
runtime to be excessive for large circuits.
Details and comments
This is worth about a 3x speedup in the pass itself, but it doesn't address any of the scaling issues that cause #5198, for example.
Using a circuit based on the one in #5198:
gave 6.04(6)s before this commit, and 1.69(1)s after.
This pass isn't our preferred routing method any more, but I was looking at it since the randomised tests are spotting failures in it at the moment. This commit won't address the failures; the meaningful logic is unchanged, this commit just does more caching of intermediate values and uses more efficient paths to calculate things. I don't know the cause of the failures right now, but if I find it, I'll make a new PR.
It's possible we could consider dropping this routing pass in favour of one of the other, faster methods if its scaling problems are inherent.