-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improving memory consumption of DagDependency when lists of transitive predecessors and successors are not required #8525
Conversation
…ssors and predecessors are not needed
Thank you for opening a new pull request. Before your PR can be merged it will first need to pass continuous integration tests and be reviewed. Sometimes the review process can be slow, so please be patient. While you're waiting, please feel free to review other open PRs. While only a subset of people are authorized to approve pull requests for merging, everyone is encouraged to review open pull requests. Doing reviews helps reduce the burden on the core team and helps make the project's code better for everyone. One or more of the the following people are requested to review this:
|
Pull Request Test Coverage Report for Build 3130817582
💛 - Coveralls |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a fine incremental improvement (except for one potential bug in line), but my concern which may be larger than this PR is that DAGDependency
doesn't seem to leverage any of the graph data structure to analyze structural aspects of the graph. In other words we shouldn't need to store in the node it's list of adjacent nodes because the underlying graph object makes that quick to lookup when needed.
So I guess my higher level question about this is instead of just reducing the set of what is stored in the node why can't we just remove that node.successors
and node.predecessors
completely because the data is duplicated with what we're building in the graph already.
@@ -454,6 +454,7 @@ def _gather_succ(self, node_id, direct_succ): | |||
the lists of direct successors are put into a single one | |||
""" | |||
gather = self._multi_graph | |||
gather.get_node_data(node_id).successors = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still lost on why this is needed? Storing attributes of graph structural properties like this seem weird to me when we're building this all around a graph data structure. Like if you want the list of successor nodes from a given node in the graph this is just self._multi_graph.successors(node_id)
: https://qiskit.org/documentation/retworkx/apiref/retworkx.PyDiGraph.successors.html#retworkx.PyDiGraph.successors. There is also a version with a filter function built-in if needed: https://qiskit.org/documentation/retworkx/apiref/retworkx.PyDiGraph.find_successors_by_edge.html#retworkx.PyDiGraph.find_successors_by_edge
If you need to full descendants set you can use rx.descendants()
(https://qiskit.org/documentation/retworkx/apiref/retworkx.descendants.html) caching any of these graph relationships at the node level seems a bit unnecessary when we have the graph data structure already and can query it on demand. Unless there is something I'm missing here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am guessing that DagDependency
was originally created to be primarily used in TemplateOptimization
pass. For instance, look at the fields of DAGDepNode
:
__slots__ = ["type", "_op", "name", "_qargs", "cargs", "sort_key", "node_id", "successors", "predecessors", "reachable", matchedwith", "isblocked", "successorstovisit", "qindices", "cindices"]
many of these only make sense in the context of TemplateOptimization
.
For caching vs. not caching lists of direct and transitive predecessors and successors, there is likely some tradeoff between performance and memory consumption, depending on how many times TemplateOptimization
happens to call these. I haven't really experimented with TemplateOptimization
, so I don't know, but for this PR I don't want to risk making TemplateOptimization
even slower than it is now, even though I completely agree that it probably makes perfect sense to use self._multi_graph.successors
instead of self.get_node_data(node_id).successors
. I do want this PR, as an enabler for block collection strategies that take commutativity into account (#8319), as these are currently slow and extremely memory-hungry to be applied to large circuits.
I do think it would be a good idea to refactor DagDepNode
, DagDependency
and TemplateOptimization
pass.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you mine putting a TODO comment somewhere expressing the desire to remove the this caching in the dag nodes then. I think I was letting my discontent with how this structure is currently built cloud my review here. If we just have an inline comment somewhere saying this refactor is on the table in the future that'll be enough for me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added the "TODO" comment. Just to be clear, I think that we should definitely refactor DAGDependency
, however a lot of it might go hand-in-hand with doing changes to TempateOptimization
and I did not want to touch this in the current PR.
qiskit/dagcircuit/dagdependency.py
Outdated
Updates DagDependency by adding edges to the newly added node (max_node) | ||
from the previously added nodes. | ||
For each previously added node (prev_node), an edge from prev_node to max_node | ||
is added if max_node is "reachable" from prev_node (this means that the two | ||
nodes can be made adjacent by commuting them with other nodes), but the two nodes | ||
themselves do not commute. | ||
""" | ||
max_node_id = len(self._multi_graph) - 1 | ||
max_node = self._multi_graph.get_node_data(max_node_id) | ||
max_node = self.get_node(max_node_id) | ||
|
||
for current_node_id in range(0, max_node_id): | ||
self._multi_graph.get_node_data(current_node_id).reachable = True | ||
# Check the commutation relation with reachable node, it adds edges if it does not commute | ||
reachable = [True] * max_node_id | ||
|
||
# Analyze nodes in the reverse topological order. | ||
# An improvement to the original algorithm is to consider only direct predecessors | ||
# and to avoid constructing the lists of forward and backward reachable predecessors | ||
# for every node when not required. | ||
for prev_node_id in range(max_node_id - 1, -1, -1): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code won't necessarily work if there are any removals in the graph. The list of node ids in a rustworkx graph might have holes if there are intermediate removals (see the first code example on https://qiskit.org/documentation/retworkx/apiref/retworkx.PyDiGraph.html#retworkx.PyDiGraph or https://qiskit.org/documentation/retworkx/tutorial/introduction.html#removing-elements-from-a-graph) you might end up trying to use an index not in the graph by doing a reverse range like this. It's better to rely on graph.node_indices()
to get a sequence of node indices in the graph and iterate over that in the order required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I completely agree, but note that I have not changed how nodes in DagDependency
are iterated, the only real change is to consider direct predecessors instead of transitive predecessors, and to avoid storing transitive predecessors if not required (and the same for successors). This whole function _update_edges
is very strange and (imho) can only be used when constructing DagDependency
from QuantumCircuit
or from DagCircuit
(which is the case right now), where we don't need to worry about holes in the graph.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, do you mind leaving a comment here explaining the context around this because every time I read this code I'll have the same thought. So basically just say inline this context doesn't have removals since it's only called when creating a new dag dependency from another representation of a circuit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added this comment to the doc string of _update_edges
, and added another TODO to rethink DAGDependency
API. At the moment, it's not clear to me whether there are potential usages of DAGDependency
that will require to call _update_edges
except when DAGDependency
is constructed from another representation of a circuit.
@mtreinish, thanks for the review! I presume that
refers to iterating over nodes without taking removals into account, but (as I commented above) this code is currently only used when My goal is to continue with collecting blocks of gates taking gate commutativity into account, and this PR is needed to make block collection faster and significantly less memory-hungry for large circuits. I do think that it would be a good idea to rethink and refactor both Is there anything specific that I should do for this PR? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think given that you're saying that my review comments were just temporary conditions and you're leaving the current interface (and potential bugs) as is and concentrating on just memory overhead/performance this PR looks fine to me. I think we should just put some comments inline explaining this and then lets move forward on this.
qiskit/dagcircuit/dagdependency.py
Outdated
Updates DagDependency by adding edges to the newly added node (max_node) | ||
from the previously added nodes. | ||
For each previously added node (prev_node), an edge from prev_node to max_node | ||
is added if max_node is "reachable" from prev_node (this means that the two | ||
nodes can be made adjacent by commuting them with other nodes), but the two nodes | ||
themselves do not commute. | ||
""" | ||
max_node_id = len(self._multi_graph) - 1 | ||
max_node = self._multi_graph.get_node_data(max_node_id) | ||
max_node = self.get_node(max_node_id) | ||
|
||
for current_node_id in range(0, max_node_id): | ||
self._multi_graph.get_node_data(current_node_id).reachable = True | ||
# Check the commutation relation with reachable node, it adds edges if it does not commute | ||
reachable = [True] * max_node_id | ||
|
||
# Analyze nodes in the reverse topological order. | ||
# An improvement to the original algorithm is to consider only direct predecessors | ||
# and to avoid constructing the lists of forward and backward reachable predecessors | ||
# for every node when not required. | ||
for prev_node_id in range(max_node_id - 1, -1, -1): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, do you mind leaving a comment here explaining the context around this because every time I read this code I'll have the same thought. So basically just say inline this context doesn't have removals since it's only called when creating a new dag dependency from another representation of a circuit.
@@ -454,6 +454,7 @@ def _gather_succ(self, node_id, direct_succ): | |||
the lists of direct successors are put into a single one | |||
""" | |||
gather = self._multi_graph | |||
gather.get_node_data(node_id).successors = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you mine putting a TODO comment somewhere expressing the desire to remove the this caching in the dag nodes then. I think I was letting my discontent with how this structure is currently built cloud my review here. If we just have an inline comment somewhere saying this refactor is on the table in the future that'll be enough for me.
Hmm, building docs started to fail after merging the main branch. Is this an instance of "things may go wrong on week-ends"? |
There was a new sphinx release over the weekend that broke the docs build, they pushed out 5.2.1 yesterday evening to fix it |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM now, with a small typo in the release. Thanks for adding the comments I think that makes it clear this is just an incremental step and in a coming future PR we'll have a larger refactor of the class and template optimization
releasenotes/notes/dag_dependency_speedup-f6298348cb3d8746.yaml
Outdated
Show resolved
Hide resolved
Co-authored-by: Matthew Treinish <mtreinish@kortar.org>
Summary
This is a follow-up on PR #8184.
DAGDependency
is a canonical representation of non-commutativity in a circuit. Currently, when aDAGDependency
is constructed, for every node we automatically construct the lists of all of its transitive predecessors and successors. This makes the construction slow and especially highly memory-intensive for circuits with many gates. On the other hand, these lists of predecessors/successors are not needed for many applications ofDagDependency
(see PR #8319).This PR modifies the code for constructing
DAGDependency
, so that only direct (immediate) predecessors of a node are considered, and introduces an additional optioncreate_preds_and_succs
to converterscircuit_to_dagdependency
anddag_to_dagdependency
to avoid constructing transitive predecessors/successors when the value of this option is False. This improves the runtime and memory consumption ofDAGDependency
constructions.Details and comments
Some experimental results (compared with the results in #8184).
Constructing DAGDependency (without predecessors/successors):
Before:
New:
The DAGDependency statistics (number of nodes, number of edges, depth of the circuit) are of course unchanged, but the times are improved 2x.
Memory consumption (without predecessors/successors)
Before:
New: