Improving memory consumption of DagDependency when lists of transitive predecessors and successors are not required #8525

alexanderivrii · 2022-08-11T15:14:15Z

Summary

This is a follow-up on PR #8184. DAGDependency is a canonical representation of non-commutativity in a circuit. Currently, when a DAGDependency is constructed, for every node we automatically construct the lists of all of its transitive predecessors and successors. This makes the construction slow and especially highly memory-intensive for circuits with many gates. On the other hand, these lists of predecessors/successors are not needed for many applications of DagDependency (see PR #8319).

This PR modifies the code for constructing DAGDependency, so that only direct (immediate) predecessors of a node are considered, and introduces an additional option create_preds_and_succs to converters circuit_to_dagdependency and dag_to_dagdependency to avoid constructing transitive predecessors/successors when the value of this option is False. This improves the runtime and memory consumption of DAGDependency constructions.

Details and comments

Some experimental results (compared with the results in #8184).

Constructing DAGDependency (without predecessors/successors):

Before:

DagDependency: #qubits = 5, #gates = 100, seed = 0, #edges = 151, depth = 30, time = 0.0439
DagDependency: #qubits = 5, #gates = 1000, seed = 0, #edges = 1618, depth = 336, time = 1.0921
DagDependency: #qubits = 5, #gates = 5000, seed = 0, #edges = 8007, depth = 1650, time = 19.8801
DagDependency: #qubits = 10, #gates = 100, seed = 0, #edges = 162, depth = 22, time = 0.0469
DagDependency: #qubits = 10, #gates = 1000, seed = 0, #edges = 1672, depth = 200, time = 1.2437
DagDependency: #qubits = 10, #gates = 5000, seed = 0, #edges = 8333, depth = 943, time = 22.9258

New:

#qubits = 5, #gates =  100: size = 100, #edges = 151, depth = 30, time = 0.0389
#qubits = 5, #gates =  1000: size = 1000, #edges = 1618, depth = 336, time = 0.5190
#qubits = 5, #gates =  5000: size = 5000, #edges = 8007, depth = 1650, time = 10.3534
#qubits = 10, #gates =  100: size = 100, #edges = 162, depth = 22, time = 0.0399
#qubits = 10, #gates =  1000: size = 1000, #edges = 1672, depth = 200, time = 0.5398
#qubits = 10, #gates =  5000: size = 5000, #edges = 8333, depth = 943, time = 10.3706

The DAGDependency statistics (number of nodes, number of edges, depth of the circuit) are of course unchanged, but the times are improved 2x.

Memory consumption (without predecessors/successors)
Before:

#qubits = 5, #gates = 100, memory = 0.3 MB
#qubits = 5, #gates = 1000, memory = 9.4 MB
#qubits = 5, #gates = 5000, memory = 209.3 MB
#qubits = 10, #gates = 100, memory = 0.1 MB
#qubits = 10, #gates = 1000, memory = 7.8 MB
#qubits = 10, #gates = 5000, memory = 199.1 MB

New:

#qubits = 5, #gates = 100, memory = 0.4 MB
#qubits = 5, #gates = 1000, memory = 1.7 MB
#qubits = 5, #gates = 5000, memory = 5.7 MB
#qubits = 10, #gates = 100, memory = 0.0 MB
#qubits = 10, #gates = 1000, memory = 0.0 MB
#qubits = 10, #gates = 5000, memory = 2.8 MB

…ssors and predecessors are not needed

qiskit-bot · 2022-08-11T15:14:18Z

Thank you for opening a new pull request.

Before your PR can be merged it will first need to pass continuous integration tests and be reviewed. Sometimes the review process can be slow, so please be patient.

While you're waiting, please feel free to review other open PRs. While only a subset of people are authorized to approve pull requests for merging, everyone is encouraged to review open pull requests. Doing reviews helps reduce the burden on the core team and helps make the project's code better for everyone.

One or more of the the following people are requested to review this:

@Qiskit/terra-core

coveralls · 2022-08-14T06:50:41Z

Pull Request Test Coverage Report for Build 3130817582

26 of 26 (100.0%) changed or added relevant lines in 3 files are covered.
5 unchanged lines in 2 files lost coverage.
Overall coverage decreased (-0.004%) to 84.413%

Files with Coverage Reduction	New Missed Lines	%
qiskit/extensions/quantum_initializer/squ.py	2	79.78%
qiskit/dagcircuit/dagdependency.py	3	86.73%

Totals
Change from base Build 3130815002:	-0.004%
Covered Lines:	59603
Relevant Lines:	70609

💛 - Coveralls

mtreinish

This looks like a fine incremental improvement (except for one potential bug in line), but my concern which may be larger than this PR is that DAGDependency doesn't seem to leverage any of the graph data structure to analyze structural aspects of the graph. In other words we shouldn't need to store in the node it's list of adjacent nodes because the underlying graph object makes that quick to lookup when needed.

So I guess my higher level question about this is instead of just reducing the set of what is stored in the node why can't we just remove that node.successors and node.predecessors completely because the data is duplicated with what we're building in the graph already.

mtreinish · 2022-08-23T16:38:42Z

qiskit/dagcircuit/dagdependency.py

@@ -454,6 +454,7 @@ def _gather_succ(self, node_id, direct_succ):
            the lists of direct successors are put into a single one
        """
        gather = self._multi_graph
+        gather.get_node_data(node_id).successors = []


I'm still lost on why this is needed? Storing attributes of graph structural properties like this seem weird to me when we're building this all around a graph data structure. Like if you want the list of successor nodes from a given node in the graph this is just self._multi_graph.successors(node_id): https://qiskit.org/documentation/retworkx/apiref/retworkx.PyDiGraph.successors.html#retworkx.PyDiGraph.successors. There is also a version with a filter function built-in if needed: https://qiskit.org/documentation/retworkx/apiref/retworkx.PyDiGraph.find_successors_by_edge.html#retworkx.PyDiGraph.find_successors_by_edge

If you need to full descendants set you can use rx.descendants() (https://qiskit.org/documentation/retworkx/apiref/retworkx.descendants.html) caching any of these graph relationships at the node level seems a bit unnecessary when we have the graph data structure already and can query it on demand. Unless there is something I'm missing here.

I am guessing that DagDependency was originally created to be primarily used in TemplateOptimization pass. For instance, look at the fields of DAGDepNode:

__slots__ = ["type", "_op", "name", "_qargs", "cargs", "sort_key", "node_id", "successors", "predecessors", "reachable", matchedwith", "isblocked", "successorstovisit", "qindices", "cindices"]

many of these only make sense in the context of TemplateOptimization.

For caching vs. not caching lists of direct and transitive predecessors and successors, there is likely some tradeoff between performance and memory consumption, depending on how many times TemplateOptimization happens to call these. I haven't really experimented with TemplateOptimization, so I don't know, but for this PR I don't want to risk making TemplateOptimization even slower than it is now, even though I completely agree that it probably makes perfect sense to use self._multi_graph.successors instead of self.get_node_data(node_id).successors. I do want this PR, as an enabler for block collection strategies that take commutativity into account (#8319), as these are currently slow and extremely memory-hungry to be applied to large circuits.

I do think it would be a good idea to refactor DagDepNode, DagDependency and TemplateOptimization pass.

Would you mine putting a TODO comment somewhere expressing the desire to remove the this caching in the dag nodes then. I think I was letting my discontent with how this structure is currently built cloud my review here. If we just have an inline comment somewhere saying this refactor is on the table in the future that'll be enough for me.

I have added the "TODO" comment. Just to be clear, I think that we should definitely refactor DAGDependency, however a lot of it might go hand-in-hand with doing changes to TempateOptimization and I did not want to touch this in the current PR.

mtreinish · 2022-08-23T16:43:03Z

qiskit/dagcircuit/dagdependency.py

+        Updates DagDependency by adding edges to the newly added node (max_node)
+        from the previously added nodes.
+        For each previously added node (prev_node), an edge from prev_node to max_node
+        is added if max_node is "reachable" from prev_node (this means that the two
+        nodes can be made adjacent by commuting them with other nodes), but the two nodes
+        themselves do not commute.
        """
        max_node_id = len(self._multi_graph) - 1
-        max_node = self._multi_graph.get_node_data(max_node_id)
+        max_node = self.get_node(max_node_id)

-        for current_node_id in range(0, max_node_id):
-            self._multi_graph.get_node_data(current_node_id).reachable = True
-        # Check the commutation relation with reachable node, it adds edges if it does not commute
+        reachable = [True] * max_node_id
+
+        # Analyze nodes in the reverse topological order.
+        # An improvement to the original algorithm is to consider only direct predecessors
+        # and to avoid constructing the lists of forward and backward reachable predecessors
+        # for every node when not required.
        for prev_node_id in range(max_node_id - 1, -1, -1):


This code won't necessarily work if there are any removals in the graph. The list of node ids in a rustworkx graph might have holes if there are intermediate removals (see the first code example on https://qiskit.org/documentation/retworkx/apiref/retworkx.PyDiGraph.html#retworkx.PyDiGraph or https://qiskit.org/documentation/retworkx/tutorial/introduction.html#removing-elements-from-a-graph) you might end up trying to use an index not in the graph by doing a reverse range like this. It's better to rely on graph.node_indices() to get a sequence of node indices in the graph and iterate over that in the order required.

Thanks! I completely agree, but note that I have not changed how nodes in DagDependency are iterated, the only real change is to consider direct predecessors instead of transitive predecessors, and to avoid storing transitive predecessors if not required (and the same for successors). This whole function _update_edges is very strange and (imho) can only be used when constructing DagDependency from QuantumCircuit or from DagCircuit (which is the case right now), where we don't need to worry about holes in the graph.

Ok, do you mind leaving a comment here explaining the context around this because every time I read this code I'll have the same thought. So basically just say inline this context doesn't have removals since it's only called when creating a new dag dependency from another representation of a circuit.

I have added this comment to the doc string of _update_edges, and added another TODO to rethink DAGDependency API. At the moment, it's not clear to me whether there are potential usages of DAGDependency that will require to call _update_edges except when DAGDependency is constructed from another representation of a circuit.

alexanderivrii · 2022-09-12T13:48:50Z

@mtreinish, thanks for the review! I presume that

except for one potential bug in line

refers to iterating over nodes without taking removals into account, but (as I commented above) this code is currently only used when DagDependency is originally constructed.

My goal is to continue with collecting blocks of gates taking gate commutativity into account, and this PR is needed to make block collection faster and significantly less memory-hungry for large circuits.

I do think that it would be a good idea to rethink and refactor both DagDepNode and TemplateOptimization code to make better use of rustworkx capabilities.

Is there anything specific that I should do for this PR?

mtreinish

I think given that you're saying that my review comments were just temporary conditions and you're leaving the current interface (and potential bugs) as is and concentrating on just memory overhead/performance this PR looks fine to me. I think we should just put some comments inline explaining this and then lets move forward on this.

mtreinish · 2022-09-22T14:01:02Z

qiskit/dagcircuit/dagdependency.py

+        Updates DagDependency by adding edges to the newly added node (max_node)
+        from the previously added nodes.
+        For each previously added node (prev_node), an edge from prev_node to max_node
+        is added if max_node is "reachable" from prev_node (this means that the two
+        nodes can be made adjacent by commuting them with other nodes), but the two nodes
+        themselves do not commute.
        """
        max_node_id = len(self._multi_graph) - 1
-        max_node = self._multi_graph.get_node_data(max_node_id)
+        max_node = self.get_node(max_node_id)

-        for current_node_id in range(0, max_node_id):
-            self._multi_graph.get_node_data(current_node_id).reachable = True
-        # Check the commutation relation with reachable node, it adds edges if it does not commute
+        reachable = [True] * max_node_id
+
+        # Analyze nodes in the reverse topological order.
+        # An improvement to the original algorithm is to consider only direct predecessors
+        # and to avoid constructing the lists of forward and backward reachable predecessors
+        # for every node when not required.
        for prev_node_id in range(max_node_id - 1, -1, -1):


Ok, do you mind leaving a comment here explaining the context around this because every time I read this code I'll have the same thought. So basically just say inline this context doesn't have removals since it's only called when creating a new dag dependency from another representation of a circuit.

mtreinish · 2022-09-22T14:02:49Z

qiskit/dagcircuit/dagdependency.py

@@ -454,6 +454,7 @@ def _gather_succ(self, node_id, direct_succ):
            the lists of direct successors are put into a single one
        """
        gather = self._multi_graph
+        gather.get_node_data(node_id).successors = []


Would you mine putting a TODO comment somewhere expressing the desire to remove the this caching in the dag nodes then. I think I was letting my discontent with how this structure is currently built cloud my review here. If we just have an inline comment somewhere saying this refactor is on the table in the future that'll be enough for me.

alexanderivrii · 2022-09-25T09:04:40Z

Hmm, building docs started to fail after merging the main branch. Is this an instance of "things may go wrong on week-ends"?

mtreinish · 2022-09-26T15:55:47Z

Hmm, building docs started to fail after merging the main branch. Is this an instance of "things may go wrong on week-ends"?

There was a new sphinx release over the weekend that broke the docs build, they pushed out 5.2.1 yesterday evening to fix it

mtreinish

LGTM now, with a small typo in the release. Thanks for adding the comments I think that makes it clear this is just an incremental step and in a coming future PR we'll have a larger refactor of the class and template optimization

releasenotes/notes/dag_dependency_speedup-f6298348cb3d8746.yaml

Co-authored-by: Matthew Treinish <mtreinish@kortar.org>

Improved construction of DagDependency when lists of transitive succe…

a808d8e

…ssors and predecessors are not needed

alexanderivrii requested a review from a team as a code owner August 11, 2022 15:14

alexanderivrii added 3 commits August 11, 2022 18:31

typos

22f5f11

minor

a11c414

Merge branch 'main' into dag_dependency_speedup

d238a29

Merge branch 'main' into dag_dependency_speedup

19c4354

alexanderivrii added the enhancement label Aug 14, 2022

ShellyGarion assigned mtreinish Aug 22, 2022

mtreinish reviewed Aug 23, 2022

View reviewed changes

Merge branch 'main' into dag_dependency_speedup

694f766

alexanderivrii changed the title ~~Improved construction of DagDependency~~ Improving memory consumption of DagDependency when lists of transitive predecessors and successors are not required Sep 12, 2022

alexanderivrii requested a review from mtreinish September 12, 2022 13:49

alexanderivrii added this to the 0.22 milestone Sep 14, 2022

mtreinish added the Changelog: New Feature Include in the "Added" section of the changelog label Sep 22, 2022

mtreinish reviewed Sep 22, 2022

View reviewed changes

mtreinish added the performance label Sep 22, 2022

alexanderivrii added 2 commits September 25, 2022 10:20

Merge branch 'main' into dag_dependency_speedup

5fbb419

Adding TODO and explanation messages following review comments

339e183

Merge branch 'main' into dag_dependency_speedup

e39dfa9

mtreinish previously approved these changes Sep 26, 2022

View reviewed changes

releasenotes/notes/dag_dependency_speedup-f6298348cb3d8746.yaml Outdated Show resolved Hide resolved

alexanderivrii dismissed mtreinish’s stale review via 895d6bc September 26, 2022 16:39

Update releasenotes/notes/dag_dependency_speedup-f6298348cb3d8746.yaml

895d6bc

Co-authored-by: Matthew Treinish <mtreinish@kortar.org>

mtreinish approved these changes Sep 26, 2022

View reviewed changes

mtreinish added the automerge label Sep 26, 2022

Merge branch 'main' into dag_dependency_speedup

c601902

mergify bot merged commit 2f6ea4b into Qiskit:main Sep 26, 2022

alexanderivrii mentioned this pull request Feb 7, 2023

Bug fix in template optimization, and minor cleanup #9541

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improving memory consumption of DagDependency when lists of transitive predecessors and successors are not required #8525

Improving memory consumption of DagDependency when lists of transitive predecessors and successors are not required #8525

alexanderivrii commented Aug 11, 2022

qiskit-bot commented Aug 11, 2022

coveralls commented Aug 14, 2022 •

edited

Loading

mtreinish left a comment

mtreinish Aug 23, 2022

alexanderivrii Sep 12, 2022

mtreinish Sep 22, 2022

alexanderivrii Sep 25, 2022

mtreinish Aug 23, 2022

alexanderivrii Sep 12, 2022

mtreinish Sep 22, 2022

alexanderivrii Sep 25, 2022

alexanderivrii commented Sep 12, 2022

mtreinish left a comment

mtreinish Sep 22, 2022

mtreinish Sep 22, 2022

alexanderivrii commented Sep 25, 2022

mtreinish commented Sep 26, 2022

mtreinish left a comment

Improving memory consumption of DagDependency when lists of transitive predecessors and successors are not required #8525

Improving memory consumption of DagDependency when lists of transitive predecessors and successors are not required #8525

Conversation

alexanderivrii commented Aug 11, 2022

Summary

Details and comments

qiskit-bot commented Aug 11, 2022

coveralls commented Aug 14, 2022 • edited Loading

Pull Request Test Coverage Report for Build 3130817582

💛 - Coveralls

mtreinish left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

alexanderivrii commented Sep 12, 2022

mtreinish left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

alexanderivrii commented Sep 25, 2022

mtreinish commented Sep 26, 2022

mtreinish left a comment

Choose a reason for hiding this comment

coveralls commented Aug 14, 2022 •

edited

Loading