Restructuring of the heuristic system #410

EliasLF · 2024-01-02T03:41:57Z

Description

This PR addresses multiple problems that are too interconnected to split into separate PRs:

- Adding 2 new heuristics, GateCountSumDistanceMinusSharedSwaps and GateCountMaxDistanceOrSumDistanceMinusSharedSwaps. The former is the sum of all qubit pair distances minus an upper bound on how many swaps could potentially be saved by sharing them with other moving qubits. The latter is the dominating heuristic over the previous GateCountMaxDistance and the new GateCountSumDistanceMinusSharedSwaps, i.e. the maximum of the two.
  Tested on a subset of MQTbench with mapping to IBM Brisbane, GateCountMaxDistanceOrSumDistanceMinusSharedSwaps reduces the effective branching rate by about 10% compared to GateCountMaxDistance while yielding the same results (if lookahead is disabled), since both heuristics are principally admissible.
  Unfortunately, the current default lookahead heuristic GateCountMaxDistance does not seem to work well with this new heuristic: the combination results in slightly higher costs. This is probably due to scaling issues (the new heuristic is closer to the real cost and therefore larger, which reduces the impact of the lookahead penalty), but it might be solvable with higher lookahead factors or a new, better-suited lookahead heuristic.
- Introducing a more flexible system for heuristics (both for the main heuristic and the lookahead heuristic). In the new system, all implementation specifics of a heuristic are isolated in a single function that calculates a heuristic value from a search node. Outside of those functions, only a few characteristics of the heuristics are relevant for the mapper:
  - principal admissibility (i.e. admissibility at least on the optimal solution path)
  - tightness (i.e. the heuristic is 0 in all goal nodes)
  - fidelity-awareness
- Fixing the handling of CNOT reversals in Dijkstra, the fixed-cost calculation, and all pre-existing heuristics. The current implementation resulted in both non-admissible heuristics and fixed costs that did not accurately reflect the gates added by QMAP (the cost calculation currently assumes that at most 1 reversal is added, while QMAP actually inserts a reversal for each backwards CNOT). The new implementation is as robust as possible: it only fails to produce admissible heuristics in edge cases where the cumulative reversal costs on one edge surpass the SWAP costs, resulting in a non-convex cost space (which generally does not allow for heuristics that are both admissible and tight).
- Moving all methods from HeuristicMapper::Node to HeuristicMapper (since they grew more and more dependent on data structures from HeuristicMapper with recent PRs) and making them more atomic (i.e. reducing the points at which nodes are in an inconsistent state).
- Adding support for semi-directional architectures (i.e. architectures with both bidirectional and unidirectional edges).
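
For illustration only (this is not code from the PR): a minimal sketch of how a heuristic could be reduced to one evaluation function plus the three characteristics listed above, and how a dominating heuristic could be formed as the maximum of two others. The `Node` fields, the `Heuristic` struct, the factory functions and all flag values are hypothetical placeholders; the real shared-swap bound is not reproduced here.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>

// Hypothetical, heavily simplified search node: the two fields stand in for
// values a real mapper would derive from the current qubit mapping.
struct Node {
  double maxDistance = 0.;
  double sumDistanceMinusSharedSwaps = 0.;
};

// One evaluation function plus the characteristics the mapper cares about.
struct Heuristic {
  std::function<double(const Node&)> evaluate;
  bool principallyAdmissible;
  bool tight;
  bool fidelityAware;
};

// Dominating heuristic: the point-wise maximum of two heuristics.
Heuristic dominating(const Heuristic& a, const Heuristic& b) {
  assert(a.evaluate && b.evaluate);
  return Heuristic{
      [a, b](const Node& n) { return std::max(a.evaluate(n), b.evaluate(n)); },
      // the maximum of two lower bounds is still a lower bound
      a.principallyAdmissible && b.principallyAdmissible,
      // for non-negative heuristics the maximum is 0 only if both are 0
      a.tight && b.tight,
      // assumption: fidelity-aware as soon as one component is
      a.fidelityAware || b.fidelityAware};
}

// Placeholder factories; the flag values are illustrative, not taken from QMAP.
Heuristic maxDistanceHeuristic() {
  return {[](const Node& n) { return n.maxDistance; }, true, true, false};
}

Heuristic sumMinusSharedSwapsHeuristic() {
  return {[](const Node& n) { return n.sumDistanceMinusSharedSwaps; }, true,
          true, false};
}
```

With such a structure the mapper would only have to inspect the three flags and never the body of `evaluate`.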

List of minor changes and bug fixes:

- Cleaning up HeuristicMapper::Node:
  - removing nswaps because of redundancy with swaps.size()
  - making swaps a one-dimensional vector (I assume the inner dimension was originally intended to allow adding multiple swaps per node; however, this is currently not implemented, so all inner vectors are just of length 1)
  - renaming done to validMapping, which better describes the property now that non-tight heuristics have been introduced to QMAP
- Fixing the tracking of HeuristicMapper::Node::validMappedTwoQubitGates and also activating it for non-fidelity-aware heuristics, which enables a more efficient check of whether a node has a valid mapping.
- Reducing the register sizes in the example circuits to the actual number of qubits used in each circuit, which makes it easier for the tests to check the minimum required architecture size.
- Making the ordering of validly mapped search nodes consistent (as defined by operator>). Previously, all validly mapped nodes with the same total cost were considered equal, resulting in an arbitrary ordering in the priority queue and thereby potentially different optimal solutions for different principally admissible heuristics.
- Fixing a bug in the handling of teleportation qubits in HeuristicMapper::createInitialMapping that caused this function to enter an endless loop in some edge cases: instead of randomly iterating over the whole coupling map, it only iterated over a part of it because the upper limit for the RNG was too low, which caused an endless loop once that subgraph was fully mapped but free teleportation qubits remained.
- Removing the redundant Mapper::fidelities.
- Simplifying Dijkstra::buildEdgeSkipTable by internally calling Dijkstra::buildTable for the 0th dimension.
- Moving the checks for invalid settings from HeuristicMapper::map to the new method HeuristicMapper::checkParameters.
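
For illustration only (not the operator> from this PR): a minimal sketch of how a priority-queue ordering can be made consistent by adding tie-breakers after the total cost, so that nodes with equal total cost are no longer treated as equal. The `Node` fields, the choice of tie-breakers and the `id` field are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <tuple>
#include <vector>

// Hypothetical, simplified search node.
struct Node {
  double costFixed = 0.;
  double costHeur = 0.;
  bool validMapping = false;
  std::size_t id = 0; // assumed unique per node, used as the last tie-breaker
};

// "a > b" means that a is the worse node. Nodes are compared by total cost
// first; ties are broken by preferring valid mappings, then the lower
// heuristic cost, then the lower id, so no two distinct nodes compare equal.
bool operator>(const Node& a, const Node& b) {
  const auto key = [](const Node& n) {
    return std::make_tuple(n.costFixed + n.costHeur, !n.validMapping,
                           n.costHeur, n.id);
  };
  return key(a) > key(b);
}

// Min-queue: with std::greater the best node ends up on top.
using NodeQueue =
    std::priority_queue<Node, std::vector<Node>, std::greater<Node>>;

Node popBest(NodeQueue& queue) {
  assert(!queue.empty());
  Node best = queue.top();
  queue.pop();
  return best;
}
```

Because the key is a total order, the pop order no longer depends on insertion order or on the internals of the heap.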

Checklist:

The pull request only contains commits that are related to it.
I have added appropriate tests and documentation.
I have made sure that all CI jobs on GitHub pass.
The pull request introduces no new warnings and follows the project's style guidelines.

burgholzer

Many thanks for another great contribution. What a nice way to start the year!
I really like the direction this PR is taking. Makes the code much more organized.
Most of the comments below are really just comments. Feel free to read them, think about them, and discard them if you don't agree with them.
I think the only point where I might not be that happy is the Architecture class as some redundancy seems to be introduced by the PR. We should be able to clarify this quickly though!

EliasLF · 2024-01-13T09:57:04Z

Alright, first of all thank you very much for the quick review!

Some of the design choices in this PR were made with future plans/issues in mind, which are (or were previously) not mentioned in the PR description above. Before going into detail in your code comments, maybe let's first discuss them in general here:

- semi-directional architectures (i.e. architectures with both bi- and unidirectional edges)
- user-defined gate costs (for the non-fidelity-aware case)
- arbitrary 2q gates

If I'm not mistaken, we already talked about all of these in person, but it's probably a good idea to also get this into writing here on GitHub:

The order above is (most likely) also the order of their relevance (from low to high) and unfortunately also the order of the incurred cost/complexity (from low to high).

To my knowledge, no semi-directional architectures currently exist in practice. However, our heuristic mapper is implemented abstractly enough that allowing for such architectures comes almost for free: 2 extra jump instructions per search node (for purely bi-/unidirectional architectures), a boolean field Architecture::isUnidirectional, a getter method for that field, and 1 small else-case in the loop of Architecture::createDistanceTable.
Originally, the plan was to tackle this issue in a future PR, but thanks to your comments I just realized that there were only 2 lines remaining (in the heuristic mapper) that assumed pure directionality. I therefore just pushed these remaining changes (+ a few optimizations) and added this "feature" to the PR description.
If you disagree with the decision to allow semi-directional architectures, it would be quite easy to revert the changes. In that case, however, I think we should at least explicitly check for and disallow loading such coupling maps (which is currently not the case; they are just not handled correctly during mapping).

User-defined gate costs and arbitrary 2q gates are probably much more relevant, but also more complex, which I only realized recently. As mentioned in the PR description, even the current implementation, strictly speaking, wrongly assumes a convex cost space, because cumulative reversal costs on logical edges can surpass swap costs (in the non-fidelity-aware case on non-bidirectional architectures). E.g. if there are 9 congruent 2q-gates in a layer that are already validly mapped to a back-edge, they incur 9*4=36 in reversal costs, whereas swapping them to a neighboring forward-edge with 1 swap only costs 34, i.e. the first goal node on the search path is not the optimal goal node.
This is currently not a huge problem, since it's rare to find 9 congruent CNOTs in 1 layer (it is only possible at all with Disjoint2qBlock layering) and they could easily be optimized down to at most 3 CNOTs in the pre-optimization stage. However, with more evened-out costs between H and CNOT, or different reversal operations for other 2q gates, this non-convexity might become more significant.
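
For illustration only: a small sketch of the arithmetic behind the example above. It takes the two figures from this comment as given (a reversal cost of 4 per gate and a cost of 34 for the alternative with 1 swap) and computes the smallest number of congruent gates at which staying on the back-edge becomes more expensive than moving. The function names are hypothetical.

```cpp
#include <cassert>

// Cumulative reversal cost of numGates congruent gates on one back-edge.
unsigned cumulativeReversalCost(unsigned numGates, unsigned reversalCostPerGate) {
  return numGates * reversalCostPerGate;
}

// Smallest number of gates n with n * reversalCostPerGate > alternativeCost,
// i.e. the point from which the first goal node is no longer the optimal one.
unsigned minGatesWhereMovingWins(unsigned reversalCostPerGate,
                                 unsigned alternativeCost) {
  assert(reversalCostPerGate > 0U);
  return alternativeCost / reversalCostPerGate + 1U;
}
```

With the figures above, 8 gates cost 32 in reversals (still below 34), while 9 gates cost 36, which is why 9 is the first count at which the non-convexity shows up.
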
The new distance/Dijkstra system not only fixes the currently flawed, non-admissible system but also gives flexibility for future heuristics that tackle this non-convexity problem by dropping the tightness constraint. In contrast to the previous custom Dijkstra solution, edge-skipping Dijkstra returns the correct distance for any combination of swap cost and reversal cost. With 1 execution with 1 skip (as currently used in the non-fidelity-aware case), it only increases the runtime complexity from O(E*log V) to O(E*log V + V²*E), which is not too bad for something that runs only once per mapping.
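
For illustration only: a generic sketch of shortest paths with a limited number of free ("skipped") edges, using the textbook technique of running Dijkstra over (vertex, skips used) states. This is not Dijkstra::buildEdgeSkipTable from QMAP, its complexity differs from the figure quoted above, and the "at most k skipped edges" semantics as well as all names are assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <limits>
#include <queue>
#include <tuple>
#include <vector>

struct Edge {
  std::size_t to;
  double weight;
};
using Graph = std::vector<std::vector<Edge>>; // adjacency lists

// result[k][v]: minimal cost from source to v if at most k edges on the path
// may be traversed for free. Unreachable vertices keep the value infinity.
std::vector<std::vector<double>> edgeSkipDistances(const Graph& g,
                                                   std::size_t source,
                                                   std::size_t maxSkips) {
  assert(source < g.size());
  const double inf = std::numeric_limits<double>::infinity();
  std::vector<std::vector<double>> dist(maxSkips + 1,
                                        std::vector<double>(g.size(), inf));
  using State = std::tuple<double, std::size_t, std::size_t>; // cost, skips, vertex
  std::priority_queue<State, std::vector<State>, std::greater<State>> queue;
  dist[0][source] = 0.;
  queue.emplace(0., std::size_t{0}, source);
  while (!queue.empty()) {
    const State state = queue.top();
    queue.pop();
    const double cost = std::get<0>(state);
    const std::size_t skips = std::get<1>(state);
    const std::size_t u = std::get<2>(state);
    if (cost > dist[skips][u]) {
      continue; // stale queue entry
    }
    for (const Edge& e : g[u]) {
      // option 1: pay for the edge
      if (cost + e.weight < dist[skips][e.to]) {
        dist[skips][e.to] = cost + e.weight;
        queue.emplace(dist[skips][e.to], skips, e.to);
      }
      // option 2: skip the edge, if skips are left
      if (skips < maxSkips && cost < dist[skips + 1][e.to]) {
        dist[skips + 1][e.to] = cost;
        queue.emplace(cost, skips + 1, e.to);
      }
    }
  }
  // turn "exactly k skips" into "at most k skips"
  for (std::size_t k = 1; k <= maxSkips; ++k) {
    for (std::size_t v = 0; v < g.size(); ++v) {
      dist[k][v] = std::min(dist[k][v], dist[k - 1][v]);
    }
  }
  return dist;
}
```

On a directed line 0→1→2→3 with weights 2, 5 and 1, the distance from 0 to 3 is 8 without skips, 3 with one skip (the edge of weight 5) and 1 with two skips.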

burgholzer · 2024-01-13T10:37:04Z

> Alright, first of all thank you very much for the quick review!
>
> Some of the design choices in this PR were made with future plans/issues in mind, which are (or were previously) not mentioned in the PR description above. Before going into detail in your code comments, maybe let's first discuss them in general here:

Thanks for the detailed answer. This cleared up quite a lot. I'll also start here before going in-depth with the comments.

> - semi-directional architectures (i.e. architectures with both bi- and unidirectional edges)
> - user-defined gate costs (for the non-fidelity-aware case)
> - arbitrary 2q gates
>
> If I'm not mistaken, we already talked about all of these in person, but it's probably a good idea to also get this into writing here on GitHub:
>
> The order above is (most likely) also the order of their relevance (from low to high) and unfortunately also the order of the incurred cost/complexity (from low to high).
>
> To my knowledge, no semi-directional architectures currently exist in practice. However, our heuristic mapper is implemented abstractly enough that allowing for such architectures comes almost for free: 2 extra jump instructions per search node (for purely bi-/unidirectional architectures), a boolean field Architecture::isUnidirectional, a getter method for that field, and 1 small else-case in the loop of Architecture::createDistanceTable. Originally, the plan was to tackle this issue in a future PR, but thanks to your comments I just realized that there were only 2 lines remaining (in the heuristic mapper) that assumed pure directionality. I therefore just pushed these remaining changes (+ a few optimizations) and added this "feature" to the PR description. If you disagree with the decision to allow semi-directional architectures, it would be quite easy to revert the changes. In that case, however, I think we should at least explicitly check for and disallow loading such coupling maps (which is currently not the case; they are just not handled correctly during mapping).

Now that makes a lot more sense! Although such architectures do not exist in practice at the moment, that could very well change, and it almost never hurts to be a little more general. So I am happy with the changes here! Even better to hear that the review helped identify the remaining places that needed changes.

> User-defined gate costs and arbitrary 2q gates are probably much more relevant, but also more complex, which I only realized recently.

I feared as much, although I still hope that the number of cases that need to be added to handle arbitrary two-qubit gates stays reasonably low. In fact, for bidirectional architectures there shouldn't be too many changes at all. There might be some more optimization potential (e.g., commutation rules involving controlled gates), but we are not taking advantage of that at the moment anyway. The unidirectional case might be trickier, but I think this PR already lays a solid foundation for that.

> As mentioned in the PR description, even the current implementation, strictly speaking, wrongly assumes a convex cost space, because cumulative reversal costs on logical edges can surpass swap costs (in the non-fidelity-aware case on non-bidirectional architectures). E.g. if there are 9 congruent 2q-gates in a layer that are already validly mapped to a back-edge, they incur 9*4=36 in reversal costs, whereas swapping them to a neighboring forward-edge with 1 swap only costs 34, i.e. the first goal node on the search path is not the optimal goal node. This is currently not a huge problem, since it's rare to find 9 congruent CNOTs in 1 layer (it is only possible at all with Disjoint2qBlock layering) and they could easily be optimized down to at most 3 CNOTs in the pre-optimization stage. However, with more evened-out costs between H and CNOT, or different reversal operations for other 2q gates, this non-convexity might become more significant. The new distance/Dijkstra system not only fixes the currently flawed, non-admissible system but also gives flexibility for future heuristics that tackle this non-convexity problem by dropping the tightness constraint. In contrast to the previous custom Dijkstra solution, edge-skipping Dijkstra returns the correct distance for any combination of swap cost and reversal cost. With 1 execution with 1 skip (as currently used in the non-fidelity-aware case), it only increases the runtime complexity from O(E*log V) to O(E*log V + V²*E), which is not too bad for something that runs only once per mapping.

I agree that this mostly seems like a theoretical issue for now. Especially since I would guess that the cost difference between single- and two-qubit gates will always stay rather big.
I also agree that the runtime trade-off is definitely worth it. Technically it is not once per mapping, but once per architecture. That information could be pre-computed and re-used, right? At some point we might need a system for such pre-computations (similar to how we did that for the sub-architectures feature, which is unfortunately not as neatly integrated as I would love it to be).

EliasLF · 2024-01-15T07:45:10Z

> Technically it is not once per mapping, but once per architecture. That information could be pre-computed and re-used, right? At some point we might need a system for such pre-computations

I agree, pre-computing all the distance values for common architectures (and maybe even a system for saving them to a file for custom architectures) is a great idea and would open up the possibility of using metrics that are otherwise too expensive to compute for each mapping process.
For example, by computing all possible distances for any reversal cost (of which there are finitely many, since the cheapest path can only change so many times before skipping the most expensive edge), we could solve the reversal cost problem robustly for the general case.

Too bad, the same is not possible/useful for fidelity distances because of their variability over time. Pre-computing viable swap sharing combinations would solve so much of the gap between the heuristic and true cost there.
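
For illustration only: a minimal sketch of what such a pre-computation system could look like in its simplest form, namely an in-memory cache that builds a distance table once per architecture and re-uses it afterwards. `DistanceTableCache`, the string key and the builder callback are hypothetical; nothing like this is part of the PR, and saving tables to a file is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

using DistanceTable = std::vector<std::vector<double>>;

// Builds each table at most once per key and hands out references afterwards.
class DistanceTableCache {
public:
  using Builder = std::function<DistanceTable()>;

  // Returns the cached table for the given key, building it on first use.
  const DistanceTable& getOrBuild(const std::string& architectureName,
                                  const Builder& build) {
    assert(static_cast<bool>(build));
    auto it = tables.find(architectureName);
    if (it == tables.end()) {
      it = tables.emplace(architectureName, build()).first;
      ++builds;
    }
    return it->second; // references into std::map stay valid on later inserts
  }

  std::size_t numBuilds() const { return builds; }

private:
  std::map<std::string, DistanceTable> tables;
  std::size_t builds = 0;
};
```

For data that changes over time, such as fidelities, the key would additionally need something like a calibration identifier so that outdated tables are not re-used.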

burgholzer · 2024-01-15T08:19:19Z

> Technically it is not once per mapping, but once per architecture. That information could be pre-computed and re-used, right? At some point we might need a system for such pre-computations
>
> I agree, pre-computing all the distance values for common architectures (and maybe even a system for saving them to a file for custom architectures) is a great idea and would open up the possibility of using metrics that are otherwise too expensive to compute for each mapping process.
>
> For example, by computing all possible distances for any reversal cost (of which there are finitely many, since the cheapest path can only change so many times before skipping the most expensive edge), we could solve the reversal cost problem robustly for the general case.
>
> Too bad, the same is not possible/useful for fidelity distances because of their variability over time. Pre-computing viable swap sharing combinations would solve so much of the gap between the heuristic and true cost there.

Although fidelity data varies over time, whether it is worth investing the time to pre-compute these values depends entirely on the frequency of calibrations.
If calibration only happens once every couple of hours and computing the tables takes a couple of minutes (assuming HPC resources are available), it could be worth it.

EliasLF · 2024-01-15T09:36:56Z

> Although fidelity data varies over time, whether it is worth investing the time to pre-compute these values depends entirely on the frequency of calibrations.
> If calibration only happens once every couple of hours and computing the tables takes a couple of minutes (assuming HPC resources are available), it could be worth it.

Hm, interesting point. I guess that's true if you map multiple circuits per calibration cycle, yes. I will keep it in mind as an option then 👍

EliasLF added 4 commits December 26, 2023 09:52

first implementation

8603bfd

take max of old and new heuristic

4204205

introduce setting to choose heuristic and lookahead heuristic

f395749

restructure tests

4bfb646

EliasLF added the feature, c++, fix, and code quality labels Jan 2, 2024

pre-commit-ci bot and others added 2 commits January 2, 2024 03:42

🎨 pre-commit fixes

230a1d6

Merge branch 'main' into heuristicsSetting

f19a696

github-advanced-security bot found potential problems Jan 2, 2024

EliasLF and others added 15 commits January 2, 2024 18:48

Merge branch 'main' into heuristicsSetting

93d836d

adding comments to large test functions

2c46590

fix CNOT reversals

c60d736

fix further bugs in heuristic properties test

e030504

move reversals into costFixed for goal nodes

318a700

Merge branch 'main' into heuristicsSetting

fb7eb9e

fix tests after merge

f1a54a5

making local variables const

c4c30d5

fix NodeCostCalculation test

d882ac4

fix earlyTermination test

8824d46

further optimize GateCountSumDistanceMinusSharedSwaps

6ff7b6e

fix calculation of shared swaps

72f4012

fix register sizes in example circuits

63fbf1b

adding documentation

d2106bb

🎨 pre-commit fixes

74f706a

github-advanced-security bot found potential problems Jan 10, 2024

EliasLF added 3 commits January 11, 2024 00:54

fix test_compile

9d56889

expose heuristic enums in python module

a1f0bb7

fixing teleport mapping bug

cc0cd8c

EliasLF and others added 3 commits January 12, 2024 06:24

Merge branch 'main' into heuristicsSetting

c98d899

🎨 pre-commit fixes

0bb568d

fixing case style of optimalSolutions

cd9fa90

EliasLF marked this pull request as ready for review January 12, 2024 06:26

burgholzer added this to the Fidelity-aware Quantum Circuit Mapping milestone Jan 12, 2024

burgholzer requested changes Jan 12, 2024

EliasLF and others added 4 commits January 13, 2024 09:56

finish heuristic support for semidirectional architectures

1c28eac

isEdgeBidirectional and considerDirection in isEdgeConnected

477a429

adding comments for isBidirectional and isUnidirectional

3f4c604

🎨 pre-commit fixes

c37145d

EliasLF and others added 6 commits January 15, 2024 06:36

remove redundant code

b6e471d

simplify buildEdgeSkipTable

76e3006

expose heuristic properties in config json

8bcb9ce

checkParameters

2d1e361

ignore bugprone-branch-clone lint warning

c940d49

🎨 pre-commit fixes

e599ab8

github-advanced-security bot found potential problems Jan 15, 2024

fix linter warning in Configuration::json

8b7cdf1

use asserts for internal error checking

36b8a8a

EliasLF and others added 5 commits January 15, 2024 10:38

use copy constructor for Node members

08c7588

🎨 pre-commit fixes

ad9cf4a

user ordered sets for activeQubits

08514f2

Merge branch 'heuristicsSetting' of https://github.com/EliasLF/qmap into heuristicsSetting

6f6f894

🎨 pre-commit fixes

ade1f54

EliasLF merged commit 63eaf08 into cda-tum:main Jan 16, 2024
34 checks passed
