Improve the CFG reconstruction #520
Open
+430
−169
This PR drastically improves the way we reconstruct the CFG.
The new algorithm
What this PR changes is how we find the best candidates for what I call the "switch exits": the blocks where the control flow should rejoin after a switch, and which should thus be placed just after the switch when reconstructing the CFG. For instance, in the snippet of code below, the switch exit is `s2`:

The "best" switch exit is the block which comes earliest in the topological order and where the maximum number of paths rejoin. For instance, consider the following code:
The CFG is as follows:
Note that the node where the maximum number of paths rejoin is H0.
The way I compute this is as follows.
We consider the CFG from which we removed the backward edges (of the loops): this graph is acyclic. Imagine you take a volume of water equal to 1 and put it at the block where the switch is (e.g., `A` in the example above), which is the highest point, and let this water flow down along the edges of the graph. Whenever there is a branching, the flow of water is divided equally between the outgoing paths. Whenever several paths join, the flow of water is the sum of the flows coming from those paths. On the graph above, it gives the following (I annotated the nodes with the flow of water, which is a fraction):

For our switch exit we then simply pick the node with the highest flow (which is 8/9 here) and, among the nodes with that flow, the one which comes earliest in the topological order (that is: H0).
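To make the water analogy concrete, here is a minimal Python sketch of the flow computation. It is not the code from this PR: the function names `compute_flows` and `best_switch_exit`, the successor-map representation, and the small example graph are all assumptions made for illustration. It also assumes the backward edges have already been removed, so the input graph is acyclic.

```python
from fractions import Fraction
from graphlib import TopologicalSorter


def compute_flows(succs, source):
    """Propagate one unit of flow from `source` through an acyclic graph.

    `succs` maps each node to the list of its successors (hypothetical
    representation; back edges are assumed to be already removed).
    Returns a topological order of the nodes and the flow at each node.
    """
    # graphlib expects a map from each node to its predecessors.
    preds = {node: set() for node in succs}
    for node, targets in succs.items():
        for target in targets:
            preds.setdefault(target, set()).add(node)
    order = list(TopologicalSorter(preds).static_order())

    flow = {node: Fraction(0) for node in order}
    flow[source] = Fraction(1)
    for node in order:
        targets = succs.get(node, [])
        if targets:
            # Split the incoming flow equally between the outgoing edges;
            # flows arriving at the same node are summed.
            share = flow[node] / len(targets)
            for target in targets:
                flow[target] += share
    return order, flow


def best_switch_exit(succs, source):
    """Pick the node with the highest flow, earliest in topological order."""
    order, flow = compute_flows(succs, source)
    best = None
    for node in order:
        if node == source:
            continue
        # Strict comparison: on a tie, the earlier node is kept.
        if best is None or flow[node] > flow[best]:
            best = node
    return best


# Hypothetical example: a switch at A with three branches, one of which
# (R) returns early; the two others rejoin at D, which continues to E.
example = {"A": ["B", "C", "R"], "B": ["D"], "C": ["D"], "D": ["E"], "R": [], "E": []}
order, flow = compute_flows(example, "A")
assert flow["D"] == Fraction(2, 3)
assert best_switch_exit(example, "A") == "D"
```

In this example both D and E receive a flow of 2/3, and D is selected because it comes first in the topological order. Using exact fractions rather than floats avoids rounding issues when comparing flows.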
Side remark
I'm pretty sure the algorithm which computes the quantities above has a name, but I haven't managed to find it. For instance, this post on StackOverflow asks for exactly the same thing:
https://stackoverflow.com/questions/78221666/algorithm-for-total-flow-through-weighted-directed-acyclic-graph