Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fixes the free-vars-by-dominators computation algorithm #907

Merged

Conversation

ivg
Copy link
Member

@ivg ivg commented Dec 11, 2018

This algorithm kicks in in the optimization-level=3 and ... breaks
everything. The bug was trivial, the kill set was computed before
the live set, hence a block was defining its own variables before it
was used.

This bug affects only 1.5 version with optimization-level=3.

Note, there is still a subtle problem with this algorithm that comes
from the "Everything Dominates Unreachable Block" lemma in the
dominators algorithm. And altough this theorem is valid wrt to the
graph theoretic definition of unreachable, which indeed implies that a
block is not reachable under no circumstances, in the binary analysis
an unreachable block is just a block that might be reachable we just
don't know by which paths. In other words, we have an
under-approximation of a control flow graph, and the unsoundness of
the graph is propagated to the unsoundness of the dominator tree
computation. That lies more or less in the vein with the idea of the
optimization-level=3 which is as as unsound as the underlying graph.

This algorithm kicks in in the optimization-level=3 and ... breaks
everything. The bug was trivial, the kill set was computed before
the live set, hence a block was defining its own variables before it
was used.

This bug affects only 1.5 version with optimization-level=3.

Note, there is still a subtle problem with this algorithm that comes
from the "Everything Dominates Unreachable Block" lemma in the
dominators algorithm. And altough this theorem is valid wrt to the
graph theoretic definition of unreachable, which indeed implies that a
block is not reachable under no circumstances, in the binary analysis
an unreachable block is just a block that might be reachable we just
don't know by which paths. In other words, we have an
under-approximation of a control flow graph, and the unsoundness of
the graph is propagated to the unsoundness of the dominator tree
computation. That lies more or less in the vein with the idea of the
optimization-level=3 which is as as unsound as the underlying graph.
@ivg ivg requested a review from gitoleg December 11, 2018 15:49
ivg added a commit to ivg/bap that referenced this pull request Dec 11, 2018
note: this PR shall be merged after BinaryAnalysisPlatform#907 is merged
@gitoleg gitoleg merged commit 6ac0793 into BinaryAnalysisPlatform:master Dec 11, 2018
gitoleg pushed a commit that referenced this pull request Dec 11, 2018
* maked dead code elimination less conservative

As was noted in #905 the optimization passes sometimes misses an
opportunity to remove obvious assignments. The underlying reason
is that we compute a very conservative set of variables which are
used for interprocedural communication. And if a variable is a
member of this set then we do not delete it (under assumption, that
it could be used in other subroutines).

What was overconservative is that all versions of the same variable
were protected, while only the last definition one could be used for
interprocedural communication. The propsed change relaxes the
protected set, and removes only the last version of a variable from
the set of the dead variables.

Fixes #905

* makes it sound

Unfortunately, the original proposal was unsound, as the assumption
that only the last definition should be preserved is implied by the
assumption that only the last definition is used for interprocedural
communication, which is totally wrong.

The liveness property is propagated from each call (and indirect jump)
to all members of the protected set. We're using the SSA form for
def-use analysis which is inherently intraprocedural, so to preserve
the liveness of the variables with the interprocedural scope we use
the protected set as an overapproximation.  A tighter approximation
would be to reinstantiate their liveness after each call or indirect
jump. However, this would require tampering with the ssa algorithm,
that we don't want to do right now.

The updated solution reinstantiates liveness of protected variables
at the end of each block, even the block is not a call or even jump.
The liveness is killed by the first use of protection, so it is
propagated only to the last (if any) definition.

* removes a dead function

note: this PR shall be merged after #907 is merged

* drops debug info

* drops a change that is irrelevant to the proposal

to make it cleaner
@ivg ivg deleted the fixes-sub-free-vars-by-doms branch December 1, 2021 19:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants