preserves brancher information in the BIL code of an instruction #914
lib/bap_disasm/bap_disasm_rec.ml
match stage1.lift mem ins with
| Ok bil ->
  let bil =
    if has_jump bil then
It should not be required for an instruction to have a jump, so we should remove this check and ensure that the BIL code always contains the sum of the destinations returned by the lifter and by the brancher.
2760b17 to 26f5a01
ce6e6c2 to cfb49e4
Good work, but could you please simplify the code generated in the non-deterministic case?
lib/bap_disasm/bap_disasm_rec.ml
@@ -299,6 +299,46 @@ let create_indexes (dests : dests Addr.Table.t) =

let filter_valid s = {s with inits = Set.inter s.inits s.valid}

let join_destinations dests =
I think we can stick to a simpler way of spraying the destinations in the nondeterministic case and adopt the following scheme:
if (unk) jmp d1;
...
if (unk) jmp dm;
jmp d;
where d is the original BIL destination and d1,...,dm is the set of destinations provided by the brancher (alternatively, you can just pick an arbitrary d from the whole set; it doesn't really matter).
This code will produce graphs that are simple (ideally, it should map to a single block with multiple when-guarded jumps), while your version requires some sophistication during graph generation and, with a naive approach, will spill a graph that is quadratic in the number of destinations (which, if we retarget VSA here, could be a very big number).
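A minimal sketch of the scheme suggested above, written against a toy statement type rather than BAP's actual Bil types. The names exp, stmt, spray, pp_exp, and pp_stmt are hypothetical and exist only for illustration; the point is that the output grows linearly, with one guarded jump per brancher destination plus the final unconditional jump.

```ocaml
(* Toy model of jump statements. This is NOT BAP's Bil type;
   every name in this block is an assumption made for illustration. *)
type exp =
  | Unknown                (* stands for the unknown expression *)
  | Addr of int            (* a concrete destination *)
  | Indirect of string     (* an indirect destination, e.g. a register *)

type stmt =
  | Jmp of exp             (* jmp dst *)
  | If_jmp of exp * exp    (* if (cond) jmp dst *)

(* Emit [if (unk) jmp d1; ...; if (unk) jmp dm; jmp d], where [d] is the
   original destination and [brancher] holds d1..dm.
   The result has exactly m + 1 statements. *)
let spray (d : exp) (brancher : exp list) : stmt list =
  List.map (fun di -> If_jmp (Unknown, di)) brancher @ [Jmp d]

let pp_exp = function
  | Unknown -> "unk"
  | Addr a -> Printf.sprintf "0x%x" a
  | Indirect r -> r

let pp_stmt = function
  | Jmp d -> Printf.sprintf "jmp %s;" (pp_exp d)
  | If_jmp (c, d) -> Printf.sprintf "if (%s) jmp %s;" (pp_exp c) (pp_exp d)

(* Usage: an indirect jump with two destinations from the brancher. *)
let () =
  spray (Indirect "RAX") [Addr 0x400; Addr 0x410]
  |> List.iter (fun s -> print_endline (pp_stmt s))
```

For m brancher destinations this yields m + 1 statements, which is what keeps the resulting graph a single block with guarded jumps instead of a chain of nested conditionals.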
Introduction
During program reconstruction we rely on several sources of information, such as the lifter and the brancher. However, the information obtained from the brancher is not stored, which leads to a mismatch later. This PR preserves the information provided by the brancher in the BIL code of an instruction. Ideally, the brancher should operate on the BIL level and fix the BIL itself, but we have what we have.
Design
We have two sources of information: D, the set of destinations from the lifter, and D', the set of destinations from the brancher. Both are equally credible for us, but they may provide conflicting information. Our task is to generate BIL code with a set of destinations R that preserves the existing destinations (completeness) and does not introduce any false destinations (soundness), or formally R = D + D', where + is set union. This requirement is not very strong, as it doesn't quantify over branch predicates, so this transformation might introduce extra (and sometimes unnecessary) non-determinism. However, we're trying to limit it as much as possible without making the code too complicated. In particular, we ensure that if D' is a subset of D (i.e., there is no new information from the brancher), then we don't touch the code.
Implementation
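As a rough illustration of the requirement R = D + D' and of the case where the code is left untouched, here is a minimal sketch over plain integer addresses. The module Dests and the function join are hypothetical names, not the PR's join_destinations; the actual code works on BAP's own destination types.

```ocaml
(* Destinations modelled as a set of integer addresses.
   This is an assumption for illustration, not BAP's representation. *)
module Dests = Set.Make (struct
    type t = int
    let compare = compare
  end)

(* [join lifter brancher] returns [None] when the brancher brings no new
   information (D' is a subset of D), meaning the BIL should be left
   untouched. Otherwise it returns [Some r] where r is the union of D
   and D', so no destination is lost and none is invented. *)
let join (lifter : Dests.t) (brancher : Dests.t) : Dests.t option =
  if Dests.subset brancher lifter then None
  else Some (Dests.union lifter brancher)
```

Returning an option makes the no-op case explicit to the caller, which only needs to rewrite the BIL when it gets Some r.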
If we have conflicting information between the brancher and the lifter, we resolve it using a conditional branch predicated with the unknown expression.
If there is no conflict, for example, when the lifter tells us to jump to an indirect destination while the brancher gives some concrete destinations [d1,d2,...,dm], we emit the following code:
References
Resolves: #899
Previous attempt: #911