feature: rewrite some jump tables #30

Open
wants to merge 3 commits into
base: master

@pnwamk pnwamk commented Mar 25, 2020

For some basic blocks Macaw can recognize that the terminal statement
(i.e., the Data.Macaw.Discovery.State.ParsedTermStmt) is a simple
jump table lookup, where one value acts as an index into a sequence
of addresses/targets; these terminal statements are signified by Macaw's
ParsedLookupTable constructor for the ParsedTermStmt type.

With the changes in this commit Renovate now checks for these kinds
of "lookup table jumps" and rewrites them into a series of
comparisons and direct jumps which can then easily be updated to target
rewritten blocks which have been moved by Renovate.

E.g., here is a sequence of instructions that calculates
an index (rax) which is then used to compute a jump address (rcx) and
perform an indirect jump into a known sequence of blocks:

  400125:       48 8b 45 c0             mov    -0x40(%rbp),%rax
  400129:       48 8b 0c c5 80 02 40    mov    0x400280(,%rax,8),%rcx
  400130:       00
  400131:       ff e1                   jmpq   *%rcx
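
As a rough illustration of what the second `mov` computes, here is a minimal Python sketch of the table lookup. The base `0x400280` and the scale `8` are read off the operand `0x400280(,%rax,8)` above; the table contents are not shown in this PR, so the block addresses in `example_memory` are purely hypothetical placeholders.

```python
TABLE_BASE = 0x400280  # displacement in `mov 0x400280(,%rax,8),%rcx`
ENTRY_SIZE = 8         # scale factor: each table entry is a 64-bit address


def table_slot_address(index: int) -> int:
    """Address of the table entry that the second mov reads for `index`."""
    return TABLE_BASE + index * ENTRY_SIZE


def lookup_target(memory: dict[int, int], index: int) -> int:
    """Value loaded into rcx, i.e. the address that `jmpq *%rcx` transfers to."""
    return memory[table_slot_address(index)]


# Hypothetical table contents (NOT taken from the binary in this PR).
example_memory = {
    table_slot_address(i): 0x400140 + 0x10 * i
    for i in range(5)
}

for i in range(5):
    print(f"index {i}: slot {table_slot_address(i):#x} "
          f"-> target {lookup_target(example_memory, i):#x}")
```

Because the jump target only exists as data in the table, moving the target blocks means either patching the table or, as this PR does, replacing the lookup with direct jumps.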

If Macaw identifies such a sequence as ending in a ParsedLookupTable
(in this case, rax is an index whose values [0..4] correspond
to known block start addresses), then it can now be rewritten as a sequence
of comparisons and jumps:

  61004b:       48 81 f8 00 00 00 00    cmp    $0x0,%rax
  610052:       0f 84 2c 00 00 00       je     610084 <__renovate_mod_5+0x84>
  610058:       48 81 f8 01 00 00 00    cmp    $0x1,%rax
  61005f:       0f 84 37 00 00 00       je     61009c <__renovate_mod_5+0x9c>
  610065:       48 81 f8 02 00 00 00    cmp    $0x2,%rax
  61006c:       0f 84 42 00 00 00       je     6100b4 <__renovate_mod_5+0xb4>
  610072:       48 81 f8 03 00 00 00    cmp    $0x3,%rax
  610079:       0f 84 4d 00 00 00       je     6100cc <__renovate_mod_5+0xcc>
  61007f:       e9 60 00 00 00          jmpq   6100e4 <__renovate_mod_5+0xe4>
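
To make the layout of the rewritten sequence concrete, here is a minimal Python sketch that reproduces the addresses and rel32 displacements in the listing above. It is not Renovate's implementation (which is written in Haskell); the function names are hypothetical, the instruction lengths are read off the encodings shown (7-byte `cmp`, 6-byte `je`, 5-byte `jmpq`), and it assumes the last table entry is reached by the trailing unconditional jump.

```python
CMP_LEN = 7  # 48 81 f8 imm32 : cmp $imm32,%rax
JE_LEN = 6   # 0f 84 rel32    : je rel32
JMP_LEN = 5  # e9 rel32       : jmpq rel32


def rel32(instr_addr: int, instr_len: int, target: int) -> int:
    """Displacement is relative to the address of the *next* instruction."""
    return target - (instr_addr + instr_len)


def build_chain(start: int, targets: list[int]):
    """Lay out cmp/je pairs for all but the last target, then a final jmpq.

    Returns a list of (address, text, displacement-or-None) tuples.
    """
    if not targets:
        raise ValueError("a jump table needs at least one target")
    chain = []
    addr = start
    for index, target in enumerate(targets[:-1]):
        chain.append((addr, f"cmp ${index:#x},%rax", None))
        addr += CMP_LEN
        chain.append((addr, f"je {target:x}", rel32(addr, JE_LEN, target)))
        addr += JE_LEN
    last = targets[-1]
    chain.append((addr, f"jmpq {last:x}", rel32(addr, JMP_LEN, last)))
    return chain


# Start address and targets copied from the listing above.
chain = build_chain(
    0x61004B, [0x610084, 0x61009C, 0x6100B4, 0x6100CC, 0x6100E4]
)
for addr, text, disp in chain:
    suffix = "" if disp is None else f"   (rel32 = {disp:#x})"
    print(f"{addr:x}: {text}{suffix}")
```

Since every target now appears as an explicit relative displacement, relocating a target block only requires recomputing `rel32` for the jumps that reference it.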

pnwamk commented Mar 25, 2020

Note: the only examples I've been able to run this on are the clang examples I added in this PR, e.g. test-switch-jump-table.clang.nostdlib.x86_64.exe

Currently the rewrite goes through, but the resulting binary has a subtle bug: the jumps that return from the blocks targeted by the jump table have incorrect addresses. See GaloisInc/flexdis86#18

pnwamk commented Mar 25, 2020

And by "the only examples I've been able to run this on" I mean clang was the only compiler whose output Macaw could correctly identify the jump table in. For the others, classification failed on the block terminator, i.e. the indirect jump through the jump table.

@pnwamk pnwamk force-pushed the feature/jump-tables branch from c82d79b to 3c8818b Compare March 25, 2020 18:56

pnwamk commented Mar 25, 2020

FYI, the force-push from c82d79b to 3c8818b was just typo fixes/rewording in the commit message.

@pnwamk pnwamk force-pushed the feature/jump-tables branch from 7537ecc to cc68928 Compare March 26, 2020 16:52