add ARM decoder/encoder internal and external tests #1686

Closed
derekbruening opened this issue Apr 15, 2015 · 2 comments · Fixed by #5163

Comments

@derekbruening
Contributor

Split from #1551, which we are considering complete. We have not yet ported the internal consistency tests (e.g., api.ir) or the external checks against other decoders (e.g., api.dis). This issue will also serve as a container for small bug fixes found while adding these tests.

@derekbruening
Contributor Author

Recording some capstone bugs (beyond corner-case areas in #1685) for reference for anyone else using it as a comparison:

+0x0540   00000000   and.eq %r0 %r0  -> %r0 
vs
0x00000540:  00000000   <invalid: errcode 0>
*** DONE OP_vtbl, OP_vld, OP_vst past d31 not considered invalid by capstone == capstone bug
    CLOSED: [2015-04-16 Thu 10:33]
    - State "DONE"       from "TODO"       [2015-04-16 Thu 10:33]

Here capstone appears to overflow, likely indexing past the end of its dNN
register-name string array into other register names:

0x00011c5a:  ffbf fa8d   vtbl.8 d15, {d31, fpinst2, mvfr0}, d13 
    0x00011c5a:   ffbf fa8d  <INVALID>

if n+length > 32 then UNPREDICTABLE;


Similarly (look at the wraparound here):
0x0003e7b0:  f940 a159   vst4eq.16  {d26, d28, d30, d0}, [r0:0x40], r9 
    0x0003e7b0:   f940 a159  <INVALID>

if n == 15 || d4 > 31 then UNPREDICTABLE;
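The `d4 > 31` condition can be checked mechanically. The following is a minimal sketch, not DR or capstone code: the function name is hypothetical, and the field positions (D in bit 6 and Rn in bits 3:0 of the first halfword, Vd in bits 15:12 and type in bits 11:8 of the second) are my reading of the T32 VLD4/VST4 "multiple 4-element structures" layout, so treat them as an assumption.

```python
def vldst4_multiple_regs(hw1: int, hw2: int):
    """Return (register_numbers, unpredictable) for a T32 VLD4/VST4
    (multiple 4-element structures) encoding given as two halfwords.

    Field positions are an assumption, not taken from the issue text.
    """
    d_bit = (hw1 >> 6) & 1      # D: high bit of the first register number
    rn = hw1 & 0xF              # base register
    vd = (hw2 >> 12) & 0xF      # low four bits of the first register number
    typ = (hw2 >> 8) & 0xF      # 0b0000 -> consecutive, 0b0001 -> every other
    if typ not in (0b0000, 0b0001):
        raise ValueError("type field is not a 4-element multiple-structure form")
    inc = 1 if typ == 0b0000 else 2
    d = (d_bit << 4) | vd
    regs = [d + i * inc for i in range(4)]   # d, d2, d3, d4 without wraparound
    unpredictable = rn == 15 or regs[3] > 31
    return regs, unpredictable


# The vst4 example above: f940 a159
regs, bad = vldst4_multiple_regs(0xF940, 0xA159)
```

For `f940 a159` the unwrapped list ends at register number 32, which is the value capstone prints as `d0` after wrapping.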

Similar:
0x000591ca:  f967 f100   vld4.8 {d31, d1, d3, d5}, [r7], r0 
    0x000591ca:   f967 f100  <INVALID>

0x0005e358:  f964 f235   vld1.8 {d31, fpinst2, mvfr0, mvfr1}, [r4:0x100], r5 
    0x0005e358:   f964 f235  <INVALID>

0x0006b210:  f4efde21   vld3.8  {d29[], d31[], d1[]}, [pc], r1 
    +0x6b210:   f4efde21   <INVALID>
*** INFO capstone bug: 0xcd00 incorrectly considered 1st half of 32-bit instr

capstone says:
  0x00007cda:  cd00 4a71   vstr s8, [r0, #-0x1c4] 

DR:
    0x00007cda:   cd00       ldm    (%r5) %r5 -> %r5 
    0x00007cdc:   4a71       ldr    +0x000001c4(%pc)[4byte] -> %r2 

# echo ' 0x00 0xcd ' | /usr/bin/llvm-mc -arch thumb --disassemble
        .text
<stdin>:1:2: warning: invalid instruction encoding
 0x00 0xcd 

I had to add a workaround to my capstone front-end to get the size right.
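For reference, a Thumb halfword starts a 32-bit instruction only when its top five bits are 0b11101, 0b11110, or 0b11111. The sketch below illustrates that rule; it is not the actual workaround used in the front-end, and the function name is hypothetical.

```python
def thumb_instr_size(first_halfword: int) -> int:
    """Return the instruction length in bytes implied by the first halfword."""
    top5 = (first_halfword >> 11) & 0x1F
    return 4 if top5 in (0b11101, 0b11110, 0b11111) else 2


# 0xcd00 has top bits 0b11001, so it is a complete 16-bit instruction.
size = thumb_instr_size(0xCD00)
```

A front-end could compare this size against what the external decoder reports and re-decode the second halfword separately when they disagree.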
*** DONE capstone bug: OP_msr and OP_mrs privileged forms not listing regs for capstone
    CLOSED: [2015-04-16 Thu 14:28]
    - State "DONE"       from "TODO"       [2015-04-16 Thu 14:28]

  0x00126048:   3167f30b   msr.cc $0x17 %r11 -> %spsr 
  0x00126048:  3167f30b   msrlo sp_fiq, #3 

    0x000fa4e4:   8167f202   msr.hi $0x07 %r2 -> %spsr 
0x000fa4e4:  8167f202   msrlo   r12_usr, #8 

    0x00024638:   4129f207   msr.mi $0x09 %r7 -> %cpsr 
0x00024638:  4129f207   msrmi   r9_fiq, r7 

    +0x12234:   e163f205   msr    $0x03 %r5 -> %spsr 
0x00012234:  e163f205   msreq   , #0xe 

Register form of OP_msr is 0x.1{6,2} (f)(0)0 Rn w/ bit9 as hard 0.
Banked system form has bit9 as 1 => 0x2..
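The bit-9 distinction can be sketched as a small classifier. This is an illustration under assumptions, not DR's decoder: the function name is hypothetical, the match mask is my own derivation for the A32 MSR register/banked encodings, and the R (bit 22), M (bit 8), and M1 (bits 19:16) positions are my reading of the encoding diagram.

```python
def classify_a32_msr(word: int):
    """Classify an A32 word as a register-form or banked-form MSR.

    Returns ("register", R, mask) or ("banked", R, SYSm), or None if the
    word does not match the assumed MSR pattern.
    """
    # Assumed pattern: bits 27:23 == 00010, bits 21:20 == 10, bits 7:4 == 0000.
    if (word & 0x0FB000F0) != 0x01200000:
        return None
    r = (word >> 22) & 1                  # 0 -> CPSR side, 1 -> SPSR side
    field = (word >> 16) & 0xF            # mask (register form) or M1 (banked form)
    if (word >> 9) & 1:                   # bit 9 set -> banked form
        m = (word >> 8) & 1
        return ("banked", r, (m << 4) | field)
    return ("register", r, field)


result = classify_a32_msr(0x3167F30B)
```

Under these assumptions the four MSR words listed above all classify as the banked form, with the SYSm values matching the immediates DR prints.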

Not sure why llvm thinks they're invalid as the manual only talks about n==15:
# echo ' 0x0b 0xf3 0x67 0x31 ' | /usr/bin/llvm-mc -arch arm --disassemble
<stdin>:1:2: warning: invalid instruction encoding
 0x0b 0xf3 0x67 0x31 
# echo '0x05 0xf2 0x63 0xe1' | /usr/bin/llvm-mc -arch arm -mcpu=cortex-a15 --disassemble  
        .text
<stdin>:1:1: warning: invalid instruction encoding

OK, BankedRegisterAccessValid() has some "unpredictable" cases.

But even passing 0 for the M:M1:
# echo ' 0x0b 0xf2 0x60 0xe1 ' | /usr/bin/llvm-mc -arch arm -mcpu=cortex-a15 --disassemble
        .text
<stdin>:1:2: warning: invalid instruction encoding
 0x0b 0xf2 0x60 0xe1 
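The byte lists passed to llvm-mc in these commands are the instruction words in little-endian order. A small helper (hypothetical name, just a convenience sketch) can produce them from the hex words shown in the disassembly listings:

```python
def llvm_mc_bytes(word: int) -> str:
    """Format a 32-bit A32 instruction word as little-endian hex bytes."""
    return " ".join(f"0x{b:02x}" for b in word.to_bytes(4, "little"))


line = llvm_mc_bytes(0xE160F20B)
```

The resulting string can be echoed into `llvm-mc -arch arm --disassemble` as in the commands above.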

And I can't get capstone to list the reg:
# echo ' 0x0b 0xf2 0x60 0xe1 ' | /extsw/pkgs/disasm/capstone/build/capstone -arm -
0x00000000:  e160f20b   msreq   sp_fiq, #0xe

(gdb) set {unsigned char[400]}0x04311a00 = { 0x0b, 0xf2, 0x60, 0xe1 }
(gdb) x/2i 0x04311a00
   0x4311a00:   msr     (UNDEF: 96), r11

Similar for mrs:
    0x00005194:   114a0300   mrs.ne %spsr $0x1a -> %r0 
0x00005194:  114a0300   mrslo   r0, r9_usr 

@aquynh

aquynh commented Apr 17, 2015

Capstone reuses LLVM code, which is why you see similar results for the ARM architecture.

Regarding bugs, please consider opening issues at https://github.com/aquynh/capstone/issues.
The other option is to report disassembler issues directly to LLVM, and we will port the fixes back to Capstone.

Thanks.

derekbruening added a commit that referenced this issue Jun 1, 2017
Adds three new cross-platform instruction creation macros,
XINST_CREATE_add_sll(), XINST_CREATE_jump_cond() and XINST_CREATE_slr_s(),
for use in drcachesim.

For XINST_CREATE_jump_cond(), we add aliases so that the same DR_PRED_*
constants can be used for x86 as are used on aarchxx.

Adds x86 tests.  The infrastructure for easily adding ARM (#1686) and
AArch64 (#2443) tests is still missing, unfortunately.
egrimley pushed a commit that referenced this issue Jul 3, 2017
Updated log message:

i#2465 A32 decoder: Add some missing SIMD encodings.

Replace some "INVALID" lines in A32_ext_simd8 with missing encodings:
OP_vbic_{i16,i32} and OP_vmov_f32.

Also correct the opcodes in A32_ext_bit19.

Xref #1686.

Change-Id: I150ddc01484a7cbf5a866d8ab40940ebe7a9311c
egrimley pushed a commit that referenced this issue Jul 4, 2017
Replace some "INVALID" lines in A32_ext_simd8 with missing encodings:
OP_vbic_{i16,i32} and OP_vmov_f32.

Also correct the opcodes in A32_ext_bit19.

Xref #1686.
derekbruening added a commit that referenced this issue Oct 14, 2021
Adds missing required-1 bits in the ARM encoding table entries for
OP_blx, OP_bx, and OP_bxj.  Without the bits, some hardware still
accepts the instructions (which is why we did not notice the problem
before), but they are technically unsound, and QEMU thinks they are
invalid, breaking some of our tests under QEMU.

Tested on QEMU with the forthcoming #2414 drwrap-drreg-test,
and directly with several other decoders:
  Prior encoding for "blx r11":
    <stdin>:1:1: warning: invalid instruction encoding
    0x3b 0x00 0x20 0xe1
    ^
    llvm-mc:   e120003b
    capstone:  e120003b <INVALID: errcode 0>
    bfd:       e120003b ; <UNDEFINED> instruction: 0xe120003b
  New encoding:
    $ disasm_a32 e12fff3b
    llvm-mc:   e12fff3b blx r11
    capstone:  e12fff3b blx r11
    bfd:       e12fff3b blx fp

Setting up more external-decoder testing is beyond the scope of this
fix: #1686 covers that.

Issue: #4719, #1686, #2414
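The two "blx r11" encodings in the commit message above differ only in bits 19:8 (e120003b vs e12fff3b). A minimal check for those required-1 bits could look like the sketch below; the mask is inferred from that comparison and the names are hypothetical, so this is not the encoding-table change itself.

```python
# Bits 19:8, inferred from comparing e120003b with e12fff3b.
REQUIRED_ONE_MASK = 0x000FFF00


def has_required_one_bits(word: int) -> bool:
    """True if every bit in the required-1 field is set."""
    return (word & REQUIRED_ONE_MASK) == REQUIRED_ONE_MASK


def with_required_one_bits(word: int) -> int:
    """Return the word with the required-1 field filled in."""
    return word | REQUIRED_ONE_MASK


fixed = with_required_one_bits(0xE120003B)
```

A test harness could run such a check over encoder output before handing it to an external decoder.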
derekbruening added a commit that referenced this issue Oct 15, 2021
Adds missing required-1 bits in the ARM encoding table entries for
OP_blx, OP_bx, and OP_bxj.  Without the bits, some hardware still
accepts the instructions (which is why we did not notice the problem
before), but they are technically unsound, and QEMU thinks they are
invalid, breaking some of our tests under QEMU.

Tested on QEMU with the forthcoming #2414 drwrap-drreg-test,
and directly with several other decoders:
  Prior encoding for "blx r11":
    <stdin>:1:1: warning: invalid instruction encoding
    0x3b 0x00 0x20 0xe1
    ^
    llvm-mc:   e120003b
    capstone:  e120003b <INVALID: errcode 0>
    bfd:       e120003b ; <UNDEFINED> instruction: 0xe120003b
  New encoding:
    $ disasm_a32 e12fff3b
    llvm-mc:   e12fff3b blx r11
    capstone:  e12fff3b blx r11
    bfd:       e12fff3b blx fp

Setting up more external-decoder testing is beyond the scope of this
fix: #1686 covers that.

Issue: #4719, #1686, #2414