Fixes lifting disassembler bug for Pcmp #1222

percontation · 2020-09-21T12:23:55Z

There's a decoding error in some Pcmp instructions, and the same type error in Pcmp & Ppackedbinop semantics (left-shift of a vector-register-sized operand by an 8-bit shift value).

percontation · 2020-09-21T12:38:07Z

(Am I correct in thinking bil wants both sides of shifts to have the same type? If that has changed, it could just be that I'm working with older code that wants to make that assumption. Either way, the pcmp decoder error fixed here is definitely a problem.)

ivg · 2020-09-21T13:44:23Z

Am I correct in thinking bil wants both sides of shifts to have the same type?

Nope, they can have different types, no problem here.

ivg

Could you please remove the .merlin files and provide some references that identify what "some bugs" you're fixing? E.g., what is the expected behavior and how the current behavior differs from it, and how it is reflected in the ISA manual.

lib/.merlin

plugins/x86/x86_lifter.ml

ivg · 2020-09-22T13:25:36Z

plugins/x86/x86_disasm.ml

@@ -999,9 +999,9 @@ let parse_instr mode mem addr =
          (Punpck(prefix.mopsize, elemt, order, r, rm, rv), na)
        | 0x64 | 0x65 | 0x66 | 0x74 | 0x75 | 0x76  as o ->
          let r, rm, rv, na = parse_modrm_vec None na in
-          let elet = match o land 0x6 with | 0x4 -> reg8_t | 0x5 -> reg16_t | 0x6 -> reg32_t | _ ->
+          let elet = match o land 0x0F with | 0x4 -> reg8_t | 0x5 -> reg16_t | 0x6 -> reg32_t | _ ->


any references or explanations?

it was a typo, this is what the author meant.

(you can check the behavior against http://ref.x86asm.net/coder64.html ; pretty clearly the case that leaves elet=reg16_t isn't supposed to be dead code :P )

percontation · 2020-09-23T19:47:24Z

Am I correct in thinking bil wants both sides of shifts to have the same type?

Nope, they can have different types, no problem here.

Ah, ok. I'll note that I think this type difference is causing primus to send our code stuff like extract(hi=31, lo=0, <8-bit constant>). Should it always be the case that over-extraction is interpreted as "zero pad to recover non-existent bits"?

There's a typo in the disassembler.

ivg · 2020-09-24T15:32:11Z

Thanks!

ivg · 2020-09-24T15:54:32Z

Am I correct in thinking bil wants both sides of shifts to have the same type?

Nope, they can have different types, no problem here.

Ah, ok. I'll note that I think this type difference is causing primus to send our code stuff like extract(hi=31, lo=0, <8-bit constant>).

Yes, since 2.1.0 we relaxed the type system of the Primus interpreter and it now permits binary operations, including the arithmetic operations, on operands with different sizes. To enable this, the interpreter coerces operands to the same bitwidth. We apply this coercion to all operations, even those that permit operands of different sizes, which is probably too conservative and results in unnecessary operations such as extract(hi=31, lo=0, <8-bit constant>) that you mentioned. If those bogus coercions introduce any issues on your side or bother you, I can remove them real quick, no problem here. Do you want to make this coercions more precise?

Should it always be the case that over-extraction is interpreted as "zero pad to recover non-existent bits"?

Yes, if the resulting width is larger than the original it is the same as zero-padding, e.g.,

# #install_printer Word.pp_hex_full;;
# x;;
- : word = 0xA5:8u
# Word.extract_exn ~hi:31 ~lo:0 x;;
- : word = 0xA5:32u

percontation · 2020-09-25T07:28:14Z

👍 Thanks for the info.

In our particular case, removing the "bogus" coercions would actually just cause errors elsewhere in our code, because our expressions use traditional width-strictness in bitvec operands (and the code assumes that Primus is as strict as it was previously).

So what I'd suggest, is that Primus be updated to send the "correct" coercion (cast or extract), rather than omitting them. An approach like that would allow Primus to avoid breaking the API, even if operator definitions are relaxed/otherwise expanded in the future.

ivg · 2020-09-30T19:05:05Z

+1 Thanks for the info.

In our particular case, removing the "bogus" coercions would actually just cause errors elsewhere in our code, because our expressions use traditional width-strictness in bitvec operands (and the code assumes that Primus is as strict as it was previously).

I am not saying about removing all coercions, just removing coercions for shifts, which are not necessary from the perspective of the BIL type system, but indeed are necessary from the perspective of SMTLIB, where shifts are having sort (op2 (_ BitVec m) (_ BitVec m) (_ BitVec m)), so indeed there is some impedance mismatch here.

So what I'd suggest, is that Primus be updated to send the "correct" coercion (cast or extract), rather than omitting them. An approach like that would allow Primus to avoid breaking the API, even if operator definitions are relaxed/otherwise expanded in the future.

In other words, instead of the swiss-knife extract you want to have either cast unsigned m x when we zero extend and extract m 0 x when we do the narrow cast?

I can do this, but here is the caveat. Even if Primus will begin to emit more precise coercions it doesn't guarantee that someone else will not be using extract for zero extending. In fact, all our lifters do this and do this a lot. So anyways you have to handle extracts correctly, because this is the semantics of the extract operation in BIL. For the reference, here is how extract shall be lifted to Z3,

  module Bitv = Z3.BitVector

  let extract hi lo x =
    let xs = Bitv.get_size (Expr.get_sort x)
    and ns = hi-lo+1 in
    if ns > xs
    then if lo = 0
      then Bitv.mk_zero_ext ctxt (ns-xs) x
      else
        Bitv.mk_extract ctxt hi lo @@
        Bitv.mk_zero_ext ctxt (ns-xs) x
    else
      Bitv.mk_extract ctxt hi lo x

See also our symbolic executor formula module that maps Primus observations to Z3 formulae, which we should probably make public.

And sorry for the late reply, I missed somehow the notification :(

percontation force-pushed the pcmp_fix branch from 09273fe to 2340acc Compare September 21, 2020 13:17

percontation changed the title ~~Fixes some bugs in the pcmp semantics~~ Fixes some bugs in the Pcmp & Ppackedbinop semantics Sep 21, 2020

ivg requested changes Sep 21, 2020

View reviewed changes

lib/.merlin Outdated Show resolved Hide resolved

ivg reviewed Sep 22, 2020

View reviewed changes

plugins/x86/x86_lifter.ml Outdated Show resolved Hide resolved

ivg reviewed Sep 22, 2020

View reviewed changes

Fix Pcmp instruction

183e9e7

There's a typo in the disassembler.

percontation force-pushed the pcmp_fix branch from 2340acc to 183e9e7 Compare September 23, 2020 19:57

percontation changed the title ~~Fixes some bugs in the Pcmp & Ppackedbinop semantics~~ Fixes lifting disassembler bug for Pcmp Sep 23, 2020

ivg approved these changes Sep 24, 2020

View reviewed changes

ivg merged commit 4eb72ae into BinaryAnalysisPlatform:master Sep 24, 2020

percontation deleted the pcmp_fix branch November 18, 2020 03:36

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fixes lifting disassembler bug for Pcmp #1222

Fixes lifting disassembler bug for Pcmp #1222

percontation commented Sep 21, 2020 •

edited

Loading

percontation commented Sep 21, 2020 •

edited

Loading

ivg commented Sep 21, 2020

ivg left a comment

ivg Sep 22, 2020

percontation Sep 23, 2020

percontation Sep 23, 2020

percontation commented Sep 23, 2020

ivg commented Sep 24, 2020

ivg commented Sep 24, 2020

percontation commented Sep 25, 2020 •

edited

Loading

ivg commented Sep 30, 2020

Fixes lifting disassembler bug for Pcmp #1222

Fixes lifting disassembler bug for Pcmp #1222

Conversation

percontation commented Sep 21, 2020 • edited Loading

percontation commented Sep 21, 2020 • edited Loading

ivg commented Sep 21, 2020

ivg left a comment

Choose a reason for hiding this comment

ivg Sep 22, 2020

Choose a reason for hiding this comment

percontation Sep 23, 2020

Choose a reason for hiding this comment

percontation Sep 23, 2020

Choose a reason for hiding this comment

percontation commented Sep 23, 2020

ivg commented Sep 24, 2020

ivg commented Sep 24, 2020

percontation commented Sep 25, 2020 • edited Loading

ivg commented Sep 30, 2020

percontation commented Sep 21, 2020 •

edited

Loading

percontation commented Sep 21, 2020 •

edited

Loading

percontation commented Sep 25, 2020 •

edited

Loading