Commit

[Capstone2llvmir] Update to Capstone V5 (#1059)
* Update Capstone to v4.0

* [Capstone-next] Update to capstone-next branch

* [Capstone-next] Update to Capstone-Next Branch
-[ARM]
    -Add ARM_INS_MOVS support
-[ARM64]
    -Remove vess.
        -It overlaps with ARM64_VAS
    -Fix A64SysReg_* into ARM64_SYSREG_*
-[PowerPC]
    -Fix PPC_REG_X2 into PPC_REG_XER
-[X86]
    -Remove X86_INS_FADDP
        -In capstone-next, faddp is actually fadd, both belong to
            "ID 15(fadd)"

* [tests][capstone2llvmir][arm] Fix MOVW Unit Test
- In the test, "movw r0, #0xabcd" does not read any register,
    and the result is 0xabcd, not 0x1234abcd
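The expected value follows from the instruction semantics: MOVW writes the zero-extended 16-bit immediate to the whole register, while MOVT replaces only the upper halfword. A minimal sketch of that behaviour, with illustrative function names that are not part of RetDec:

```cpp
#include <cstdint>

// MOVW Rd, #imm16: the destination becomes the zero-extended immediate.
// The previous register value is ignored (assumed model, not RetDec code).
constexpr std::uint32_t movw(std::uint32_t /*oldRd*/, std::uint16_t imm16)
{
    return imm16;
}

// MOVT Rd, #imm16: only the upper halfword is replaced, the lower one is kept.
constexpr std::uint32_t movt(std::uint32_t oldRd, std::uint16_t imm16)
{
    return (static_cast<std::uint32_t>(imm16) << 16) | (oldRd & 0xFFFFu);
}
```

A result such as 0x1234abcd would only appear after a MOVT that sets the upper halfword.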

* [tests][capstone2llvmir][arm] Fix Nop test
- On ARM, the NOP instruction is a HINT instruction
- Accordingly, in Capstone, the cs_insn->id of nop points to
    HINT (ID: 63)
- So an error occurs when looking up the translation method for the
    instruction, because the lookup yields nullptr
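The failure mode can be illustrated with a small dispatch table. This is a sketch under assumptions: the enum values, the `Translator` type and the choice to map HINT to a NOP handler are hypothetical and only mirror the shape of an instruction-ID-to-method map with `nullptr` entries.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for Capstone instruction IDs; the real values differ.
enum InsnId { INS_MOV, INS_HINT, INS_MRC };

struct Translator
{
    using Handler = void (Translator::*)();

    std::string log;

    void translateMov() { log += "mov;"; }
    void translateNop() { log += "nop;"; }

    // Instruction ID -> translation method. A nullptr entry means the ID
    // is known but has no translation routine.
    std::map<InsnId, Handler> table{
        {INS_MOV, &Translator::translateMov},
        {INS_HINT, &Translator::translateNop}, // NOP is reported as HINT
        {INS_MRC, nullptr},
    };

    void translate(InsnId id)
    {
        auto it = table.find(id);
        if (it == table.end() || it->second == nullptr)
        {
            // Guard against calling through a null member pointer.
            throw std::runtime_error("no translation routine for instruction");
        }
        (this->*(it->second))();
    }
};
```

Checking for `nullptr` before the call turns an invalid call through a null member pointer into a reportable error.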

* [Capstone2llvmir][arm64] Add ADDCS Support

* [capstone2llvmir][arm64] Add ADDS Support

* [capstone2llvmir][arm64] Add ANDS Support

* [capstone2llvmir][arm64] Add SUP Support

* [capstone2llvmir][arm64] Add BICS Support

* [capstone2llvmir][PowerPC] Update Register Name

* [capstone2llvmir][PowerPC] Update Register Name

* [capstone2llvmir][PowerPC] Fix CMP Support

* [capstone2llvmir][PowerPC] Add CMPL Support

* [capstone2llvmir][PowerPC] Fix CMPL

* [capstone2llvmir][PowerPC] Add BLT Support

* [capstone2llvmir][PowerPC] Add support for branch mnemonics
incorporating conditions

* [capstone2llvmir][PowerPC] Fix RLWINM
- RLWINM and clrlwi share the same ID
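For context, clrlwi is an extended mnemonic of rlwinm: `clrlwi rA, rS, n` is `rlwinm rA, rS, 0, n, 31`. A minimal sketch of the rotate-and-mask semantics, with helper names chosen for illustration only (`mb` and `me` must be in the range 0..31):

```cpp
#include <cstdint>

// Rotate a 32-bit value left by sh bits (sh is taken modulo 32).
inline std::uint32_t rotl32(std::uint32_t x, unsigned sh)
{
    sh &= 31u;
    return sh == 0 ? x : (x << sh) | (x >> (32u - sh));
}

// Ones from bit mb through bit me in PowerPC numbering, where bit 0 is
// the most significant bit. If mb > me, the mask wraps around.
inline std::uint32_t mask32(unsigned mb, unsigned me)
{
    const std::uint32_t fromMb = 0xFFFFFFFFu >> mb;         // bits mb..31
    const std::uint32_t upToMe = 0xFFFFFFFFu << (31u - me); // bits 0..me
    return mb <= me ? (fromMb & upToMe) : (fromMb | upToMe);
}

// rlwinm: rotate left, then AND with the mask.
inline std::uint32_t rlwinm(std::uint32_t rs, unsigned sh, unsigned mb, unsigned me)
{
    return rotl32(rs, sh) & mask32(mb, me);
}

// clrlwi: clear the n most significant bits.
inline std::uint32_t clrlwi(std::uint32_t rs, unsigned n)
{
    return rlwinm(rs, 0, n, 31);
}
```

Because both mnemonics describe the same operation, a translator that handles rlwinm in full generality also covers clrlwi.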

* [tests][capstone2llvmir][PowerPC] Fix Crand Tests

* [capstone2llvmir][PowerPC] Fix bdzla BUG

* [capstone2llvmir][PowerPC] Remove BDZLA TODO

* [capstone2llvmir][x86] Fix ud2b

* [capstone2llvmir][X86] Fix FADD/FADDP

* [capstone2llvmir][x86] Fix FADD/FADDP

* [capstone2llvmir][x86] Fix FXCH
- When translating the FXCH instruction, "top" is equal to idx in the
    loadOpFloatingBinaryTop function, which causes the value to be written
    to top twice when exchanging data.

* clean code

* Update Capstone to v5.0

* [capstone2llvmir][x86][PowerPC] Clean code

* [capstone2llvmir][PowerPC] Clean code

* [capstone2llvmir][PowerPC] Remove BUN* and BNU*
- In Capstone V5, they are equivalent to BSO* and BNS*, respectively

* [capstone2llvmir][PowerPC] Fix rlwinm
- In Capstone V5, rlwinm is equivalent to clrlwi

* [capstone2llvmir][PowerPC] Fix BNL*

* [capstone2llvmir][PowerPC] Add PPC_REG_ZERO

* [capstone2llvmir][PowerPC] Add comment

* Fix merge conflict

* Update YARA to 4.2.X

* Add dll_name from export directory to output

* llvm/CMakeLists: Manually-specified variables were not used by the project.

The following variables were set in CMakeLists, however, they
were not used by the LLVM project build:

LLVM_USE_CRT_DEBUG
LLVM_USE_CRT_RELEASE

* CHANGELOG.md: add entries for #1060 #1061 PRs

* Fixed loading import directory that is modified by relocations

* Fixed comment

* Remove useless trailing whitespace

There is absolutely no reason for it being in the code.

* pelib: Fix a typo in a comment in PeLib::ImageLoader::Load()

* Add a CHANGELOG entry for #1063

* Move signing certificate to separate object

* Updated authenticode parser to the newest version

* Fix uninitialized free, use finer sanity checks in auth. parser

* Add a directory for RetDec-related publications

The list of publications was originally hosted at
https://retdec.com/publications/ (https://retdec.com/ has been redirected
to https://github.com/avast/retdec, and we wanted to keep the list somewhere).

* Fix the wording for an invalid max-memory error in scripts/retdec-unpacker.py

There are the following two reasons for the fix:
- The check only verifies whether the passed value is an integer.
- The parameter can be 0 (i.e. a non-negative integer). It does not have to be a
  positive integer.

* Never try to limit memory on macOS

We can't limit memory on macOS. Before macOS 12,
limitSystemMemoryOnPOSIX() did not actually do anything on macOS; it
simply succeeded. Since macOS 12, it returns an error and RetDec
can't start.

To be honest, Apple can control the memory limit via the so-called
ledger() system call, which is private. An old version that was released
as open source (from 10.9-10.10?) used setrlimit(), but at some point
setrlimit() was broken while ledger() was not. Probably in macOS 12,
setrlimit() was completely broken.

Because we have no other choice, just return true, which changes
nothing.

See: #379
Fixes: #1045
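
The approach can be sketched as follows. This is an illustration, not the RetDec implementation: `limitMemorySketch` is a hypothetical helper, and the use of `RLIMIT_AS` is an assumption about which resource is limited.

```cpp
#include <sys/resource.h>

// Hypothetical helper: tries to cap the address space of the current
// process at `bytes` (soft limit only).
bool limitMemorySketch(rlim_t bytes)
{
#ifdef __APPLE__
    // Limiting memory is not possible here, so do nothing and report success.
    (void)bytes;
    return true;
#else
    rlimit lim{};
    if (getrlimit(RLIMIT_AS, &lim) != 0)
    {
        return false;
    }
    lim.rlim_cur = bytes; // the hard limit stays as it was
    return setrlimit(RLIMIT_AS, &lim) == 0;
#endif
}
```

Returning true on macOS keeps callers that treat a failed limit as fatal working, at the cost of silently running without a limit.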

* Remove a redundant period from CHANGELOG

* utils: Improve the wording of a comment in getTotalSystemMemoryOnMacOS()

* Add a CHANGELOG entry for #1074 and #1045

* Update authenticode-parser, use-after-free, signedness issues

* Using multistage build for Dockerfile, reduces container size by ~1.5G

* Check for possible overflow when checking for segment overlaps. Fix incorrect range exception message

* Fix parameter and return types for dynamically called functions

Calls to dynamically-linked functions go through the procedure linkage
table (PLT).  RetDec turns a PLT entry into a function, say
malloc@plt, that appears to do nothing but call the external function,
say malloc (though the assembly code will do a jump rather than a
call). User code that logically wants to call malloc instead calls
malloc@plt (and sets up arguments as if calling malloc). The
malloc@plt code first jumps to the dynamic linker which modifies it so
that subsequent calls to malloc@plt will jump directly to malloc. We
say that malloc@plt wraps malloc.  The call to malloc in malloc@plt
will not have any arguments set up, so malloc will appear to have
no parameters or returns (unless that information is provided by
link-time-information, debug information, or name demangling), but it
needs to have the same parameter types and return type as
malloc@plt. The propagateWrapped methods copy the argument information
from the DataFlowEntry of the wrapping function to the wrapped
function. Then, when the calls to the wrapping function are inlined
(in connectWrappers), effectively the call to the wrapping function is
changed into a call to the wrapped function.

The motivation for this change is that programs that analyze the
output of RetDec (either the C code, or the LLVM code) want to
recognize library functions and treat them specially. This
change makes it so that the library function names are used
directly (rather than the plt version) and they are passed
their parameters correctly.
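
The propagation step can be pictured with a simplified model. Everything here is hypothetical: `FunctionInfo` stands in for the per-function data-flow record, and the rule "only fill in what the wrapped function lacks" is an assumption made for the sketch.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a per-function data-flow record.
struct FunctionInfo
{
    std::string name;
    std::vector<std::string> paramTypes; // empty = unknown
    std::string returnType;              // empty = unknown
    FunctionInfo* wrapped = nullptr;     // e.g. malloc for malloc@plt
};

// Copy the signature of the wrapper to the wrapped function, but only
// where the wrapped function has no information of its own.
inline void propagateWrappedSketch(FunctionInfo& wrapper)
{
    if (wrapper.wrapped == nullptr)
    {
        return;
    }
    FunctionInfo& target = *wrapper.wrapped;
    if (target.paramTypes.empty()) target.paramTypes = wrapper.paramTypes;
    if (target.returnType.empty()) target.returnType = wrapper.returnType;
}

// After propagation, a call to the wrapper can be redirected to the
// wrapped function, which now carries the right signature.
inline const FunctionInfo& resolveCallee(const FunctionInfo& callee)
{
    return callee.wrapped != nullptr ? *callee.wrapped : callee;
}
```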

* Upgrade to Capstone release 4.0.2

* Add additional patch on capstone 4.0.2 for PPC Signed 16 bit immediates

Capstone version 4.0.2 has a bug when disassembling a powerpc instruction
with a signed 16-bit immediate.
See capstone-engine/capstone#1746 and
capstone-engine/capstone#1746 (comment).

This change adds to the capstone patch to fix this problem.
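
For reference, correct handling of a signed 16-bit immediate amounts to sign extension of the low 16 bits. The sketch below shows the arithmetic only; it is not the Capstone patch, and the function name is illustrative.

```cpp
#include <cstdint>

// Interpret the low 16 bits of an instruction field as a signed value.
// XOR with the sign bit and subtract it again: portable sign extension.
constexpr std::int64_t signExtend16(std::uint32_t field)
{
    return static_cast<std::int64_t>((field & 0xFFFFu) ^ 0x8000u) - 0x8000;
}
```

With this, an encoded immediate of 0xFFFF is read as -1 rather than 65535.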

* Treat endbr32/endbr64 instructions as NOPs

* capstone2llvmir/powerpc: remove PPC_INS_BDZLA hack fix

As Capstone was updated, the fix in capstone-engine/capstone#968 took effect and the original RetDec fix is not needed - in fact, it caused problems.

* Handle Procedure Linkage calls for 32bit x86 from gcc

This case is for x86 32 bit compiled with GCC. Its PLT entries are in
sections .plt.sec or .plt.got. An entry is of the form:

jmp *offset(%ebx)

When this code is encountered register %ebx has been loaded with the
address of the start of the Global Offset Table (.got) section.
This change handles that case.
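
The address that such a PLT entry reads its jump target from can be computed as below. This is a sketch of the address arithmetic only, with an illustrative function name; how the resulting slot is mapped to an imported symbol is not shown.

```cpp
#include <cstdint>

// Address of the GOT slot read by `jmp *offset(%ebx)`, given the start
// address of .got held in %ebx. Arithmetic wraps at 32 bits, as on x86-32.
constexpr std::uint32_t gotSlotAddress(std::uint32_t gotBase, std::int32_t offset)
{
    return gotBase + static_cast<std::uint32_t>(offset);
}
```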

* Add ability to process PNG icons for perceptual hash calculation (#1090)

* Add ability to process PNG icons for perceptual hash calculation

* Use SCOPE_EXIT for deallocation

* In generated C, add prototypes for dynamically-linked functions without headers

When the program involves dynamically-linked functions like _Znwj
(operator new) that return a pointer, it is necessary to have
prototypes for them, since otherwise they will be implicitly deduced
to return "int" which cannot be dereferenced.

Previously RetDec was emitting comments telling which functions were
dynamically linked. This change moves them up before the functions are
emitted and instead emits prototypes for the functions. However,
RetDec also inserts includes of headers for functions with known
headers. We do not emit prototypes for functions with headers as that
would be redundant.  As a result, some dynamically-linked functions
that used to show in the comments no longer appear as the included
header will declare them.

The section header comment for dynamically-linked functions is only
produced if some prototypes are written for dynamically-linked
functions.

A related PR will have added tests as well as changes needed for
existing tests.

* Add printing of analysis time to retdec-fileinfo output

* Yara: inherits linker flags

* Use provided libtool via `CMAKE_LIBTOOL`

* Added missing `${RETDEC_INSTALL_BIN_DIR}` to `pat2yara`

* Added sanity check for page index when loading pages from broken samples

In certain samples, the page index can go beyond the available pages
when loading them; this patch prevents that.

* Virtual Size overflow is now handled properly

* Fixed error code

* Updated yaramod

* Fix removeZeroSequences

* README.md: add "limited maintenance mode" note

Co-authored-by: Peter Kubov <peter.kubov@avast.com>
Co-authored-by: houndthe <houndthe@protonmail.com>
Co-authored-by: Peter Matula <peter.matula@avast.com>
Co-authored-by: Ladislav Zezula <ladislav.zezula@avast.com>
Co-authored-by: Petr Zemek <petr.zemek@avast.com>
Co-authored-by: Marek Milkovič <marek.milkovic@avast.com>
Co-authored-by: Kirill A. Korinsky <kirill@korins.ky>
Co-authored-by: me <me>
Co-authored-by: Richard L Ford <richardlford@gmail.com>
Co-authored-by: 未赢 <26459963+neverwin@users.noreply.github.com>
10 people authored Dec 5, 2022
1 parent f76d200 commit 23ecab3
Showing 14 changed files with 2,292 additions and 2,221 deletions.
5 changes: 2 additions & 3 deletions cmake/deps.cmake
@@ -1,11 +1,10 @@

-# URL is for Capstone release 4.0.2.
 set(CAPSTONE_URL
-"https://github.com/capstone-engine/capstone/archive/1d230532840a37ac032c6ab80128238fc930c6c1.zip"
+"https://github.com/aquynh/capstone/archive/f049e65f596bf8b1cbf5f2371067e34715ef1764.zip"
 CACHE STRING "URL of Capstone archive to use."
 )
 set(CAPSTONE_ARCHIVE_SHA256
-"659097fcda59ce927937f73dd87a4606de6e768b352045a077ed8d2165b7e935"
+"87fe97225ee98220dcb5725bc470bc83a67819a6e75000075566c0423599437e"
 CACHE STRING ""
 )

1 change: 1 addition & 0 deletions deps/capstone/CMakeLists.txt
@@ -48,6 +48,7 @@ ExternalProject_Add(capstone-project
"${CMAKE_C_COMPILER_OPTION}"
"${CMAKE_CXX_COMPILER_OPTION}"
-DCMAKE_POSITION_INDEPENDENT_CODE=${CMAKE_POSITION_INDEPENDENT_CODE}
-DCAPSTONE_INSTALL=ON
-DCMAKE_LIBTOOL=${CMAKE_LIBTOOL}
# Patch the Capstone sources.
PATCH_COMMAND
46 changes: 0 additions & 46 deletions include/retdec/capstone2llvmir/powerpc/powerpc_defs.h
@@ -7,52 +7,6 @@
#ifndef RETDEC_CAPSTONE2LLVMIR_POWERPC_POWERPC_DEFS_H
#define RETDEC_CAPSTONE2LLVMIR_POWERPC_POWERPC_DEFS_H

enum ppc_reg_cr_flags
{
/// Negative -- set when result is negative.
PPC_REG_CR0_LT = PPC_REG_ENDING + 1,
/// Positive -- set when result is positive and not zero.
PPC_REG_CR0_GT,
/// Zero -- set when result is zero
PPC_REG_CR0_EQ,
/// Copy of the final state of XER[SO] at the completion of the instruction.
PPC_REG_CR0_SO,

PPC_REG_CR1_LT,
PPC_REG_CR1_GT,
PPC_REG_CR1_EQ,
PPC_REG_CR1_SO,

PPC_REG_CR2_LT,
PPC_REG_CR2_GT,
PPC_REG_CR2_EQ,
PPC_REG_CR2_SO,

PPC_REG_CR3_LT,
PPC_REG_CR3_GT,
PPC_REG_CR3_EQ,
PPC_REG_CR3_SO,

PPC_REG_CR4_LT,
PPC_REG_CR4_GT,
PPC_REG_CR4_EQ,
PPC_REG_CR4_SO,

PPC_REG_CR5_LT,
PPC_REG_CR5_GT,
PPC_REG_CR5_EQ,
PPC_REG_CR5_SO,

PPC_REG_CR6_LT,
PPC_REG_CR6_GT,
PPC_REG_CR6_EQ,
PPC_REG_CR6_SO,

PPC_REG_CR7_LT,
PPC_REG_CR7_GT,
PPC_REG_CR7_EQ,
PPC_REG_CR7_SO,
};

enum ppc_cr_types
{
1 change: 1 addition & 0 deletions src/capstone2llvmir/arm/arm_init.cpp
@@ -466,6 +466,7 @@ Capstone2LlvmIrTranslatorArm_impl::_i2fm =
{ARM_INS_MLA, &Capstone2LlvmIrTranslatorArm_impl::translateMla},
{ARM_INS_MLS, &Capstone2LlvmIrTranslatorArm_impl::translateMls},
{ARM_INS_MOV, &Capstone2LlvmIrTranslatorArm_impl::translateMov},
{ARM_INS_MOVS, &Capstone2LlvmIrTranslatorArm_impl::translateMov},
{ARM_INS_MOVT, &Capstone2LlvmIrTranslatorArm_impl::translateMovt},
{ARM_INS_MOVW, &Capstone2LlvmIrTranslatorArm_impl::translateMovw},
{ARM_INS_MRC, nullptr},
26 changes: 19 additions & 7 deletions src/capstone2llvmir/arm64/arm64.cpp
@@ -378,23 +378,35 @@ llvm::Value* Capstone2LlvmIrTranslatorArm64_impl::extractVectorValue(
 }

 // Vector element size specifier
-switch(op.vess)
+switch(op.vas)
 {
-case ARM64_VESS_B:
+case ARM64_VAS_16B:
+case ARM64_VAS_8B :
+case ARM64_VAS_4B :
+case ARM64_VAS_1B :
 val = irb.CreateLShr(val, llvm::ConstantInt::get(val->getType(), 8 * op.vector_index));
 return irb.CreateZExtOrTrunc(val, llvm::IntegerType::getInt8Ty(_module->getContext()));
-case ARM64_VESS_H:
+case ARM64_VAS_8H:
+case ARM64_VAS_4H:
+case ARM64_VAS_2H:
+case ARM64_VAS_1H:
 val = irb.CreateLShr(val, llvm::ConstantInt::get(val->getType(), 16 * op.vector_index));
 return irb.CreateZExtOrTrunc(val, llvm::IntegerType::getInt16Ty(_module->getContext()));
-case ARM64_VESS_S:
+case ARM64_VAS_4S:
+case ARM64_VAS_2S:
+case ARM64_VAS_1S:
 val = irb.CreateLShr(val, llvm::ConstantInt::get(val->getType(), 32 * op.vector_index));
 val = irb.CreateZExtOrTrunc(val, llvm::IntegerType::getInt32Ty(_module->getContext()));
 return irb.CreateBitCast(val, llvm::Type::getFloatTy(_module->getContext()));
-case ARM64_VESS_D:
+case ARM64_VAS_1D:
 val = irb.CreateLShr(val, llvm::ConstantInt::get(val->getType(), 64 * op.vector_index));
 val = irb.CreateZExtOrTrunc(val, llvm::IntegerType::getInt64Ty(_module->getContext()));
 return irb.CreateBitCast(val, llvm::Type::getDoubleTy(_module->getContext()));
-case ARM64_VESS_INVALID:
+case ARM64_VAS_1Q:
+val = irb.CreateLShr(val, llvm::ConstantInt::get(val->getType(), 128 * op.vector_index));
+val = irb.CreateZExtOrTrunc(val, llvm::IntegerType::getInt128Ty(_module->getContext()));
+return irb.CreateBitCast(val, llvm::Type::getFP128Ty(_module->getContext()));
+case ARM64_VAS_INVALID:
 return val;
 default:
 throw GenericError("Arm64: extractVectorValue(): Unknown VESS type");
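
The hunk above extracts a vector lane by shifting it down to bit 0 and truncating to the lane width. The same pattern on a plain 64-bit integer, as a hedged sketch (the function name and the restriction to 8-, 16- and 32-bit lanes are choices made for this example):

```cpp
#include <cassert>
#include <cstdint>

// Extract lane `index` of width `laneBits` (8, 16 or 32) from a 64-bit
// value: shift the lane down to bit 0, then mask to the lane width.
inline std::uint64_t extractLane(std::uint64_t vec, unsigned laneBits, unsigned index)
{
    assert(laneBits == 8 || laneBits == 16 || laneBits == 32);
    assert(laneBits * index < 64); // the lane must lie inside the value
    const std::uint64_t mask = (std::uint64_t{1} << laneBits) - 1;
    return (vec >> (laneBits * index)) & mask;
}
```

Lane 0 is the least significant one, matching the shift-right-by-`width * index` convention used in the translator code.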
@@ -1775,7 +1787,7 @@ void Capstone2LlvmIrTranslatorArm64_impl::translateAnd(cs_insn* i, cs_arm64* ai,
 std::tie(op1, op2) = loadOpBinaryOrTernaryOp1Op2(ai, irb);
 op2 = irb.CreateZExtOrTrunc(op2, op1->getType());

-if (i->id == ARM64_INS_BIC)
+if (i->id == ARM64_INS_BIC || i->id == ARM64_INS_BICS)
 {
 op2 = generateValueNegate(irb, op2);
 }
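
Architecturally, BIC is an AND with the bitwise complement of the second operand, and BICS additionally updates the condition flags from the result. A small model of that, assuming illustrative names and tracking only the N and Z flags:

```cpp
#include <cstdint>

// Result of a flag-setting bit clear; only N and Z are modelled here.
struct BicResult
{
    std::uint64_t value;
    bool n; // result is negative (bit 63 set)
    bool z; // result is zero
};

// BIC: op1 AND NOT op2.
constexpr std::uint64_t bic(std::uint64_t op1, std::uint64_t op2)
{
    return op1 & ~op2;
}

// BICS: same value, plus flags derived from it.
inline BicResult bics(std::uint64_t op1, std::uint64_t op2)
{
    const std::uint64_t r = bic(op1, op2);
    return BicResult{r, (r >> 63) != 0, r == 0};
}
```

Since both instructions compute the same value, sharing the operand inversion between BIC and BICS, as the condition in the hunk does, is the natural structure.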