BinSeeker is a vulnerability search tool for cross-platform binaries. Given a vulnerable function f, BinSeeker can identify whether a binary program contains the same vulnerability as f. It currently supports three architectures: X86, ARM32, and MIPS32.
To use BinSeeker, the following tools must be installed:
- IDA Pro - for generating the LSFG (labeled semantic flow graph, combining the data flow graph and control flow graph) and extracting features of basic blocks
- python2.7 - all the source code is written in python2.7
- miasm - for converting assembly programs to IR. We extended it to support more assembly instructions. Please copy the miasm2 directory provided by us directly into the python directory of IDA Pro.
- 0_Libs/search_program: contains the binary files used as the target in which BinSeeker searches for vulnerabilities.
- 1_Features/search_program: contains the instruction features, control flow graph, and data flow graph for each function in the target.
- 5_CVE_Feature: contains the instruction features, control flow graph, and data flow graph of each version of the two vulnerabilities (CVE-2014-3508, CVE-2015-1791).
- 6_Search_TFRecord: contains the TFRecord data files. A TFRecord file is a binary file that stores data and labels in a unified way. It makes better use of memory and allows fast copying, moving, reading, and storage in tensorflow.
- 7_Search_Result: all search result lists are stored here.
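Because the later steps rely on this directory structure, it can help to check it before running anything. The following is a minimal sketch, not part of BinSeeker itself: the helper names `missing_dirs` and `create_layout` are hypothetical, and only the directory names are taken from the list above.

```python
import os

# Directory names taken from the layout described above.
EXPECTED_DIRS = [
    os.path.join("0_Libs", "search_program"),
    os.path.join("1_Features", "search_program"),
    "5_CVE_Feature",
    "6_Search_TFRecord",
    "7_Search_Result",
]


def missing_dirs(root):
    """Return the expected sub-directories that do not exist under root."""
    return [d for d in EXPECTED_DIRS
            if not os.path.isdir(os.path.join(root, d))]


def create_layout(root):
    """Create any missing directories (hypothetical convenience helper)."""
    for d in missing_dirs(root):
        os.makedirs(os.path.join(root, d))
```

Calling `missing_dirs(".")` from the project root returns an empty list when the layout is complete.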
- Modify the config.py file. All the dependency directories can be changed there. A simple modification is shown below, but it must follow the directory structure we defined:
IDA32_DIR = "installation directory of 32-bit IDA Pro program"
IDA64_DIR = "installation directory of 64-bit IDA Pro program"
- Put the programs to be searched in the 0_Libs/search_program directory.
- Run the command.py file to generate the labeled semantic flow graphs and extract initial numerical vectors for basic blocks. The result files should be placed in the 1_Features/search_program directory.
- Execute the 7_search_by_list_binseeker file to obtain embedding vectors of the functions and get the function list in descending order of similarity scores.
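To illustrate what "descending order of similarity scores" means in the last step, here is a small sketch that ranks candidate functions against a query embedding. It assumes cosine similarity as the score, which is an assumption for illustration only; the metric BinSeeker actually uses is not stated here, and `cosine_similarity`, `rank_functions`, and the example vectors are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_functions(query_vec, candidates):
    """Rank candidates (dict: function name -> embedding) by similarity.

    Returns a list of (name, score) pairs, highest score first.
    """
    scored = [(name, cosine_similarity(query_vec, vec))
              for name, vec in candidates.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored


# Made-up embeddings, for illustration only.
candidates = {
    "func_a": [1.0, 0.0],
    "func_b": [0.6, 0.8],
    "func_c": [0.0, 1.0],
}
ranked = rank_functions([1.0, 0.0], candidates)
```

With these example vectors, `func_a` ranks first, followed by `func_b` and `func_c`.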
Note: All steps can be executed on a Linux system.
If you already have an appropriate Python 2.7 version, you can skip this installation. Please make sure that your Python was built with the ucs4 unicode encoding. You can distinguish ucs2 from ucs4 with the following code.
>>> import sys
>>> print sys.maxunicode
1114111  # ucs4 encoding
65535  # ucs2 encoding; you need to reinstall Python, because tensorflow-1.1.0 requires the ucs4 unicode encoding.
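The same check can be wrapped in a small script so that it can run before the build. This is only a sketch; the function name `unicode_build` is hypothetical, and the two constants are the values shown above.

```python
import sys


def unicode_build(maxunicode=None):
    """Classify a sys.maxunicode value as 'ucs4', 'ucs2', or 'unknown'."""
    if maxunicode is None:
        maxunicode = sys.maxunicode
    if maxunicode == 0x10FFFF:  # 1114111
        return "ucs4"
    if maxunicode == 0xFFFF:    # 65535
        return "ucs2"
    return "unknown"


if __name__ == "__main__":
    print(unicode_build())
```

If the script prints `ucs2`, rebuild Python with `--enable-unicode=ucs4` as described in the following steps.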
- Install the required libraries; otherwise the build may run into problems.
sudo apt-get install python-dev libffi-dev libssl-dev libxml2-dev libxslt-dev libmysqlclient-dev libsqlite3-dev zlib1g-dev libgdbm-dev
- Download and install Python-2.7.13
wget -c https://www.python.org/ftp/python/2.7.13/Python-2.7.13.tar.xz
xz -d Python-2.7.13.tar.xz
tar xf Python-2.7.13.tar
cd Python-2.7.13
./configure --prefix=/usr/local/python2713 --enable-unicode=ucs4
make
make install
- Install the setuptools and pip packages
wget https://bootstrap.pypa.io/ez_setup.py -O - | sudo python
curl -O https://bootstrap.pypa.io/get-pip.py
python get-pip.py
- Link pip and python into the bin path
rm /usr/bin/pip2
rm /usr/bin/pip
ln -s /usr/local/python2713/bin/pip /usr/bin/pip2
ln -s /usr/local/python2713/bin/pip /usr/bin/pip
rm /usr/bin/python
rm /usr/bin/python2
ln -s /usr/local/python2713/bin/python /usr/bin/python2
ln -s /usr/local/python2713/bin/python /usr/bin/python
- Add environment variables
export PATH="$PATH:/usr/local/python2713/lib/python2.7/site-packages:/usr/local/python2713/bin"
If you want to train your own network model, you need to install tensorflow-1.1.0. We built this version of tensorflow from source. The following are the detailed installation instructions (for cpu-only tensorflow) on an Ubuntu 14 machine.
- install dependent packages
sudo apt-get install zlib1g-dev swig python-wheel pkg-config zip g++ unzip python-numpy python-dev
# download and install six
wget https://pypi.python.org/packages/c8/0a/b6723e1bc4c516cb687841499455a8505b44607ab535be01091c0f24f079/six-1.10.0-py2.py3-none-any.whl#md5=3ab558cf5d4f7a72611d59a81a315dc8
sudo pip install six-1.10.0-py2.py3-none-any.whl
sudo pip install networkx
sudo pip install pyparsing
sudo pip install numpy
- Install the bazel build tool
- Download bazel-0.4.2-installer-linux-x86_64.sh from https://github.com/bazelbuild/bazel .
chmod +x bazel-0.4.2-installer-linux-x86_64.sh
./bazel-0.4.2-installer-linux-x86_64.sh --user
- Add the bazel path to the PATH environment variable, e.g.:
export PATH="$PATH:$HOME/bin"
- Install java8/openjdk8
sudo add-apt-repository ppa:openjdk-r/ppa
sudo apt-get update
sudo apt-get install openjdk-8-jdk
sudo update-alternatives --config java  # note: select the appropriate version
sudo update-alternatives --config javac
- Install tensorflow
- Download the source code. The --recurse-submodules option also downloads the dependent tools, and -b r1.1 selects the tensorflow-1.1.0 version.
git clone --recurse-submodules https://github.com/tensorflow/tensorflow.git -b r1.1
- Enter the tensorflow directory, run the configure script, and select the python path when prompted, e.g. /usr/bin/python
./configure
Note: the following selections were made during the configuration process.
malloc implementation: Y
Google Cloud Platform support: N
Hadoop File System support: N
XLA just-in-time compiler: N
Python library paths: default is [/usr/local/lib/python2.7/dist-packages]; you can select a different path.
OpenCL support: N
CUDA support: N
Configuration finished.
- Build the tensorflow source code:
bazel build --config=opt //tensorflow/tools/pip_package:build_pip_package
- Generate the installation wheel tensorflow-1.1.0-cp27-cp27mu-linux_x86_64.whl:
./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
- Install the tensorflow package. This also installs funcsigs, mock, pbr, and protobuf.
sudo pip install /tmp/tensorflow_pkg/tensorflow-1.1.0-cp27-cp27mu-linux_x86_64.whl
- Verify the installation
$ python
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, tensorflow!')
>>> sess = tf.Session()
>>> print sess.run(hello)
Hello, tensorflow!
>>> a = tf.constant(10)
>>> b = tf.constant(32)
>>> print sess.run(a + b)
42
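The wheel built above is named with the ABI tag cp27mu, where the trailing "u" marks a wide-unicode (ucs4) CPython 2.7 build. That is why the interpreter must be ucs4. The sketch below shows one way to read that tag from a wheel filename; `wheel_tags` and `is_wide_unicode_abi` are hypothetical helper names, and the parsing assumes the standard `name-version-python-abi-platform.whl` naming scheme.

```python
def wheel_tags(filename):
    """Return the (python tag, abi tag, platform tag) of a wheel filename."""
    stem = filename[:-len(".whl")] if filename.endswith(".whl") else filename
    parts = stem.split("-")
    if len(parts) < 5:
        raise ValueError("not a wheel filename: %r" % filename)
    # The last three dash-separated fields are the compatibility tags.
    return tuple(parts[-3:])


def is_wide_unicode_abi(abi_tag):
    """True for CPython 2.7 ABI tags that mark a ucs4 build, e.g. 'cp27mu'."""
    return abi_tag.startswith("cp27") and abi_tag.endswith("u")


tags = wheel_tags("tensorflow-1.1.0-cp27-cp27mu-linux_x86_64.whl")
```

Here `tags[1]` is `cp27mu`, so `is_wide_unicode_abi(tags[1])` is true; a wheel tagged `cp27m` would not install on a ucs4 interpreter, and vice versa.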
It is consistent with the usage described above.