We present Einstein, a data-only attack exploitation pipeline that uses dynamic taint analysis policies to: (i) scan for chains of vulnerable system calls (e.g., to execute code or corrupt the filesystem), and (ii) generate exploits for those that take unmodified attacker data as input.
Einstein discovers thousands of vulnerable syscalls in common server applications.
Using `nginx` as a case study, we use Einstein to generate hundreds of mitigation-bypassing exploits.
You can find the full paper here.
Although Einstein indeed produces working exploits, they are non-destructive proof-of-concept exploits, which write the string `"HELLO"` to either a file (`"/tmp/hi"`) or a local socket (address `"192.0.2.0"`).
Hence, evaluating Einstein poses no risk to machine security or data privacy, and raises no other ethical concerns.
The files for the artifact evaluation are available at the `ae` tag of the repository.
Einstein requires an x86-64 machine (Intel recommended); enough RAM to simultaneously load multiple program snapshots into memory, so Einstein can post-process reports in parallel (recommended 100 GB RAM); and enough storage for hundreds of program snapshots (minimum 2 TB storage for this evaluation). We recommend using a machine with a high core count to speed up Einstein's report post-processing.
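The requirements above can be checked before starting a long run. The following is a minimal sketch, not part of the artifact: the function names `check_min` and `preflight` are hypothetical, the thresholds are taken from the text (100 GB RAM recommended, 2 TB storage minimum), and it assumes a Linux host with `/proc/meminfo`.

```shell
#!/bin/sh
# Pre-flight sketch (hypothetical helper, not shipped with Einstein).
# Thresholds come from the text: 100 GB RAM recommended, 2 TB storage minimum.

# check_min NAME ACTUAL REQUIRED UNIT
# Prints an OK/WARN line and returns 1 when ACTUAL is below REQUIRED.
check_min() {
    if [ "$2" -ge "$3" ]; then
        echo "OK: $1 = $2 $4 (want >= $3 $4)"
    else
        echo "WARN: $1 = $2 $4 (want >= $3 $4)"
        return 1
    fi
}

# preflight [DIR]
# Checks the architecture, total RAM, free space under DIR, and core count.
preflight() {
    dir=${1:-.}
    arch=$(uname -m)
    if [ "$arch" = "x86_64" ]; then
        echo "OK: arch = $arch"
    else
        echo "WARN: arch = $arch (x86-64 required)"
    fi
    # MemTotal is reported in KiB on Linux; convert to decimal GB.
    ram_gb=$(awk '/^MemTotal:/ {print int($2 * 1024 / 1e9)}' /proc/meminfo)
    # Column 4 of POSIX-format df output is the available space in KiB.
    disk_gb=$(df -Pk "$dir" | awk 'NR == 2 {print int($4 * 1024 / 1e9)}')
    check_min "RAM" "$ram_gb" 100 GB
    check_min "free storage" "$disk_gb" 2000 GB
    echo "INFO: online cores = $(getconf _NPROCESSORS_ONLN)"
}
```

Calling `preflight` from the directory that will hold the program snapshots prints one line per check. A `WARN` line for RAM flags only a shortfall against the recommendation, since the text gives no hard minimum for memory.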
To build Einstein and the target programs, we expect certain packages to be installed. In the Set-up section, we detail the steps to install such dependencies on Ubuntu 22.04, but similar steps are needed for other distributions.
We use each target application's test suite to drive the analysis.
To download and install the dependencies, including go-task as a task runner, run the following from this repository: `sudo snap install task --classic && task init`.
To build libdft, the command server, the Einstein tool, and all target applications, run: `task libdft-build cmdsvr-build einstein-build apps-build`.
We first make a couple of notes about running Einstein:
- Due to the non-deterministic nature of dynamic analysis (from concurrency issues, system variability, etc.) [1,2], the actual results may slightly deviate from the expected results.
- If the `db-analyze-reports` task fails, try running the `db-analyze-reports-singleproc` task instead. It will be slower, but will avoid any system load-related crashes.
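The fallback described in the second note can be scripted so that an unattended run does not stop at a load-related crash. This is a sketch under assumptions: the wrapper name `analyze_reports` and the `TASK_BIN` override are hypothetical and exist only to make the wrapper easy to dry-run. The two task names are the ones given above.

```shell
#!/bin/sh
# Sketch: try the parallel analysis first, then fall back to the
# single-process task if it fails. TASK_BIN is a hypothetical override
# used for dry runs; it defaults to the real `task` binary.
analyze_reports() {
    task_bin=${TASK_BIN:-task}
    if "$task_bin" db-analyze-reports; then
        return 0
    fi
    echo "db-analyze-reports failed; retrying with db-analyze-reports-singleproc" >&2
    "$task_bin" db-analyze-reports-singleproc
}
```

The wrapper returns the exit status of whichever task ran last, so a caller can still detect the case where both variants fail.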
Test that the different components work as follows:
- (T1): libdft memory tainting [1 compute-second]. To test libdft's "taint all memory" functionality, run `task libdft-test -- memtaint` and compare its output to the expected output.
- (T2): libdft instruction tainting [1 compute-second]. To test libdft's per-instruction taint policies, run `task libdft-test -- ins` and compare its output to the expected output.
- (T3): Einstein tool [1 compute-minute]. To test Einstein on a simple program, run `task einstein-test`. Then, compare the output of `task db-print-candidates` with the expected output.
- (T4): Target applications [4 compute-minutes]. To test Einstein running each target application with a simple workload (e.g., sending a simple GET request to a web server), run `task reports-clean apps-test db-add-reports db-analyze-reports`. Then, compare the output of `task db-print-candidates` with the expected output.
- (T5): Target application test suites [20 compute-minutes]. To test Einstein running each target application's test suite for 2 minutes (rather than the entire test suite), run `task reports-clean apps-eval-brief db-add-reports db-analyze-reports`. Then, compare the output of `task db-print-candidates` with the expected output.
- (T6): Exploit generation [2 compute-minutes]. To test Einstein's exploit confirmation for `nginx`, run `task reports-clean einstein-nowrite-config nginx-eval-custom db-add-reports db-analyze-reports db-analyze-candidates`. Then, compare the output of `task db-print-exploits` with the expected output.
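Each test above ends with a comparison against an expected output. One way to do that comparison is sketched below, assuming the expected output has been saved to a local file. The helper name `compare_output` is hypothetical and not part of the artifact.

```shell
#!/bin/sh
# Sketch: run a command and diff its stdout against a saved expected output.
# compare_output EXPECTED_FILE CMD [ARGS...]
# Returns 0 on an exact match, 1 when the outputs differ.
compare_output() {
    expected=$1
    shift
    actual=$(mktemp) || return 2
    # Capture the command's stdout so it can be diffed against the file.
    "$@" > "$actual"
    if diff -u "$expected" "$actual"; then
        echo "MATCH: $*"
        status=0
    else
        echo "DIFFERS: $*"
        status=1
    fi
    rm -f "$actual"
    return "$status"
}
```

For example, `compare_output expected-candidates.txt task db-print-candidates` would print a unified diff when the candidate list differs, where `expected-candidates.txt` is a placeholder file name. Because dynamic analysis is non-deterministic, a small diff is not necessarily a failure and should be read by hand.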
We make the following claims:
- (C1): Einstein identifies thousands of vulnerable syscalls in common server applications. This is proven by Experiment (E1).
- (C2): Einstein generates hundreds of working exploits against `nginx`. This is proven by Experiment (E2).
We prove the above claims using the following experiments:
- (E1): Vulnerable syscall identification [24 compute-hours].
  - How to: We will run each application with Einstein, then analyze the reports to identify vulnerable syscalls.
  - Preparation: Run `task reports-clean` to remove past reports.
  - Execution: Run `task apps-eval db-add-reports db-analyze-reports`.
  - Results: Compare the output of `task db-print-candidates` to the expected output. The output contains thousands of vulnerable gadgets, broken down by: (i) syscall and argument type (i.e., Table 3), and (ii) target application (i.e., Table 4), thereby proving Claim (C1).
- (E2): Exploit generation [12 compute-hours].
  - How to: We will run `nginx` with Einstein, then analyze the reports to identify vulnerable syscalls, then confirm candidate exploits as working exploits.
  - Preparation: Run `task reports-clean` to remove past reports.
  - Execution: Run `task nginx-eval db-add-reports db-analyze-reports db-analyze-candidates`.
  - Results: Compare the output of `task db-print-exploits` to the expected output. The output contains hundreds of confirmed exploits for `nginx` (i.e., Table 5), thereby proving Claim (C2).
This prototype may be expanded in a few directions:
- To modify Einstein's taint policies (e.g., to target more syscalls, or to target syscall-guard variables), modify the Einstein tool in `src/einstein`.
- To run the target applications (e.g., `nginx`) with other workloads, first start the application with Einstein (`cd apps/nginx-1.23.0 && RUN_EINSTEIN=1 ./serverctl restart`), then run the custom workload (e.g., `echo 'Hello!' | netcat 127.0.0.1 1080`).
- To run Einstein on other applications:
  - (i) Add the application to the `apps/` directory;
  - (ii) Copy the files `serverctl` and `clientctl` from another application's directory into its directory, and modify them to start the application's server and a client for it; and
  - (iii) Ensure that the application's build script generates position-independent code (i.e., the default on most compilers).
- To write another Pin tool that uses libdft64-ng:
  - (i) Copy the Einstein tool, e.g.: `cp -r src/einstein src/my-tool`;
  - (ii) Modify `MY_TOOL` and `MY_OBJS` in the `Makefile`;
  - (iii) Modify the source code to suit your analysis;
  - (iv) Build it: `cd src/my-tool && -DLIBDFT_TAG_PTR -DLIBDFT_PTR_32 -DLIBDFT_TAG_SSET_MAX=16' make obj-intel64/my-tool.so`; and
  - (v) Run it on some target application: `setarch x86_64 -R ./src/misc/pin-3.28-98749-g6643ecee5-gcc-linux/pin -t src/my-tool/obj-intel64/my-tool.so -- echo 'Hello!'`.
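When adding a new application, the position-independent code requirement (step (iii) under "To run Einstein on other applications") can be spot-checked on the built binary. The sketch below reads the `e_type` field of the ELF header, which is `ET_DYN` (3) for position-independent executables and shared objects, and `ET_EXEC` (2) for fixed-address executables. The helper name `elf_type` is hypothetical, and the byte order assumes a little-endian target such as x86-64.

```shell
#!/bin/sh
# Sketch: classify an ELF file by its e_type header field.
# elf_type FILE prints DYN, EXEC, OTHER, or NOT_ELF.
elf_type() {
    # The first four bytes of an ELF file are 0x7f 'E' 'L' 'F'.
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    if [ "$magic" != "7f454c46" ]; then
        echo "NOT_ELF"
        return 1
    fi
    # e_type is a 2-byte little-endian field at offset 16.
    e_type=$(od -An -tu1 -j16 -N2 "$1" | awk '{print $1 + 256 * $2}')
    case "$e_type" in
        2) echo "EXEC" ;;   # ET_EXEC: fixed load address, not position-independent
        3) echo "DYN" ;;    # ET_DYN: position-independent executable or shared object
        *) echo "OTHER" ;;
    esac
}
```

Running `elf_type` on the server binary produced by the application's build script should print `DYN`. A result of `EXEC` suggests the build is producing a fixed-address executable, for example because it passes `-no-pie`. The check covers only the main executable, not the libraries it loads.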