-
Notifications
You must be signed in to change notification settings - Fork 10
Description
OCCAM is an LLVM-based whole-program partial evaluator that aims to prune large applications and their library dependencies.
The key idea behind partial evaluation is to take a program P
and a subset of inputs I
(called the deployment context) and produce a new specialized program P'
(called the residual program) such that P' = P(I)
, where P'
is typically faster or smaller than the original program P
.
OCCAM has the following unique combination of features:
-
It can winnow an application and libraries by applying inter-modular specialization. This allows libraries to be specialized with respect to a particular application but still being reusable for other similar applications.
-
The deployment context can be partially specified, allowing users to select the level of specialization.
-
The default usage provides a correctness guarantee. The original and residual programs are behaviorally equivalent -- that is, they will result in the same execution traces -- when the original program is executed with the particular deployment context used during specialization.
(Note: The experimental combination of static and dynamic analysis (in configuration priming) offers more aggressive winnowing, but without the same guarantee.)
OCCAM assumes a closed-world view when reasoning about code dependencies. This means that if a function cannot be called from the application or specified libraries, it is safe to eliminate it.
The architecture of OCCAM is shown in this picture.
The user must provide two different kind of inputs:
-
LLVM bitcode for the application and optionally, a set of bitcode files for all libraries to be specialized. We suggest using gllvm to generate bitcode from existing project sources without modifying their build scripts.
-
The deployment context of the application. This includes: configuration files, input parameters (
argv
variable), environment variables, information about the particular architecture, etc. All this information is provided via a JSON file called the Manifest.
The output of OCCAM is the set of specialized LLVM modules for both the application and the set of libraries.
OCCAM implements the following high-level algorithm:
1. perform configuration priming on the LLVM module that contains main
2. compute interfaces for each LLVM module
3. seal (internalize) functions based on interfaces
4. while change in any LLVM module
5. intra-module specialization of each LLVM module
6. inter-module specialization
7. update interfaces of and seal each LLVM module
8. return specialized LLVM modules
We now describe each step in more detail.
The goal is to transfer the deployment context information from the environment to the actual bitcode so that the rest of the partial evaluation process can benefit from it. For instance, if the values of some input parameters are provided in the Manifest, OCCAM first transforms the main
function so that known argv
values are stored in special global variables. The code is then modified to access to these special variables instead of the original argv
array elements.
OCCAM's dynamic analysis relies on the insight that many applications can consist of two parts: a prefix that processes the deployment context and populates internal program data structures and configuration variables; the suffix consists of the rest of the program that performs the primary computation. For instance, the following is a common pattern:
void parse_input(int argc, char *argv[], struct Config *config){
while ((int c = getopt(argc, argv, "pd")) != -1){
switch (c){
case'p’:
config->enable_plugins = false;
break;
case'd’:
config->enable_debug = false;
break;
}
}
}
int main(int argc, char**argv) {
struct Config *config = ...;
parse_input(argc, argv, config);
...
// main code using config
//
//
}
OCCAM relies heavily on Abstract Interpretation-based static analysis techniques for optimizing and specializing programs. Unfortunately, these techniques struggle with code that processes inputs (such as parse_input
above) since it typically consists of complex loops for which precise, non-disjunctive loop invariants do not exist, necessitating complex string reasoning, among other challenges. OCCAM's approach to mitigate such problems is to combine static and dynamic analysis while trying to maintain soundness guarantees wherever possible.
OCCAM relies on a modified version of the LLVM interpreter lli
to run the program as long as the execution remains completely deterministic. Since the deployment context can be partially specified, the program execution may eventually have to halt because some the value of some data in memory data is not known -- that is, it will only be known at runtime after the specialized application is run. Even with these constraints, OCCAM can often execute a significant number of LLVM instructions before the execution must halt. Once the interpreter is stopped, OCCAM extracts values from the memory contents of the partially completed execution. These are used as constants in the remaining LLVM bitcode when it is sound to do so.
In general, this process can be seen as starting the static analysis of the application with much stronger preconditions obtained from the dynamic analysis of the input processing portion (which is parse_input
in the above example).
For each LLVM module, OCCAM records all the entry points and references to all external symbols and external calls. (Entry points are functions in a module that can be called from other modules.) This and other information is stored in the module's interface. OCCAM constructs this set of interfaces to facilitate sound inference about which functions are internal to each module (under the closed world assumption, describe above).
A function defined in module M
is considered internal if it can only be called from other functions in M
. OCCAM seals a module by explicitly marking each such function as internal. The optimization phase can then remove internal functions if they are not used from within the same module.
OCCAM relies on sea-dsa to compute precise call graphs for each module. sea-dsa is a context-, field-sensitive pointer analysis for LLVM bitcode. sea-dsa statically infers all possible callees of each indirect call. This is key to the generation of precise module interfaces.
This component performs partial evaluation of each module separately. It consists of two main parts:
-
Optimization: simplify and remove dead code. This step relies heavily on LLVM's own transformations and the abstract interpreter clam.
-
Specialization: replace functions with specialized versions if constant values are known for parameters. This step can create new optimization opportunities but it is applied selectively to avoid creating code bloat. OCCAM provides several specialization policies that can be selected from by the user.
This step is similar to the intra-module specialization step but is applied across modules.