Skip to content
Yulei Sui edited this page Jul 13, 2015 · 18 revisions

SVF Design

The framework of SVF is shown in the following figure. The source code of a program is first compiled into bit-code files using clang and then merged together using LLVM Gold Plugin at link time stage (LTO) to produce a whole-program bc file. Then interprocedural pointer analysis is performed to produce points-to information for constructing memory SSA form so that def-use chains are identified for both top and address- taken variables. The generated value-flow information can be used to support various applications (e.g., detecting memory leaks, null pointer detection). The value-flows can also be used to bootstrap more precise pointer analysis for refining value-flows in an iterative manner. framework

Pointer Analysis

Our design separates a pointer analysis implementation into three loosely coupled components i.e. graph, rule, and solver as shown in Figure 1. graph is a higher-level abstraction extracted from LLVM IR, indicating where the pointer analysis algorithm should be performed; rule defines how to derive each constraint on the graph; solver determines in what processing order those constraints are resolved. SVF provides clean and reusable interfaces for users to write their own pointer analysis implementation by combining the three components in a flexible way.

##Value-Flow Construction Based on the points-to information, we first perform a light-weight Mod-Ref Analysis to capture inter-procedural reference and modification side-effects of program variables. Given the mod-ref results and points-to information, the indirect uses and defs at stores, loads and callsites in each procedure are annotated using alias set, which represents a set of abstract memory objects may be indirectly accessed. Unlike intra memory SSA in Open64 and GCC, which compute alias sets of local variables according to intra-procedural pointer analysis and group non-local variables into a single alias set. Mem Region Partition module in SVF provides a flexible ways to allow users to write their own algorithms for partitioning alias sets with inter-procedural side-effects being considered, so that efficiency and precision trade-offs can be made for analyzing large programs.