Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SVFG: Calling a function twice with global pointers as a parameter #10

Open
Symen opened this issue Jan 11, 2017 · 7 comments
Open

SVFG: Calling a function twice with global pointers as a parameter #10

Symen opened this issue Jan 11, 2017 · 7 comments

Comments

@Symen
Copy link

Symen commented Jan 11, 2017

Hi,
I'm currently working with your awesome SVF analysis. During some tests with the SVFG, I found a simple test-case that doesn't make sense to me.

Why does the SVFG in Test1 has a path from "local_x" into the second "init" call?

int *global_p1; // NodeID= 1
int *global_p2; // NodeID= 2

void init(int **pp, int *x) {
	*pp = x;
}

void delete(int **pp) {
	*pp = NULL;
}

void test1() {
	int local_x = 1; // NodeID= 10 
	int local_y = 2; // NodeID= 11

	init(&global_p1, &local_x);
	init(&global_p2, &local_y);
	delete(&global_p2);
}

void test2() {
	int local_x = 1;// NodeID= 10
	int local_y = 2;// NodeID= 11
	int *local_p1;  // NodeID= 12
	int *local_p2;  // NodeID= 13

	init(&local_p1, &local_x);
	init(&local_p2, &local_y);
	delete(&local_p2);
}

int main() {

	/*** Test 1:
	 * There will be a path from the "local_x" into the second call of "init",
        * (10 -> 60 -> 20 -> 16 -> 21 -> 53 -> 21 -> ...)
	 * but "local_x" was only written into "global_p1" and not into "global_p2".
	 */ 
	//test1();

	/*** Test 2:
	 * Writing "local_x" into "local_p1" creates the SVFG, which I would
	 * have expected also in Test1.
	 */ 
	//test2();
	return 0;
}

These are the result I get with: saber -leak -dump-svfg main.bc
SVFG of Test1:
saber_svfg_test1

SVFG of Test2:
saber_svfg_test2

@yuleisui
Copy link
Collaborator

Hi Symen,

Good question!

The differences between your two tests are due to our inter-procedural Mod-Ref analysis. Please refer to line 346-369 in MemRegion.cpp (https://github.com/unsw-corg/SVF/blob/master/lib/MSSA/MemRegion.cpp).

Any global objects and the objects they point to inside a callee function, say F, will be promoted as a side-effect to the callsites that call F (transitively along the call chain).

We treat global objects (e.g., "global_p1" ) and objects (e.g., "local_x") on its Andersen's points-to graph conservatively in a context-insensitive manner, so in your case, "local_x" will be put as a side-effect before the two "init" callsites.

You may wish to take a look at the "SVFG_before_opt" file to see those side-effect objects inside the yellow rectangle boxes.

Thanks

@Symen
Copy link
Author

Symen commented Jan 12, 2017

Thanks for the explanation!

Would a context-sensitive SVFG be possible or is there another data-structure in order to check, if such a path (like in the example above) is possible?

@yuleisui
Copy link
Collaborator

The current Mod-Ref analysis supports partial context-sensitivity. Full context-sensitive Mod-Ref analysis, including handling globals, requires a context-sensitive pointer analysis. SVF does not support it yet. You may wish to write one by leveraging the existing code in MemRegion.cpp.

Another way is to write a pass to refine the SVFG e.g., implementing an optimization pass like SVFGOPT.

We have a recent paper describing how to refine the spurious value-flows. Please refer to http://www.cse.unsw.edu.au/~corg/supa/.

@Symen
Copy link
Author

Symen commented Jan 12, 2017

I am not very experienced with the implementation of static analyses, but I will give it a try.
Which of the two approaches would you prefer?
My guess would be that the leverage of the MemRegion.cpp might be a bit easier. Or am I wrong?

Great paper by the way! This should be helpful.

@yuleisui
Copy link
Collaborator

yuleisui commented Jan 12, 2017

Either writing a pass before (MemRegion.cpp) or after (SVFGOPT.cpp) creating SVFG is doable.

If you have already taken a look at the Mod-Ref analysis, MemRegion.cpp is a good starting point.

If you still want to optimize the graph once it is built, such as removing edges and nodes, then SVFGOPT.cpp is also a good example to follow.

Good luck!

@Symen
Copy link
Author

Symen commented Jan 28, 2017

Unfortunately, I had no time for it the last weeks, but I tried to understand the code in detail. As you mentioned, the framework requires a context-sensitive pointer analysis. I found the CondPTAImpl class (PointerAnalysis.h) which seems to be an approach of a context-sensitive analysis. Wouldn't it be necessary to leverage the Andersen's pointer analysis as you described in this paper Making context-sensitive inclusion-based pointer analysis practical for compilers using parameterised summarisation?

I also was wondering whether SVF allows the implementation of sound analyses, or if there is some implementation detail, that causes analyses to be unsound in principle?

Thanks again

@yuleisui
Copy link
Collaborator

yuleisui commented Jan 31, 2017

On context-sensitivity

Yes, CondPTAImpl is designed for context- or path-sensitive analysis. Every pointer or abstract object is qualified by a condition e.g., (call stack or bdd for path-sensivitiy). To understand more about how to use the data structures in CondPTAImpl, you may wish to refer to “DDA/ContextDDA.h/cpp” in SUPA’s implementation.

Our whole-program context-sensitive inclusion-based pointer analysis was originally implemented in Open64 compiler. We don’t have time to migrate it to SVF:) But the SPE paper gives all the algorithms and evaluation details of the analysis. Yes, it is definitely possible to implement one in SVF.

On soundness

Short answer: Yes, SVF allows sound analysis if every LLVM instruction (relevant to pointer analysis) is modeled in a conservative way.

Long story: soundness in static analysis generally means over-approximations of a program’s behavior. Many research papers claim their approaches are sound by modeling and analyzing the programs (e.g., in the form of inference rules) written in their self-defined tiny languages. However, under real-world setting, a tiny language may be only a subset of a modern intermediate representation, like LLVM IR.

To soundly model the program behavior, SVF need to handle as many instructions as possible (refer to https://github.com/unsw-corg/SVF/wiki/Technical-documentation#123-pagbuilder) to generate high-level representations (e.g., graphs). Another challenge for sound analysis is to model the side-effects of library calls (refer to https://github.com/unsw-corg/SVF/blob/master/lib/Util/ExtAPI.cpp). Of course, C programs which are not ANSI-compliant also pose challenge to soundness (e.g., pointer arithmetic underflow/overflow).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants